In a way my career has come full circle. Following master’s degrees in mathematics and statistics, I joined GSK as a graduate in 2007 working as a statistician/senior statistician.
My interest in S-Plus and R took me to Mango Solutions in 2011, starting out life as a consultant and R trainer. After six years I left Mango as the head of data science consultancy, having led projects of varying sizes with multiple customers in several industries.
However, I retained a strong connection with pharma throughout and rejoined GSK in 2017 with a bunch of new ideas. I am currently head of statistical data sciences, a start-up-like organisation within GSK’s very large biostatistics department.
Away from GSK I set up, and still lead, the R Validation Hub, a cross-industry initiative whose mission is to enable the use of R in a regulatory setting. This, along with other collaborative efforts that I am involved with, has enabled me to connect with and learn from many like-minded individuals from across my industry.
When I joined GSK, the wider themes and concepts of data science were still pretty alien to Biostatistics. We were reliant on the SAS language for our analysis and reporting and had no way of effectively developing, sharing or reviewing code. At the end of 2018, I started the Data Science User Community and from there we have begun a data science revolution which is transforming the way we work and collaborate.
I really admire what Hadley Wickham has achieved through his tidyverse concept. He’s looked at what was there and thought, “this is great, but it could be better”. And then he’s gone and made it better. Much better. Historically, Thomas Bayes is a big influence too.
My group began 2019 still in its infancy but with grand plans for how we would transform biostatistics and our analytic capabilities. In a heavily regulated industry that is traditionally resistant to change, I expected to meet a lot of resistance and to find it difficult to deliver on our objectives, but I have been blown away by the positivity of my colleagues. Sometimes the ingredients are already there and just need a catalyst.
I expect that the wider data science revolution will continue, and I see more and more companies making better, more informed, data-driven decisions. Conversely, not everyone is going to get it right. With the investment not always matching expectation, we will probably also continue to see a lot of churn.
We already see some personalisation in marketing through targeted advertising, but I see it spreading more widely to other fields, particularly to medicine where there are some big opportunities to make the most of data collected from wearables.
The biggest technical challenge will be to develop data and analytic solutions that are flexible enough to adapt to future needs. Staying agile presents its own challenges, since our environments are typically subject to rigorous rules and regulations. The trick will be to find a balance that allows our technology stack to evolve, while continuing to meet repeatability and reproducibility requirements.