Like many early-day data scientists, I have always been driven by curiosity, innovation and technology, together with a passion for mathematics. At first, I saw my calling within academia. As I matured as a scientist and completed my PhD in mathematics, I realised that there is a strong entrepreneurial side to me, too. I consequently co-founded a start-up in the energy sector with friends, with the purpose of using emerging cloud technology to deliver data products and the humble ambition to transform the entire utility sector. Dream big and start small. We celebrated a successful exit with Enechange.jp, a utility comparison platform, which is now the market leader in Japan.
Today, data transformation resounds throughout every industry, but back then it was eye-opening to see the power of data and machine learning in the emerging smart energy sector. This became a recurring theme of my later jobs, where data and machine learning continued to make a real impact on both the businesses and the consumer experience. I built and managed data science teams for the Rank Group PLC and Zoopla before I established my own consultancy, DataSonic. I am currently working with Trainline on some exciting new data-driven innovation projects.
No career in data science is the same or ever easy, and I am very proud of my entire journey so far. I like a challenge, and therefore I have had my fair share of mistakes and failures, too. I am honoured and humbled by the interest in my work, experience and opinions from the wider community. I never expected anyone to read my blog posts and I most certainly didn’t expect to feature in a documentary about data science pioneers by Dataiku.
As a data scientist, I stand on the shoulders of a best-in-class open source community, and I draw a lot of inspiration and know-how from that active community. However, much of my initial thinking around commercially successful data science at scale was based on books and articles by Ted Dunning and Ellen Friedman, who are my personal data science heroes.
In 2019, the number of companies that proved the transformative value of data continued to grow. But, most importantly, there is an emerging standard of best practice, platforms and toolkits which has significantly reduced the barrier to entry and the price point of a data science team. This has made data science more accessible for companies and practitioners alike. There are now cross-functional teams working on everything from algorithms to full-stack data products, with a focus on research, commercial applications, experimentation, interpretability, algorithmic fairness and data ethics. In 2019, data science finally started to switch from hype into a more pragmatic, value-focused delivery mode.
Despite all the progress in 2019, though, data specialists struggle to find jobs. In the past ten years, 85% of big data and data science projects have failed to deliver business impact, and many teams have been discontinued as a consequence. While the field and the industry have learned a lot from that hype of inflated expectations, we are still in the early days of a cautious reinvestment phase. While there is no shortage of data scientists, many businesses struggle to find the qualified leaders and managers who can blaze a more successful and sustainable trail this time around. This might hold back investment in data in 2020.
Data science and machine learning models have greatly reduced the risks and costs of B2B and created entirely new products and revenue streams for B2C. The hype has increasingly proven itself and recent progress in AI is reaching a breakthrough point from R&D to real world applications.
Unfortunately, data science has benefited B2B and high-tech companies the most so far. The biggest opportunity lies in the ongoing democratisation of data science and data technology, which can redefine the value exchange for all that data in favour of consumers and wider society.
It is important to remember that data science is constantly evolving through ongoing innovation within science and tech alike. However, this is nothing new, and today’s challenges of technical debt, inadequate data infrastructure and production deployment of models have been around as long as data scientists themselves. I do see the biggest challenges emerging from the accelerating speed of innovation in the data space. It becomes increasingly challenging for businesses and practitioners alike to stay up to date and future-proof their data investments.