In November, at the 2023 DataIQ Conference, Simon Case, head of data at Equal Experts, hosted a roundtable for data leaders to examine the implementation of machine learning operations (MLOps) within data tech stacks.
“While the excitement around gen AI grows, ‘classic’ AI machine learning (ML) is still creating great value for businesses and the DataIQ community still see it as a critical tool for extracting value from data in their organisations,” said Simon Case, head of data, Equal Experts. “We had full tables eager to discuss their challenges in implementing ML. There were great discussions from representatives of organisations at different stages of the ML journey, from someone starting their first ML team to mature ML developers deploying many models into production.”
Avoiding lock-in
When it comes to technology and new tools, it is easy to be swept up in the excitement and hype cycle of new developments, but careful thought must go into which platform or platforms to select. For example, some tools are complex and require too many specialist skills to be used effectively across an organisation.
“A challenge with selecting a commercial platform is that there is a risk of getting locked-in to vendors,” explained Simon. “Often, there are no easy ways to transfer models across platforms and the needs of the business will invariably evolve which risks outgrowing or moving away from the specialisms of the selected tool. Participants of the roundtable noted that the use of auto ML capabilities meant the lock-in risk was even higher.”
A few roundtable participants explained that they had found tooling that worked for them, but noted that the real challenge was the gap between a proof of concept and something fit for production. Businesses at different stages of their data maturity journey struggled with different aspects of the production lifecycle, partly because different data platforms and tools leave different gaps. Most of the table agreed that data platforms are far more adept at supporting proofs of concept than at supporting production deployment.
Sourcing the data needed
Part of the challenge of implementing MLOps in a tech stack is finding and using the data sets required for success. This was a common hurdle for members of the roundtable, as the task usually falls at the start of a project, when capacity is minimal and there is eagerness to demonstrate a proof of concept to decision makers.
There were calls from the roundtable to create data catalogues and dictionaries for internal use, as well as to improve the storytelling and data literacy capabilities of the team. Some felt they already knew where their data was located, with one business representative explaining how their organisation had been actively migrating from an on-premises system to a cloud platform. This transformation had taken over two years and was difficult, but because the data team were so involved in the process, they knew exactly where the data sets they required were and understood the lineage of the data.
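As a rough illustration of what an internal data catalogue with lineage might record, here is a minimal Python sketch. It is not something presented at the roundtable: the class names (`DatasetEntry`, `DataCatalogue`), the fields and the example data set names are all hypothetical, and a real organisation would more likely adopt an existing catalogue tool than build its own.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """One record in a hypothetical internal data catalogue."""
    name: str
    location: str          # where the data lives, e.g. a table or bucket path
    owner: str             # team to ask when the original author has left
    description: str = ""
    upstream: list[str] = field(default_factory=list)  # names of source data sets


class DataCatalogue:
    """In-memory catalogue: register data sets, search them, trace lineage."""

    def __init__(self) -> None:
        self._entries: dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        self._entries[entry.name] = entry

    def find(self, keyword: str) -> list[DatasetEntry]:
        """Return entries whose name or description contains the keyword."""
        kw = keyword.lower()
        return [
            e for e in self._entries.values()
            if kw in e.name.lower() or kw in e.description.lower()
        ]

    def lineage(self, name: str) -> list[str]:
        """Return every upstream data set name, nearest first, without repeats."""
        seen: list[str] = []
        queue = list(self._entries[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            # Follow the chain only for data sets the catalogue knows about.
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen


# Hypothetical example entries (assumed names, not from the article).
catalogue = DataCatalogue()
catalogue.register(DatasetEntry(
    "raw_orders", "warehouse.raw.orders", "data-platform", "Order events"))
catalogue.register(DatasetEntry(
    "clean_orders", "warehouse.curated.orders", "analytics",
    "Deduplicated orders", upstream=["raw_orders"]))
catalogue.register(DatasetEntry(
    "churn_features", "warehouse.ml.churn_features", "ml-team",
    "Inputs for the churn model", upstream=["clean_orders"]))

print(catalogue.lineage("churn_features"))
print([e.name for e in catalogue.find("orders")])
```

Recording an owner and upstream sources for each data set is one way to keep knowledge of where data lives from depending on any single team member.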
A smaller, but by no means insignificant, hurdle faced by the group was team member churn. When an established member of the team left, they would often leave behind a large knowledge gap that slowed tasks such as collating the right data. Higher churn also slows response times to problems and reduces the ability to fix them before they scale, and this is arguably something that ML will not be able to address without the human skills behind it.