Too much data? Never. Too little? Sometimes

David Reed, director of research and editor-in-chief, DataIQ

At a media lunch last week, I was asked to provide an example of where a company might have too little data. The question came up in the context of artificial intelligence and the need for training data sets. If you want to automate a task, you need to be able to show the machine what it looks like and what the outliers are, and to correct any wrong assumptions it might make before going live.

That usually demands a significant quantity of data, which is why most AI projects are currently directed at activities that generate it - online behaviour and personalisation, responding to service enquiries with chatbots and the like. These activities are resource-hungry, so automating them yields significant benefits. Just handling the data volumes themselves can be a challenge, which is why a lot of first-stage AI projects aim to identify and classify data sets before applying machine learning to them.

Large, established organisations can see a lot to be gained through the use of AI here. It is also a solution that will generate its own rewards that could not otherwise be unlocked. Sort the data properly so a machine can spot similar-looking things and you increase your capability to handle those things - chatbots mean more chat which means you can offer it across more services and query types, for example.

So who does not have the data to do this? My answer was start-ups, especially pure-play online businesses. They face a blind spot as a result of their data deficit which can be fundamentally challenging to their business model. After all, no investor is going to look at a new company which is not talking data, analytics and AI. Personalisation is the default assumption for the customer experience.

But when your customer numbers start at zero, that is not so easy to deliver. Even the online data streams from anonymous, non-registered non-converters, which have been behind all the new platforms, are going to become harder to use in the wake of GDPR. It is difficult to prove legitimate interest in processing data about people who have just come for a look. It is even harder to get their consent if they have decided not to register or buy.

Unless you can train the machine, there is a limit to what it can do for you without risking big mistakes. Even where your data supply is virtually unlimited, bad things can happen (think Facebook’s algorithmic news feed). At a small data scale, the risk of getting it wrong is magnified, since each data point carries more weight and may (or may not) be what the algorithm should be aiming at.

One way this might get resolved is through the emergence of off-the-shelf automation and AI which uses previously proven data and repeating patterns from other companies and industries. In some respects, online conversion is the same challenge whether you are a start-up or Amazon, so deploying a solution based on known types of behaviour could get your new brand up and running. Tuning can happen quickly as data flows build and, before long, you have bespoke models reflecting your specific niche.
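To make the generic-to-bespoke idea concrete, here is a minimal sketch of one way it could work, assuming a simple Beta-Binomial estimate of a conversion rate. It is an illustration, not a description of any particular product. The function name `blended_rate`, the benchmark rate of 2% and the prior strength of 500 pseudo-visits are all hypothetical values chosen for the example.

```python
def blended_rate(conversions, visits, prior_rate, prior_strength):
    """Posterior mean conversion rate under a Beta-Binomial model.

    prior_rate is a generic benchmark borrowed from elsewhere (an assumption),
    and prior_strength is how many pseudo-visits of trust we place in it.
    """
    alpha = prior_rate * prior_strength + conversions
    beta = (1 - prior_rate) * prior_strength + (visits - conversions)
    return alpha / (alpha + beta)


# Illustrative numbers only, not taken from any real data set.
GENERIC_RATE = 0.02     # hypothetical off-the-shelf benchmark
PRIOR_STRENGTH = 500    # hypothetical weight given to that benchmark

for conversions, visits in [(0, 0), (3, 50), (60, 1000), (5000, 100000)]:
    estimate = blended_rate(conversions, visits, GENERIC_RATE, PRIOR_STRENGTH)
    print(f"{visits:>6} visits: estimated rate {estimate:.4f}")
```

With no visitors at all, the estimate is simply the generic benchmark. As the start-up's own traffic builds, its observed rate increasingly outweighs the borrowed one, which is the evolution from generic to personal in miniature.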

That may seem counter-intuitive in the hyper-individualised realm which AI-supported and data-driven businesses are creating. But if the choice is having no data and delivering a bland customer experience, versus one which evolves from the generic to the personal, it could be a compromise worth making.

Please note that blogs are the sole view of the author and that they are not necessarily the view of IQ ddg Ltd and should not be interpreted as advice. Please read our full disclaimer
