The volume of data about all of us as individuals is growing explosively. Business hasn’t been slow to recognise this, and the subsequent growth in analysis, segmentation, targeting and technology is now beginning to take hold. The benefits to business and the wider economy should be huge – according to research by the McKinsey Global Institute (MGI) and McKinsey's Business Technology Office, big data can generate value in many sectors.
For example, a retailer using big data to the full could increase its operating margin by more than 60 per cent. The report goes on to say that the public sector in the developed economies of Europe could save more than €100 billion in operational efficiency improvements alone by using big data, not including using big data to reduce fraud and errors and boost the collection of tax revenues.
So the drive to adopt big data has some big and powerful numbers behind it. In fact, the rise of social media interaction and online data – leading to a doubling of data assets within two years – means companies will be forced to embrace big data or face the threat of being left behind by competitors, according to our latest DataIQ Big Data research. Within a year, social media data will be the dominant type in use at 65.7 per cent of companies, closely followed by online marketing data (54.7 per cent). Volume is also a factor, with the mean growth rate of data volume reported at 38.9 per cent.
The head of John Lewis’s online operation has said the retailer no longer sees analysing customer information as “crunching data”, with the rise of big data making analysis simply a “normal way of working”. Aleem Cummins, release manager for johnlewis.com, said that by being able to take advantage of large-scale datasets which the site generates, John Lewis and its customers are seeing major improvements in its data strategy.
Cummins said: “In the past, I might have said 'crunching data', but now it seems more natural, it's just like an extension of what you do. So we do our analysis, but it's just something that we do that doesn't feel like you're crunching data. It just feels like this is the normal way of working now,” describing it as a "natural evolution to the new way of working".
The company recently implemented a system from US firm Splunk which enables it to search, monitor, and analyse machine-generated large datasets. Cummins said: “The more that goes into it, the more 'gold' that falls out the other end and, for us, the gold is the customer experience.”
"We're bringing in data that was always in different places, speaking in different languages. We're bringing it all into the ‘mothership’, and that's going to allow us to get a better view of the landscape, and give us opportunities to make improvements," he said. "Hopefully, our productivity will be better and our customer experience will be better.”
It is perhaps obvious that improved customer service, better targeting, greater efficiency and lower costs are all ultimately in the best interests of the consumer. But there is a problem, one as big as big data itself: the very personal relationship between individuals and their privacy.
To a greater or lesser extent we all value our privacy. For some, this extends to wanting very strict control over what others know and understand about them; for others, it’s more of a commodity to be knowingly exchanged for better deals, better service or fun. Regardless of the size of our own privacy “bubble”, we need to recognise that a boundary does exist between what we want others to know and what we don’t.
It’s much more than a desirable thing – Article 8 of the Human Rights Act states: “(1) Everyone has the right to respect for his private and family life, his home and his correspondence.”
From this fundamental right many other laws and controls have stemmed – currently UK legislation in the form of the Data Protection Act includes prescribed limits on how we gather data, what we can use it for and how long it should be kept.
The tension in the relationship between big data and the data subject comes down to today's ability to combine multiple, huge, disparate data sources in ways that were perhaps never intended. The well-documented case involving US retailer Target is revealing. Newspaper headlines focused on an angry dad in Minneapolis who stormed into a Target store demanding an explanation as to why his teenage daughter had been sent ads for maternity clothes and nursery furniture. She wasn’t pregnant – or was she? And how did Target know before her own family?
This isn’t the only example – social data is increasingly being incorporated into retail contact strategies without enough thought for the privacy consequences. These could go well beyond upsetting consumers – the ICO's appetite for punitive action is seemingly on the increase. The recent, more stringent, PECR guidelines also hint at a less tolerant regime. We wait with interest to see how the proposed EU General Data Protection Regulation evolves…
So what specifically should we be concerned about right now? There are at least three areas of the existing data protection legislation that may undermine the basic approach of big data programmes. The first is Principle 2 of the Data Protection Act: “Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes.”
Is it reasonable that when I post an update on Twitter or Facebook the update is then used by businesses to refine my profile for targeting? For seasoned data professionals and “switched-on” data exchangers this might be expected behaviour – but for many people this simply cannot be the case. According to a Deloitte report, “Data Nation 2013: Balancing growth and responsibility”, just 35 per cent of people are fully aware of how their information is collected and used by businesses.
Indeed, the report suggests that the collection of new types of data from sources including smartphones and social media is to blame for the fall in awareness of how data is used to personalise offers and services. The matter of specificity is of vital importance here and is probably roundly ignored when it comes to the aggregation of multiple sources for big data applications. This must be investigated and rectified to ensure that the data underpinning these applications can actually be used.
Secondly, Principle 3 of the DPA: “Personal data shall be adequate, relevant and not excessive in relation to the purpose or purposes for which they are processed.” When an individual posts social media data, or buys retail products, or simply browses the web, do they realistically expect that their data is being used for profile building or targeting?
It may be the case that an individual’s permissions have been gathered for parts of the data. But those permissions do not necessarily remain valid once the data has been aggregated in this way. In reality, how transparent is this to the individual? Aggregating large volumes of personal data may well lead the individual to regard the aggregate as excessive for the original purpose.
Thirdly, Principle 5: “Personal data processed for any purpose or purposes shall not be kept for longer than is necessary for that purpose or those purposes.” Data retention is a hot topic – data does need to be retained for many valid reasons, such as proof of purchase, tax reasons, legal holds and many others. However, according to the ICO, “you should not hold personal data on the off-chance that it might be useful in the future”. This appears to fly in the face of building large, aggregated pools of data for analytical purposes. De-personalisation may be part of the answer, but the guidelines here are also being tightened beyond the norm for many.
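For teams wondering what Principle 5 might look like in practice, one possible approach is to tie every record to a stated purpose and give each purpose its own retention window. The Python sketch below is illustrative only: the purpose names, the retention periods and the `Record` structure are all hypothetical assumptions, not figures taken from the Act or from ICO guidance.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical retention windows per purpose, in days. These figures are
# illustrative assumptions, not legal guidance or ICO recommendations.
RETENTION_DAYS = {
    "proof_of_purchase": 6 * 365,
    "marketing_profile": 2 * 365,
}


@dataclass
class Record:
    subject_id: str
    purpose: str
    collected_on: date


def is_expired(record, today):
    """Return True when the record has outlived the window for its purpose."""
    limit = RETENTION_DAYS.get(record.purpose)
    if limit is None:
        # Assumption: data with no recognised purpose has no basis for retention.
        return True
    return today - record.collected_on > timedelta(days=limit)


def split_by_retention(records, today):
    """Partition records into those to keep and those due for review."""
    keep = [r for r in records if not is_expired(r, today)]
    expired = [r for r in records if is_expired(r, today)]
    return keep, expired
```

Under these assumptions, a record held “on the off-chance” has no purpose entry and so is flagged straight away, which is the behaviour the ICO wording points towards. Whether flagged records are then deleted or de-personalised is a policy decision the sketch deliberately leaves open.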
Real big data programmes clearly have some massive potential benefits and some huge potential problems. So how do we make this troubled marriage of opportunity and privacy into a stable long-term relationship? The answer is straightforward – our data relationships with individuals must be clear and transparent, and must give control to the data subject. The point is that the individual must really understand what is happening, and it's unclear whether current approaches to data protection achieve that.
The good news is that, as public awareness of these practices increases and the concept of a value exchange between the data subject and organisations becomes commonplace, the reasonable expectation of use will change – perhaps leading to a more straightforward privacy regime. Until then, this relationship will continue to be a volatile and rocky one.