Big data is attracting followers faster than any data activity in the past. If its appeal is not to end up burning those worshippers, however, it needs to decide whether to stay outside of the existing corporate data management function or join the party, says David Reed.
Got big data? Then you’ve got problems. The value of big data is only released when it is analysed and the findings are put to work in the organisation. Making that happen is just one of your problems - a report from the McKinsey Global Institute earlier this year highlighted a significant shortfall in the data scientists needed to carry out this work, citing a lack of around 140,000 in the United States alone.
Before your analysts can get to work - assuming you have found them, that is - they need an environment in which to explore those big data sets. Finding the data itself is not the problem. McKinsey coined the charming term “exhaust data” to typify the massive data flows from social media, mobile apps and the like. This tends to sit in any system that an organisation’s customers or fans get involved with and can therefore be extracted.
But that creates the next problem - those operational systems are not optimised for analytics. Instead, a specific big data analytical data warehouse is required. And this is where one of the emerging strategic issues can be seen - should companies that want to leverage their big data do so within their existing analytical infrastructure or turn instead to a new generation of tools that sit outside of this space?
The answer each company comes up with may be driven as much by internal politics as it is by pragmatism. IT departments are often working at the limits of their capacity already. With the value of big data analytics still more theoretical than real, that function may want to avoid the potential bear trap of taking on this new realm. Also, the business model for conventional, structured data does not lend itself readily to the exponential explosion of big data.
Michael Hiskey, vice-president, marketing and business development at Kognitio, agrees with the point about when data delivers value, but is wary about where this will happen. “Our stance is that it is not about the size, it is what you do with it that counts. Big data does not equal Hadoop, even though 60 per cent of people we talk to just know about them. They are part of the story, but it is not all just down to them.”
Created by Doug Cutting and Mike Cafarella and developed extensively at Yahoo! to enable searches within massive volumes of data, Hadoop is an open-source framework that anybody who cares to can build on as a big data tool. That in itself is a source of concern to conventional IT departments, since it means a big data solution may not come with the financial stability and development path of standard analytical data warehousing solutions.
Not that this is stopping businesses from rushing towards big data, which is exerting the same fascination as fire does for some people. They know it may be useful and benefit them, but it could also cause harm. Hiskey believes the adoption curve for big data could surpass even that of the Internet, with a potential six-month horizon from consideration to implementation in many cases.
Nigel Sanctuary, head of propositions at Kognitio, believes this rapid rate of adoption is closely aligned to the new infrastructure being considered. “At big data conferences, what I am picking up on is that people don’t know much about it, but they know it is not going to happen in their core operating environment and warehouse. In the cloud, however, it is easier to get going,” he says.
This is one of the critical political decisions that any company wanting to pursue big data has to make. As Sanctuary points out, “databases are not designed for analytics,” whether the data is structured or unstructured. One of the ways to make sense of big data volumes is by applying machine learning algorithms. “That requires vast amounts of hardware which most companies can’t afford,” he says.
In many respects, the new big data analytics environment is not that different as a model from the conventional analytics warehouse that Kognitio has been delivering for years. Apart from its scale, it is still a solution that can be on-premise, hosted or in the cloud, and that leverages the power of in-memory analytics to rip through high volumes of data.
Hiskey says that a cloud-based solution also removes many barriers to entry. “We’re expending a lot of energy talking about big data for small companies. It is apparent that companies of all sizes are trying to do something with all the data they can get hold of,” he says. As digital natives enter management, their expectations are that data will be everywhere and available for analysis. “Retention rates for data are going to go to 100 per cent,” he predicts.
This already happens at Facebook, which never deletes any data (although this is one of the areas of concern for regulators and may eventually have to change). As this degree of retention becomes ever more possible, it will trigger greater innovation. Hiskey points to PlaceIQ as an example - it has divided the entire world into 100-metre squares, mashed up all the available data on each square and applied a timeline to enable hyper-local actionable intelligence. That business could not exist without big data and a cloud environment in which to process it.
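To make the gridding idea concrete, here is a rough sketch of how location events might be assigned to squares of roughly 100 metres and counted per hour. It is an illustration only, not PlaceIQ's actual method: the function names, the flat-earth approximation and the figure of about 111,320 metres per degree of latitude are all assumptions made for the example.

```python
import math
from collections import Counter

# Assumption: roughly 111,320 metres per degree of latitude (approximation).
METRES_PER_DEGREE_LAT = 111_320.0


def cell_id(lat, lon, cell_size_m=100.0):
    """Return a (row, col) pair for the approximately square cell containing a point."""
    row = math.floor((lat + 90.0) * METRES_PER_DEGREE_LAT / cell_size_m)
    # A degree of longitude shrinks towards the poles, so measure it at the
    # southern edge of the row; every point in a row then uses the same width.
    row_lat = row * cell_size_m / METRES_PER_DEGREE_LAT - 90.0
    metres_per_degree_lon = METRES_PER_DEGREE_LAT * math.cos(math.radians(row_lat))
    col = math.floor((lon + 180.0) * metres_per_degree_lon / cell_size_m)
    return row, col


def count_by_cell_and_hour(events, cell_size_m=100.0):
    """Count events per (cell, hour); each event is a (lat, lon, hour) tuple."""
    counts = Counter()
    for lat, lon, hour in events:
        counts[(cell_id(lat, lon, cell_size_m), hour)] += 1
    return counts


# Hypothetical events: two a few metres apart, one about a kilometre north.
events = [(51.5, -0.12, 9), (51.50005, -0.12, 9), (51.51, -0.12, 9)]
counts = count_by_cell_and_hour(events)
```

The point of the sketch is the shape of the data rather than the geometry: once every record carries a cell and a time bucket, any other data set keyed the same way can be layered on top.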
But not everybody agrees that big data automatically means a new analytical warehouse outside of the existing corporate infrastructure. “Big data is very consistent with our active data warehousing philosophy. It is our goal to be the point of integration and analysis for real-time insights to enhance interactions with customers,” as Stephen Brobst, chief technology officer at Teradata Corporation, told DataIQ last year.
That should be reassuring for large enterprises which have been investing heavily in their data management and customer insight functions over the last few years. Proven value has emerged from those activities. If big data sits under the same umbrella, it could help to persuade senior executives that it, too, will end up showing a strong payback.
For many vendors in this space, however, big data is an opportunity, rather than a threat. “Is this new for us and a new direction? We are geared up to look at it as business as usual,” says Mark Dunleavy, sales director UK at Informatica. “For us, this is just about new data sources.”
Companies may have been saying for decades that the customer is king, but until recently they have had only a limited view of those customers. Social media and mobile apps are opening up new realms of behavioural data that provide an opportunity to gain much deeper insight and to interact with those customers in new ways.
“Our older, mature customers are looking at big data as an extension of their data management strategy. It is ending up in their data warehouse or branches of their CRM systems,” says Dunleavy.
Even so, migration to the cloud for big data analytics does represent a significant new market, and Informatica already has over 20 cloud deployments. Typically, however, these are not entirely standalone. “They are coming to us because they want the ability to connect their traditional data sets which are on-premise with the new off-premise data,” says Dunleavy.
Behavioural data on its own may be useful for some specific functions, such as website optimisation and customer experience, but it becomes much more powerful and an enterprise-wide asset if it can be linked with everything else that is known about the customer. Big data may have volume, but its value is likely to be released through integration, not isolation.
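As a minimal sketch of what that integration means in practice, the example below attaches web behaviour to existing customer records through a shared key. Everything in it is hypothetical: the customer IDs, the field names and the `enrich` function are assumptions chosen for illustration, not any vendor's schema or product.

```python
from collections import defaultdict

# Hypothetical on-premise customer records, keyed by customer ID.
crm = {
    "c1": {"name": "A. Example", "segment": "loyal"},
    "c2": {"name": "B. Sample", "segment": "new"},
}

# Hypothetical behavioural events captured off-premise (e.g. from a website).
events = [
    {"customer_id": "c1", "page": "/pricing"},
    {"customer_id": "c1", "page": "/support"},
    {"customer_id": "c3", "page": "/home"},  # no matching customer record
]


def enrich(crm_records, behaviour_events):
    """Return a copy of each customer record with that customer's page views attached."""
    pages_by_customer = defaultdict(list)
    for event in behaviour_events:
        pages_by_customer[event["customer_id"]].append(event["page"])
    return {
        customer_id: {**profile, "pages_viewed": pages_by_customer.get(customer_id, [])}
        for customer_id, profile in crm_records.items()
    }


profiles = enrich(crm, events)
```

Events with no matching customer, such as the one for `c3` here, simply fall out of the result. Deciding what to do with that unmatched behaviour is part of the integration work the vendors describe.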