Primordial gravitational waves have been detected by scientists from the Harvard-Smithsonian Center for Astrophysics working at the South Pole. If their evidence is accepted, it provides strong support for the inflationary Big Bang model and for Einstein’s general theory of relativity. That is big science at work, and it is based on big data. It also offers an insight into how commercial data users need to approach the new data streams that are becoming available, and it points to one significant hole (black or otherwise) in their approach.
Immediately after the event that created the universe, everything in it expanded rapidly in a process astrophysicists call inflation. Gravitational waves from that era left their imprint on the cosmic microwave background, and it is that imprint which has now been detected using the BICEP2 telescope (Background Imaging of Cosmic Extragalactic Polarization).
Unlike looking at light waves, however, the latest findings had to be carefully extracted from the raw data sets captured in the pristine conditions of Antarctica. Scientists had to remove layers of “interference” picked up as the microwaves travelled across the Universe, including the distortions caused by gravitational lensing. That meant applying models of known sources so that, by the end, all that was left were the distinctive swirling patterns created in the first fractions of a second of time.
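To see the principle in miniature, here is a toy sketch in Python - with an entirely invented signal, foreground and noise, and no resemblance to BICEP2’s actual pipeline: fit a model of the known contaminant to the raw measurements, subtract it, and the faint pattern you were hunting for emerges in the residual.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(42)

# Invented example: raw data = faint signal + bright, well-understood foreground + noise.
x = np.linspace(0, 10, 500)
signal = 0.1 * np.sin(3 * x)          # the faint swirl we are hunting for
foreground = 2.0 * np.exp(-0.2 * x)   # the bright "interference" we can model
raw = signal + foreground + rng.normal(0, 0.05, x.size)

# Step 1: fit the known foreground model (amplitude and decay rate) to the raw data.
def foreground_model(x, a, b):
    return a * np.exp(-b * x)

params, _ = curve_fit(foreground_model, x, raw, p0=(1.0, 0.1))

# Step 2: subtract the fitted foreground; the residual is dominated by the buried signal.
residual = raw - foreground_model(x, *params)
print("correlation of residual with true signal:",
      round(np.corrcoef(residual, signal)[0, 1], 3))
```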
These are the proof the project went looking for. The whole effort has been informed by Einsteinian theory, which helped direct the scientists towards what they expected to find. Just as CERN went looking for the Higgs boson, these data scientists anticipated finding a signal buried in the noise of almost 14 billion years of space-time.
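That theory-first approach can be sketched as a matched filter - again a toy Python example with made-up numbers, not the collaboration’s method: when theory predicts the shape of the signal, you can slide that template along noisy data and let the correlation peak tell you where, and whether, the signal sits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented example: a theory-predicted template buried in heavy noise.
template = np.sin(np.linspace(0, 4 * np.pi, 200)) * np.hanning(200)
data = rng.normal(0, 1.0, 2000)
true_pos = 1200
data[true_pos:true_pos + 200] += 1.5 * template  # inject a copy, weak relative to the noise

# Matched filtering: correlate the expected template with the data.
# Theory tells you *what* to look for; the data tells you *where* it is.
scores = np.correlate(data, template, mode="valid")
print("expected position:", true_pos, "| detected position:", int(scores.argmax()))
```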
So what is the lesson for commercial big data? It is not about the need to capture and trawl through massive volumes of unstructured data - that is just the starting point for any project, be it scientific or commercial. It is only partly about having models that allow the bulk of that data to be sifted and organised into known entities. Critically, it is about having an over-arching theory which provides the organising framework for the project.
Consider social media analytics. At the moment, most big data efforts are focused on understanding either the connections between members of a network or the way in which messages (such as tweets about brands) are distributed across that network. Considered from the perspective of what BICEP has been doing, this is Newtonian physics - proving the gravitational effects of being a member (or outsider) of a social group.
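To make that concrete, here is a minimal sketch of the two analyses described above, using the networkx library on an invented toy network: first, who sits at the “gravitational centre” of the group; second, how a brand message spreads along its connections (a simple cascade model, chosen purely for illustration).

```python
import random
import networkx as nx

random.seed(1)

# Invented toy network of 200 consumers following one another.
G = nx.barabasi_albert_graph(n=200, m=3, seed=1)

# Analysis 1: connections between members - who sits at the "gravitational centre"?
centrality = nx.degree_centrality(G)
hubs = sorted(centrality, key=centrality.get, reverse=True)[:5]
print("most connected members:", hubs)

# Analysis 2: message distribution - a simple cascade in which each exposed
# member passes the brand message on with a fixed probability.
def simulate_spread(graph, seeds, p=0.1):
    reached, frontier = set(seeds), list(seeds)
    while frontier:
        node = frontier.pop()
        for neighbour in graph.neighbors(node):
            if neighbour not in reached and random.random() < p:
                reached.add(neighbour)
                frontier.append(neighbour)
    return reached

print("members reached from the top hub:", len(simulate_spread(G, hubs[:1])))
```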
So far, so good. But this approach has its limits - once you can see these patterns in the data, you start to realise they are constantly repeated. The ability to influence or change them is weak, though not necessarily because of the gravitational model in use.
Instead, it is the lack of any powerful social marketing theory which is preventing greater progress. Why are consumers attracted to the same types of message over and over again? What are the drivers of the behaviours which can now be tracked through big data? And how can those attractions and behaviours be influenced at a micro level in favour of a brand?
As far as I can tell, nobody is really trying to put together a consistent, sustainable theory that would deliver real benefits in the long term once supplied with the right proofs. Instead, practitioners are still star-struck by the sheer scale of the data available to them. (How many times have you heard the same unsubstantiated claim that more information has been created in the last two years than in the rest of human history?)
To progress, it is time to start thinking about the really big picture and just how little of it we currently understand.