It is not often that metadata is headline news. Revelations about the data surveillance capabilities of America’s National Security Agency (NSA), however, brought a previously-hidden aspect of data management to the attention of a mass audience. While most will have cared little about the technical dimensions of this activity - as opposed to its political implications - the news will have piqued the interest of a group of business executives. Anybody who was previously struggling to demonstrate the power and potential of master data management (MDM) to their organisation will immediately have found a useful reference point.
“The NSA revelations have opened a lot of eyes,” says Dennis Moore, senior vice president and general manager, MDM, Informatica. While noting that, “there is a level of irony about different governments accusing each other,” he welcomes the elevation of this back office activity to front-page status. As he notes: “Most people can’t even spell MDM.”
Put simply, the spy agency’s trawling expeditions through phone, email and internet records are not dissimilar to the way commercial organisations need to identify what data resides in their own systems, albeit at a different scale. If a business needs to be sure it is compliant in the way it handles personally identifiable information, for example, it will need clear data definitions that allow multiple instances of a record to be integrated with confidence.
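To make the idea of a shared data definition concrete, here is a minimal sketch in Python of how multiple instances of one customer record might be consolidated. It is an illustration only, not a description of any vendor's product: the `CustomerRecord` fields, the choice of a normalised email address as the matching rule, and the system names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CustomerRecord:
    source: str      # hypothetical name of the system holding the record
    name: str
    email: str
    phone: str = ""


def match_key(record):
    """Assumed data definition: two records describe the same person
    when their email addresses match after trimming and lower-casing."""
    return record.email.strip().lower()


def consolidate(records):
    """Group records by match key and merge each group into one master record.

    The first non-empty value seen for each field wins; every contributing
    source system is listed so the master record can be traced back.
    """
    masters = {}
    for record in records:
        key = match_key(record)
        master = masters.setdefault(
            key, {"email": key, "name": "", "phone": "", "sources": []}
        )
        master["name"] = master["name"] or record.name.strip()
        master["phone"] = master["phone"] or record.phone.strip()
        master["sources"].append(record.source)
    return masters


if __name__ == "__main__":
    records = [
        CustomerRecord("crm", "Ann Lee", "Ann.Lee@example.com"),
        CustomerRecord("billing", "A. Lee", " ann.lee@example.com ", "555-0100"),
    ]
    for master in consolidate(records).values():
        print(master)
```

In practice matching rules are rarely this simple, but the point stands: without an agreed definition of what makes two records "the same", the two entries above would be treated as different people.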
Another effect of the NSA exposure will be on data transfers between the European Union and the United States. That will put pressure on cloud services providers, which, Moore notes, have just enjoyed a two-year boom.
“A couple of years ago, CIOs didn’t trust the cloud - now it is very difficult as a start-up to get funded if you are not cloud-based. Even if companies may not be planning a fully-fledged migration into the cloud, they are looking at it for some functions,” he says. Some business critical cloud services providers have pre-empted this issue. Salesforce now has a data centre in the UK, although not in France where the rules for handling personal data are tighter.
Says Moore: “There are some major differences between Europe and the US, such as the higher level of cloud services adoption in America. At the recent Informatica World event in Las Vegas, I asked if clients were using MDM with Salesforce.com and 90 per cent of US delegates said yes, against only 20 per cent of those from Europe.”
Ensuring data is consistent across both on-premise and cloud-based applications has been a challenge, not least because of the difficulty of implementing data mastering tools on third-party systems. By partnering to ensure its MDM system can operate in a cloud-based environment as well, Informatica has helped to close a gap.
It has also been fortunate with the timing of its data masking solution. “We have a technology to mask data and prevent access by people other than those authorised, so for example an account manager can see a phone number, but others cannot, even if it is stored in Salesforce.com,” says Moore. “If you can see data on a screen, it is not secure.” This system was brought to market just ahead of the deliberations in Brussels over the new Data Protection Regulation, which is likely to make this type of masking part of new compliance routines.
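The principle Moore describes, where an authorised role sees a phone number and everyone else sees a masked value, can be sketched in a few lines of Python. This is a hedged illustration of role-based masking in general, not Informatica's implementation; the role names, the `AUTHORISED_ROLES` set and the rule of keeping only the last two digits are assumptions chosen for the example.

```python
# Hypothetical set of roles allowed to see the unmasked value.
AUTHORISED_ROLES = {"account_manager"}


def mask_phone(phone):
    """Replace every digit except the last two with '*', keeping punctuation."""
    total_digits = sum(ch.isdigit() for ch in phone)
    seen = 0
    masked = []
    for ch in phone:
        if ch.isdigit():
            seen += 1
            # Only the final two digits are left readable.
            masked.append(ch if seen > total_digits - 2 else "*")
        else:
            masked.append(ch)
    return "".join(masked)


def view_phone(phone, role):
    """Return the phone number as the given role is allowed to see it."""
    return phone if role in AUTHORISED_ROLES else mask_phone(phone)


if __name__ == "__main__":
    print(view_phone("555-0142", "account_manager"))  # 555-0142
    print(view_phone("555-0142", "support_agent"))    # ***-**42
```

The masking is applied at the point of display, which is the sense of Moore's remark that data visible on a screen is not secure: the stored value is unchanged, but what each user sees depends on their role.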
For Moore, the real issue is not one of technology or legislation. “What we have to decide in society is how much convenience we want compared with privacy and security. What balance will be acceptable? We can’t have total convenience and also be totally private,” he says. At the moment, there is a tension between what is free online and what needs to be secured.
“The problem is that there is a culture among some young people that it is everybody’s information and that ‘information wants to be free’,” he argues. “Technology can only take you so far - you also have to train people to respect data’s value and view these tools as empowering.”
The upside of this new culture can be found in new open sources of information and in the possibilities of crowd-sourced information, which Moore strongly advocates. “Waze is a great example of crowd-sourced data,” he says. “It doesn’t require every road to be hard-wired to provide traffic updates, because every user is the sensor commenting on traffic conditions. So its maps are more accurate than Google’s.”
He cites as an example the three main roads he uses in California - the H101, El Camino Real and H280, all of which are classified the same way by Google, even though the Camino is highly regulated with traffic lights and cars on the H280 can usually travel 30mph faster than on the H101. When Oracle’s staff all head home using the latter road, it becomes gridlocked, which Waze users can see and avoid.
One issue this raises is the authoritativeness of any piece of data. Moore believes that current commercial data providers are unlikely to exist in five years’ time, put out of business by newly-opened transactional data sources and crowd-sourced validation of data, at both a personal and a commercial level.
“Who’s going to make sense of all that data? That is what Informatica exists to do. It can connect to any type of data, including sensors, and bring it all into a single data model regardless of the source format. Then it cleans the data up, integrates it, runs it through information lifecycle management and data security processes,” he points out.
While personal information may be the headline act in this new data world, there are other opportunities for commercial benefit which bring significant MDM challenges in their turn. Says Moore: “A very big Wal-Mart store may stock 150,000 SKUs - online there are 8 million items. Your product manager has to be able to manage that long tail and to provide all of the information a customer needs for their searches, as well as what type of box it ships in, the number of items in the box, etc. That is driving adoption of product information management.”
In this retail world of the “endless aisle”, manufacturers are being required by distributors not just to deliver goods where they are needed, at the right time and at an agreed price, but also to provide all of the data that goes into those online systems. “If you are supplying to Amazon, for example, they have got a lot of product filters and you have to provide them with information that fits into that hierarchy, as well as photos, videos, etc,” he notes.
Mastering data across the supply chain, as well as across partners, employees and customers, is pushing MDM higher up the corporate agenda as its commercial implications are recognised. All of which is keeping Moore busy - he has racked up 250,000 frequent flyer miles on United Airlines travelling between the US, UK, France and Germany - and is also proving that you do not need to be on the hunt for bad guys to need your data to be in good condition.