Government is putting money - and the force of law - behind the open data movement. As a result, commercial organisations are having to look at releasing data sets, just as the public sector has been required to do. David Reed looks at what this might mean for data controllers.
Open data sounds like one of those ideas everybody agrees with in principle, but seldom acts on in practice. As with keeping to the speed limit, the benefits seem clear - until you are in a hurry to get things done, when it can seem like an obstacle to progress. Yet advocates of open data see it in precisely the opposite terms - current restrictive data policies inhibit progress, which would be better served by releasing data openly.
Government thinking is clearly moving towards open data, with the creation of the www.data.gov.uk website releasing many data sets freely and a £10 million investment in the Open Data Institute (ODI). This latter initiative could be critical to the success of the open data movement itself. While there may be more benefits to be realised from existing data, there is a far bigger target to aim at in the expansion of data usage of all kinds.
To that end, the ODI celebrated Open Data Day (yes, it really exists - 23rd February) with the announcement of an £850,000 programme to encourage SMEs and start-ups. The Immersion Programme is a kind of mini X-Factor for data mash-ups. It will comprise a series of nine-month seasons, each with a specific vertical focus (the first being Crime and Justice).
During each season, groups of experts and participants come together to shape the challenges, followed by a development phase in which new ideas for solutions are worked on. A creation and innovation weekend follows, in which these products are developed, tested and judged. Successful participants can win up to £25,000 in seed money and also be included in the ODI incubation programme. After review, funded ideas will then get access to further support.
ODI’s CEO, Gavin Starks, said: “This new programme is a great way to help ideas emerge. All new businesses are hard work and open data businesses have their own specific challenges. We will help shorten the path between ideas, experts, and funding: applying structured and rapid feedback within a time-limited programme that will help surface and refine those which show the greatest promise.”
It is clear that open data advocates are convinced that openness is in itself a spark that will ignite a new wave of data innovations. That is certainly a belief of Ctrl-Shift, which is currently researching a white paper on what happens when open data meets personal information.
It has already identified one issue which could limit the extent to which openness happens - the algorithms that are an essential part of understanding data. As the organisation notes in a recent blog post, algorithms are proprietary and closed, which leads some to try to second-guess how they work. “There’s already an entire mini-industry out there devoted solely to understanding and gaming the algorithms that determine what answers Google search engines serve up, for example,” writes Ctrl-Shift.
Challenges to that model are starting to emerge, with companies contesting where Google has listed them in its search results. What seems likely to happen in the short term is an increase in black box technologies to offset the rise of openness in data - commercial companies need to keep something of value from their investment, after all.
Balancing open data with commercial reward will become a new consideration for most major data controllers. ODI recently consulted on a guide to making this business case and will shortly publish the resulting framework.
“People that hold data have to have an incentive to make it open,” says Jeni Tennison, who is leading the project at ODI. “To do that, they have to have a business case to sell upwards in their hierarchy or to meet our human need for a rational explanation of why we are doing something.”
Evidence of the need to put this rationale in place can be found in the public sector where organisations have been mandated to consider data for open publication. “The big problem is the way it has been done - it has been a top-down directive. That means the people who do it don’t think about the best way to make data open - they just look to get it out quickly and easily. That can potentially cause problems for the users of that data which might not be accessible or easy to use,” says Tennison.
As with any data set, maintaining it and ensuring it is up-to-date is critical. Open data will only yield its value if sources are reliable and sustainable. Equally, innovations are only likely if developers believe that the data source will continue to be around - few will want to build services that become one-offs.
“To stimulate the demand for open data and help business to get on top of it, we need to ensure that supply and ensure data is provided in a way that is easy to use and has structure and meta-data - all the things a business user needs,” says Tennison.
This is an important issue on the supply side of open data, since many organisations will never have produced data for external consumption, only for their own internal needs. If a simple, consistent format and frequency of release can be established, it will minimise the level of customer servicing required.
Tennison is critical of the fact that little attention has been given to the mechanics of how an open data market will work. “In the open data legislation, not much is said about the systems needed to open up data and how to run them, like the fact you need extra information that enables a user to do clever things with it.”
At present, there is no commonly agreed standard format for the provision of open data. That is one aspect which is at the heart of the midata initiative which, with Government backing, will do much to stimulate the demand side. It talks about machine-readable formats for data - agreeing a standard will go a long way towards removing uncertainty about how telcos and energy companies should set up their databases to release information to consumers.
Advocates of open data are convinced that the release of more information is a good thing economically, socially and even for the environment. A lot of effort is currently going into making the business case to ensure that the supporting legislation gets turned into genuine action. For commercial organisations being looked to for demand as well as supply, drawing on that energy and commitment could help to persuade any remaining doubters.