Dataiku: open data test predicts London crimes

David Reed, director of research and editor-in-chief, DataIQ

An experiment to prove the predictive power of data has mapped trends in crime for London neighbourhoods in 2017. Using open data from police services, data science software provider Dataiku worked with mapping and analytics vendor ESRI to visualise where crime is likely to rise or fall.

The project followed a true data science approach, starting without a fixed hypothesis. “I didn’t know if I would find something predictable,” Nicolas Gakrelidz, senior data scientist at Dataiku, told DataIQ. “It was a test to see if it would work. I designed it to build a predictive model on open data with no precise idea of what I would find.”

Gakrelidz is in charge of technical partnerships and third-party data at the company, including open data. “The UK is more transparent than other countries with the police providing anonymised crime data using GPS co-ordinates,” he pointed out. The crime data was mapped to Census output area level, where each area covers between 1,000 and 5,000 properties. For each area, a prediction has been created that is visualised as a coloured bubble - the size shows the number of crimes predicted for 2017 while the colour indicates if this is an increase or decrease. The resulting map has been published online.
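The bubble encoding described above can be illustrated with a minimal Python sketch. It is not the code behind the published map: the function name `bubble_style`, the colour names and the minimum radius are all assumptions made for illustration. The only parts taken from the article are that size reflects the predicted number of crimes and colour reflects whether that is an increase or a decrease.

```python
import math


def bubble_style(predicted, previous):
    """Return (radius, colour) for one output area.

    Hypothetical helper: the colours and the minimum radius are
    illustrative choices, not taken from the published map.
    """
    # Scale the radius with the square root of the count so that the
    # bubble's area, not its radius, grows in proportion to the count.
    radius = max(2.0, math.sqrt(predicted))
    if predicted > previous:
        colour = "red"    # predicted increase
    elif predicted < previous:
        colour = "green"  # predicted decrease
    else:
        colour = "grey"   # no change
    return radius, colour
```

Under these assumptions, an area predicted to see 100 crimes after 80 in the previous period would be drawn as a red bubble of radius 10.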

The underlying model is built around three core data elements: reported crime data between 2011 and 2015, the type of crime (from shoplifting to violent assault) and nearby locations of interest (such as pubs, restaurants, shops, bus stops). “In downtown London, you find a lot of shoplifting, but less violence, so you will see a declining trend compared to other areas where you are seeing an increase in violence,” explained Gakrelidz. 
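The article does not say which algorithm the model uses. As a rough illustration of how yearly counts per crime type can produce a rising or falling prediction, the sketch below fits a straight line through the 2011 to 2015 counts for one area and extends it to 2017. The linear trend is a stand-in technique chosen for simplicity, the counts are invented, the name `linear_trend_forecast` is hypothetical, and the locations-of-interest element is left out entirely.

```python
def linear_trend_forecast(yearly_counts, target_year):
    """Least-squares straight-line fit over {year: count}, extended to target_year.

    A stand-in for the real model, which the article does not describe.
    """
    years = sorted(yearly_counts)
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(yearly_counts[y] for y in years) / n
    sxx = sum((y - mean_x) ** 2 for y in years)
    sxy = sum((y - mean_x) * (yearly_counts[y] - mean_y) for y in years)
    slope = sxy / sxx
    # A crime count cannot be negative, so clamp the forecast at zero.
    return max(0.0, mean_y + slope * (target_year - mean_x))


# Invented counts for a single output area, keyed by crime type.
area_counts = {
    "shoplifting": {2011: 120, 2012: 110, 2013: 100, 2014: 90, 2015: 80},
    "violence": {2011: 40, 2012: 44, 2013: 48, 2014: 52, 2015: 56},
}

forecast = {
    crime: linear_trend_forecast(series, 2017)
    for crime, series in area_counts.items()
}
```

With these invented figures the shoplifting forecast falls to 60 and the violence forecast rises to 64, which mirrors the kind of diverging trends Gakrelidz describes.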

Building the map is an extension of work he carried out in 2016 and part of a current trend. “We did a live project with the police in one UK county to optimise its response. We know there is an intention at local level to adopt this type of approach,” he said. Elsewhere, Accenture has been working with several police services on predictive crime mapping to improve their intelligence-led policing. “There is a lot of interest in this because of how budgets are under pressure.”

A more detailed explanation of the model was given by Gakrelidz at a recent meet-up on advanced analytics for public services. “It is about explaining what can be done when the data is available. The map gives a flavour of what you can expect to deliver,” he said. The UK’s lead on open data is an important component - Gakrelidz noted that less data with fewer details is publicly available in France, although there have been recent efforts to change this.

He also argued for the importance of bringing the test-and-learn principles of data science to bear on the problems faced by the public sector. “It is a different mindset - there are no good or bad ideas. It is about the approach if you want to optimise something,” he said. With crime prediction, more detailed data would be required to prioritise policing and response, but the test shows the potential power of applying data and analytics to this issue.

An expert commentator on all things data, David has been editor of DataIQ since its inception in 2011.
