Poor quality health data hinders analysis

Toni Sekinah, research analyst and features editor, DataIQ

Prescriptions of asthma medication have increased by 17% in six years, while prescriptions of antibiotics have decreased by 12% over the same period. These findings were made despite the poor quality of the public sector data used.

The research was carried out by Polymatica, a business intelligence and data science company, which was looking to find out if there was a link between external factors such as air quality or socio-economic status and the number of prescriptions. However, the researchers, who were looking at government GP data, were hindered by its poor quality: fields such as address and city were entered inconsistently, with spelling mistakes and abbreviations.

According to Mark Hinds, the chief executive of Polymatica, any conclusions were hard to come by as poor data quality limited the ability to understand the possible root causes of the discrepancies. He said: “We wanted to investigate the factors behind the findings, but we’ve been hindered by the quality of the data. What we’re able to see is that the message about reducing the number of antibiotics being prescribed is largely getting through. However, we can’t dig into why this has decreased, or why the level of asthma medication has risen.”

The inconsistencies suggested that values had been typed in as free text rather than selected from drop-down lists. This made it difficult to match up and aggregate the datasets and to connect them with pollution and socio-economic data. As a result, the data analysis was unreliable. Similar data quality problems have been identified in mental health patient records.

Despite the issues with the data, the researchers did find that GPs in Yorkshire prescribed 4.2 million items of asthma medication and 2 million items of antibiotics in 2017. In contrast, GPs in London prescribed 2.9 million items of asthma medication and 1.8 million items of antibiotics in the same year, even though London has a larger population and more GP surgeries.

Hinds went on to say: “Ultimately, poor data quality harms results and creates inconsistent insights. We can see Yorkshire is an anomaly in the UK, but we can’t begin to understand why that might be, and certainly couldn’t create a national strategy around what might be bad insights from bad data.”

Encouragingly, the CEO did give an example of excellent data quality in the health sector. Hinds said: “The government is taking the right steps on data quality but must do better. A good example is the British National Formulary for drug classification, which has allowed the NHS to collect information on prescriptions accurately. This is a great example of the gold standard in data quality that the government should be aiming to apply across the board to ensure it gets accurate insights.”