Statistics 1: eHarmony 0. But what does the result mean for big data?

David Reed, director of research and editor-in-chief, DataIQ

Has a recent ruling by the ASA put the entire practice of data storytelling on notice? You might think the issue of how an online dating service markets itself has only limited relevance. But if you consume the output from data analysts via well-crafted presentations, you should check the fine print before making a decision, or at least ask some very direct questions.

The background is relatively simple: eHarmony makes great play in its advertising of the scientific basis for its matchmaking. Its algorithm is built on a study of the personality traits and values of 50,000 married couples in 23 countries. Models have been built that score the compatibility of members based on their answers to a detailed questionnaire, with the aim of getting the best possible fit between partners.

In an era of swipe-right instant gratification from dating, the lure of long-lasting love is clearly still strong. But in emphasising the “brains behind the butterflies”, eHarmony may have overreached itself. I have commented before on the way it conflated chemistry and data science, and it seems I was not the only one to be sceptical.

David Lipsey, joint chair of the All-Party Parliamentary Group on Statistics, lodged a complaint with the ASA last June on the grounds that, “phrases like ‘scientifically proven’ should be confined to claims that are just that, not used in crude puffery designed to lure in those longing for love. This is a new form of fake news.” The ASA agreed and has ruled that the ad was misleading, despite the dating service’s submission of several academic models to support its defence.

What makes this important to anybody relying on data and analytics is the basis for that ruling. Firstly, it noted that the data used was skewed towards eHarmony users who had been incentivised to tell it about their successful relationships. This meant there was no proper comparison with the outcomes from people who met their partner through other channels.

Secondly, the scores for marital satisfaction were described as not statistically significant within the very study used to support eHarmony’s claims.

These are the very points which any analyst needs to be quizzed about when providing business-critical insight. Is there a universe (or control group) against which to compare the test group in order to understand how meaningful any behaviour actually is? And does that behaviour have real significance or is it within anticipated boundaries?
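
For readers who want to see what those two questions look like in practice, here is a minimal sketch of one common check, a two-proportion z-test, which compares a success rate in a test group against a control group and asks whether the gap is bigger than chance alone would explain. The function name, the group sizes and the counts are all hypothetical and invented purely for illustration; none of them come from eHarmony's study or the ASA ruling, and a real analysis would also need to deal with how the groups were recruited.

```python
from math import sqrt, erfc


def two_proportion_z_test(successes_test, n_test, successes_control, n_control):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    p_test = successes_test / n_test
    p_control = successes_control / n_control
    # Pooled rate under the assumption that both groups behave the same
    pooled = (successes_test + successes_control) / (n_test + n_control)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_control))
    z = (p_test - p_control) / std_error
    # Two-sided p-value from the standard normal distribution
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical counts: people reporting a satisfying relationship
# in a test group versus a control group of the same size.
z_small, p_small = two_proportion_z_test(540, 1000, 520, 1000)
z_large, p_large = two_proportion_z_test(600, 1000, 520, 1000)

for label, z, p in [("54% vs 52%", z_small, p_small), ("60% vs 52%", z_large, p_large)]:
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label}: z = {z:.2f}, p = {p:.4f} ({verdict} at the 5% level)")
```

With these invented numbers, a two-point gap between groups of 1,000 does not clear the conventional 5% threshold, while an eight-point gap does. Neither result means anything without a credible control group in the first place, which is exactly the first question to ask.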

There is no more exciting story to tell than meeting and falling for the love of your life, which is why services like eHarmony thrive. But just as when putting your heart on the line, staking significant business resources without understanding the real odds is risky. So listen to the story first, then check the numbers it is based on.

Please note that blogs are the sole view of the author and that they are not necessarily the view of IQ ddg Ltd and should not be interpreted as advice. Please read our full disclaimer.

Knowledge and strategy director, DataIQ
David is developing the framework for soft skills and career development among data and analytics practitioners. He continues to be editor-in-chief and research director for DataIQ.