Get teenage kicks (with adult insight)

Big data is like teenage sex: everyone talks about it, very few really know how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it...

I realised recently I’d been doing it full-on for over 20 years, according to one definition, and patchily for 10 years using another. All the while learning from my mistakes and understanding what makes it really great. It’s probably worth pointing out the analogy sadly finished in the last paragraph.

So, what are the competing definitions? Let’s start by looking at the two main strands. Firstly, the technical, which is generally recognised as collecting, storing, processing and analysing vast amounts of data on lots of cheap hardware, for example Hadoop running on multiple Linux PCs. The size is key here, as this volume of data will not fit on current relational database systems. The second strand is the marketing/press aspect. This is, in essence, data analysis on any size of data. Statisticians, operational researchers, data miners, predictive analysts, data scientists and others have been doing this for years, only now it has a trendy new banner.

These different definitions often cause confusion when they get mixed up. The good news is that, as long as you’re aware of the two definitions, you won’t go far wrong. The even better news is that the keys to successful big data are the same for both definitions. So, I hear you ask with bated breath, what are the keys?

Number one, have an actual business problem to solve. For instance, if you want to reduce churn in banking, make sure the churn is clearly defined (eg, less than £100 and inactive for more than 90 days). There are always business problems to solve; the key is in identifying the right one. Secondly, make sure there are the ways and means to enact the solution. If you’re detecting unprofitable customers, is there a desire to address the root cause of the unprofitability?

This can be tough, as it will require a business change and quite possibly a cultural shift for your organisation. It’s not unknown for this stage to be ignored or left until the end, but without proper sight of the business change early on, it’s easy to go in the wrong direction and end up with an accurate but useless piece of analysis.
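To make the first key concrete, here is a minimal sketch of what a "clearly defined" churn rule might look like once it is written down, using the banking example from earlier. The `Account` fields, the function name and the reading of "less than £100" as an account balance are all assumptions made for illustration; a real definition would come from the business.

```python
from dataclasses import dataclass
from datetime import date

# Thresholds from the banking example in the text.
# Assumption: "less than £100" refers to the account balance.
BALANCE_THRESHOLD_GBP = 100.0
INACTIVITY_DAYS = 90


@dataclass
class Account:
    """Hypothetical account record; field names are illustrative only."""
    account_id: str
    balance_gbp: float
    last_activity: date


def is_churned(account: Account, as_of: date) -> bool:
    """Return True if the account meets BOTH conditions of the churn rule."""
    days_inactive = (as_of - account.last_activity).days
    low_balance = account.balance_gbp < BALANCE_THRESHOLD_GBP
    long_inactive = days_inactive > INACTIVITY_DAYS
    return low_balance and long_inactive
```

Writing the rule down this way forces the awkward questions early: is it "more than 90 days" or "90 days or more", and measured as of which date?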

Thirdly, ensure the results of any solution are measured. One of the main benefits of this approach is that it’s easy to determine which solutions are working well and which aren’t, thus enabling a test-and-learn methodology to be adopted. The other benefit is that it becomes far easier to highlight the success and return on investment of big data to key stakeholders. This is relatively easy to achieve if it’s planned from the outset. Retro-fitting measurement rarely works properly.
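One simple way to plan measurement from the outset is to hold back a control group and compare its outcome with the group that received the solution. The sketch below assumes retention is the outcome being measured; the function names and the example figures are hypothetical, and it deliberately leaves out any significance testing.

```python
def retention_rate(retained: int, total: int) -> float:
    """Share of customers in a group who were retained."""
    if total <= 0:
        raise ValueError("total must be a positive number of customers")
    if not 0 <= retained <= total:
        raise ValueError("retained must be between 0 and total")
    return retained / total


def uplift(treated_retained: int, treated_total: int,
           control_retained: int, control_total: int) -> float:
    """Difference in retention rate between the treated and control groups."""
    treated = retention_rate(treated_retained, treated_total)
    control = retention_rate(control_retained, control_total)
    return treated - control


# Illustrative figures only: 450 of 500 treated customers retained,
# against 400 of 500 in the control group.
example_uplift = uplift(450, 500, 400, 500)
```

A positive uplift suggests the solution is helping, a negative one that it is not; either way, the comparison only exists if the control group was set aside before the solution went live.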

The final piece is the data. This is the foundation and building material for any big data analytics project. And it’s almost always the toughest part. This is partly due to perception, as most people’s experience of collecting data is in Excel. And that’s easy, right? The reality is the complete opposite. This has a number of causes - chief among them is the complexity of the systems the data is sourced from. In essence, the more complex your IT estate, the more costly your big data solution will be.

The other big factor is the quality of the data (think about call centre reason codes) and the meaning of it. Quality is paramount, not quantity. I have built a highly accurate model of the corrosiveness of acids on human skin using just 28 records, the data having been created by a highly skilled research chemist. I’ve also seen rubbish come out of millions of records, mainly because the data was patchy and not properly understood.

My experience has shown me that, in essence, you need to understand your goals, plan for the outcome, measure your results and go for quality, not quantity. If only I’d known then what I know now…

DataIQ is a trading name of IQ Data Group Limited
10 York Road, London, SE1 7ND
