Dirty data and virtual canaries: Lessons in data centre analytics

Although big data has been a popular topic for a number of years, the big data problem is far from solved and still poses a considerable challenge, and opportunity, for many companies. Big data analytics promises to help identify service problems, speed up manufacturing processes, improve customer service and more.

In the last year, two major points have become very clear to organisations embarking on big data projects - first, the importance of data cleansing, and second, the potential of tools that can not only process big data, but also simulate it and detect issues before they affect the whole organisation.

Problems of scope

Many people forget the sheer scope of big data, and that scope is growing every day. Humans process a lot of information each day, but in a data centre, data processing takes place on an enormous scale - the devices in a single facility can generate at least 700 million pieces of “useful” information each day. This amount of information is quite simply incomprehensible to a human being.

Unfortunately, this information is also inconsistent, presented in different ways and exceedingly difficult to analyse in its raw form, requiring both machine learning and human supervision. First and foremost, data centres tend to use equipment from a variety of different manufacturers, and data centre staff also have little control over the brands of equipment that customers co-locate in their premises. Different types of equipment output information in different ways, making centralisation and analysis very difficult. For example, while a human can quickly pick up on the difference between the UK day/month date format and the US month/day format, a machine may not unless it is specifically trained to.

This is where data cleansing - using a combination of human input and machine learning - comes into play to harmonise the data and ensure that it is presented consistently. Human supervision is vital at this step to ensure that machine learning is working correctly and, for example, dates aren’t re-presented as month/day when the company uses day/month.
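As a rough illustration of the date problem described above, the sketch below converts timestamps from differently formatted sources into a single ISO 8601 representation. The vendor names and their formats are hypothetical, invented for this example; in practice the mapping would be built and checked by people who know the equipment.

```python
from datetime import datetime

# Hypothetical mapping of equipment vendor -> timestamp format it emits.
# These names and formats are assumptions for illustration only.
VENDOR_DATE_FORMATS = {
    "vendor_uk": "%d/%m/%Y %H:%M:%S",   # day/month/year
    "vendor_us": "%m/%d/%Y %H:%M:%S",   # month/day/year
    "vendor_iso": "%Y-%m-%dT%H:%M:%S",  # already ISO 8601
}


def harmonise_timestamp(vendor: str, raw: str) -> str:
    """Return the raw timestamp as an ISO 8601 string.

    Raises ValueError for an unknown vendor or an unparseable value,
    so that a human can review the record instead of the code guessing.
    """
    if vendor not in VENDOR_DATE_FORMATS:
        raise ValueError(f"no date format registered for {vendor!r}")
    return datetime.strptime(raw, VENDOR_DATE_FORMATS[vendor]).isoformat()
```

The same raw string, such as `03/04/2024 10:15:00`, resolves to 3 April for the day/month source and 4 March for the month/day source. Raising an error for unregistered sources, rather than guessing, keeps a human in the loop at exactly the point where a silent mistake would be hardest to spot later.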

Once data is harmonised, however, a baseline of activity can be set and big data engines can begin to watch for anomalous activity. This activity and detection process must be tied to specific hardware, software, or even a specific location in the data centre to be most useful. Calling an engineer with a message saying, “there’s a problem in the data centre, can you find it?” will result in a very frustrated engineering team.

But once analytics engines are trained, localising problems to distinct areas can pay dividends, not only in pinpointing problems, but also in predicting future issues.
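A minimal sketch of the baseline-and-anomaly idea, assuming each reading is already harmonised and keyed by a device identifier so that any alert points at a specific piece of equipment. The device names, the use of a simple mean and standard deviation, and the threshold of three standard deviations are all assumptions for illustration; a production analytics engine would likely use something more sophisticated.

```python
from statistics import mean, stdev


def build_baseline(readings):
    """Summarise a known-good period.

    readings: dict of device_id -> list of numeric values (at least two
    per device). Returns device_id -> (mean, standard deviation).
    """
    return {dev: (mean(vals), stdev(vals)) for dev, vals in readings.items()}


def find_anomalies(baseline, latest, threshold=3.0):
    """Return (device_id, value, z_score) for readings outside the baseline.

    latest: dict of device_id -> most recent value.
    """
    anomalies = []
    for dev, value in latest.items():
        if dev not in baseline:
            continue  # no baseline yet; nothing to compare against
        mu, sigma = baseline[dev]
        if sigma == 0:
            continue  # baseline was perfectly flat; z-score is undefined
        z = (value - mu) / sigma
        if abs(z) > threshold:
            anomalies.append((dev, value, round(z, 2)))
    return anomalies
```

Because every result carries the device identifier, the message to an engineer can name the rack and component rather than simply reporting that something is wrong somewhere in the data centre.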

Introducing virtual canaries

Big data analytics doesn’t just have to analyse real data - it can look at simulated data as well. Once you have “clean” data and an established baseline for what “normal” customer activity looks like in the data centre, you can apply this approach to the future in two significant ways.

Firstly, once an organisation has solved an issue, it can carry out data forensics, looking at the factors which led up to the problem or outage. By identifying the precursors in the seconds, minutes or hours before an incident, these can be flagged next time and problems avoided before they occur.
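One simple way to approach this kind of forensics is to count which event types keep appearing in the window before each incident. The sketch below assumes events are available as (timestamp, event type) pairs; the ten-minute window and the event names used in any example data are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def precursor_counts(events, incidents, window=timedelta(minutes=10)):
    """Count, per event type, how many incidents it preceded.

    events: iterable of (timestamp, event_type) tuples.
    incidents: iterable of incident start timestamps.
    An event type is counted once per incident if it occurred at any
    point in the `window` leading up to that incident.
    """
    events = list(events)
    counts = Counter()
    for incident_time in incidents:
        seen = {
            etype
            for ts, etype in events
            if incident_time - window <= ts < incident_time
        }
        counts.update(seen)
    return counts
```

Event types that precede most incidents are candidates for precursor rules, although a human still needs to judge whether the link is causal or simply routine background activity that happens all the time.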

Secondly, organisations can establish virtual “canary” machines which simulate normal activity and are then configured to detect common precursors. This enables companies to pre-empt problems on live servers without putting any additional load or software on a customer server. Ultimately, the aim of a canary is to spot warning signs early, so that fixes can be applied to live customer servers based on canary information before they turn into “real” issues. This is a far cry from the age-old IT call letting customers know that problems are occurring, but are being solved - or worse, a customer calling because they noticed a problem.

This approach differs from Netflix’s Chaos Monkey methodology, where software tools deliberately cause failures in the company’s cloud servers to test the overall resilience of the cloud set-up. However, both have merit in terms of automation, machine learning and improving customer experience.
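The virtual canary idea described above can be sketched as a small check that runs a simulated workload on a dedicated machine and compares the resulting metrics against known precursor limits. Everything named here is an assumption for illustration: the metric names, the limits, and the stand-in workload, which simply times a trivial calculation rather than replaying real customer activity.

```python
import time

# Hypothetical precursor limits, e.g. derived from earlier forensics:
# metric name -> highest value considered normal.
PRECURSOR_LIMITS = {"latency_ms": 250.0, "error_rate": 0.02}


def simulated_customer_activity():
    """Stand-in for a scripted customer workload run on the canary."""
    start = time.perf_counter()
    sum(range(100_000))  # placeholder for real simulated requests
    latency_ms = (time.perf_counter() - start) * 1000
    return {"latency_ms": latency_ms, "error_rate": 0.0}


def run_canary(workload, limits=None):
    """Run the workload and return the names of any breached metrics.

    workload: a callable returning a dict of metric name -> value.
    Metrics missing from the workload's result are treated as not breached.
    """
    limits = PRECURSOR_LIMITS if limits is None else limits
    metrics = workload()
    return sorted(
        name for name, limit in limits.items()
        if name in metrics and metrics[name] > limit
    )
```

Calling `run_canary(simulated_customer_activity)` on a schedule would give an early warning from the canary alone, with no extra load or software on customer servers.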

A human future

Whether your organisation uses canaries, monkeys, or both, one thing is certain: there will always be a place for humans in the data centre. While machine learning is improving in leaps and bounds, there is simply no substitute for human experience and skills. Data centre professionals with years of experience can detect, analyse and diagnose information in a more selective and educated fashion than an automated assistant, even one which analyses 700 million pieces of information a day.

Running a big data analytics project in an environment as complex as the data centre can be a daunting process. However, by taking a logical and considered approach to it, and understanding where machine learning and automation can take the strain, IT professionals can not only help to solve problems more quickly, but, at some stage, tackle them before they start to impact customers. When this happens, analytics staff will not only have improved the technology within the business, but will also have gone a significant way to improving the very nature of the business itself.

Top tips for data centre big data analytics projects

1. Keep it clean: Data cleansing is a time-consuming, but vital part of the project - it’s essential to harmonise how the data is presented to the analytics system. Small differences in formatting and presentation can make a world of difference.

2. Establish a baseline: Monitor the data “exhaust” at a time when nothing is going wrong in the data centre. This will provide a sound basis for future evaluation and monitoring.

3. Look out for anomalies: Items falling outside of the baseline should be investigated. They may represent faulty or poor-performing hardware or software.

4. Learn from incidents: After an issue has occurred and been solved, consider data forensics - tracking the precursors to an incident can provide vital indicators which will help you to avoid the next one.

5. Consider canaries: Running virtual machines simulating standard customer activity can help to model new situations or gain an early warning of forthcoming trouble.

6. Don’t abandon the people: Automation and machine learning are advancing in leaps and bounds, but there’s no substitute for human experience and knowledge.

DataIQ is a trading name of IQ Data Group Limited
10 York Road, London, SE1 7ND