Although the topic of big data has enjoyed popularity for a number of years, the big data problem is far from solved and still poses a considerable challenge - and opportunity - for many companies. Big data analytics promise to help identify service problems, speed up manufacturing processes, improve customer service and more.
In the last year, two major points have become very clear to organisations embarking on big data projects - first, the importance of data cleansing, and second, the potential of tools that can not only process big data, but also simulate it and detect issues before they affect the whole organisation.
Problems of scope
Many people forget the sheer scope of big data, and that scope grows every day. Humans process a great deal of information each day, but in a data centre, data is generated and processed on an enormous scale: the devices in a single facility can produce at least 700 million pieces of “useful” information every day. This volume of information is quite simply incomprehensible to a human being.
Unfortunately, this information is also inconsistent, presented in different ways and exceedingly difficult to analyse in its raw form, requiring both machine learning and human supervision. First and foremost, data centres tend to use equipment from a variety of manufacturers, and data centre staff have little control over the brands of equipment that customers co-locate on their premises. Different types of equipment output information in different ways, making centralisation and analysis very difficult - while a human can quickly pick up on the difference between the UK day/month date format and the US month/day format, a machine may not, unless it is specifically trained to.
This is where data cleansing - using a combination of human input and machine learning - comes into play to harmonise the data and ensure that it is presented consistently. Human supervision is vital at this step to ensure that machine learning is working correctly and, for example, dates aren’t re-presented as month/day when the company uses day/month.
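As a minimal sketch of this harmonisation step, the snippet below normalises mixed date formats to unambiguous ISO 8601 dates. The vendor names and their format strings are invented for illustration - in practice these mappings would come from equipment documentation and be checked by human supervisors:

```python
from datetime import datetime

# Hypothetical mapping of device vendor to the date format it emits.
# Vendor names and formats are illustrative, not real equipment data.
VENDOR_DATE_FORMATS = {
    "vendor_uk": "%d/%m/%Y",   # day/month ordering
    "vendor_us": "%m/%d/%Y",   # month/day ordering
}

def harmonise_date(raw: str, vendor: str) -> str:
    """Parse a raw timestamp using the vendor's known format and
    re-emit it as an unambiguous ISO 8601 date."""
    fmt = VENDOR_DATE_FORMATS[vendor]
    return datetime.strptime(raw, fmt).date().isoformat()

# The same raw string means 3 April from a UK-style device,
# but 4 March from a US-style one.
print(harmonise_date("03/04/2024", "vendor_uk"))  # 2024-04-03
print(harmonise_date("03/04/2024", "vendor_us"))  # 2024-03-04
```

The key design point is that ambiguity is resolved by knowing the source of each record, not by guessing from the value itself - exactly the kind of rule a human supervisor can verify.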
Once data is harmonised, however, a baseline of activity can be set and big data engines can begin to watch for anomalous activity. To be most useful, this detection process must be tied to specific hardware, software, or even a specific location in the data centre. Calling an engineer with a message saying, “there’s a problem in the data centre, can you find it?” will result in a very frustrated engineering team.
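A toy version of this baseline-and-anomaly approach is sketched below, assuming simple numeric sensor readings tagged with a location. The rack names and temperature figures are invented; real engines would use far richer models than a standard-deviation threshold:

```python
import statistics

def build_baseline(readings):
    """Compute mean and standard deviation from a 'quiet' period
    when nothing is going wrong in the data centre."""
    return statistics.mean(readings), statistics.stdev(readings)

def find_anomalies(samples, baseline, threshold=3.0):
    """Flag samples more than `threshold` standard deviations from
    the baseline, keeping the device/location tag so an engineer
    knows exactly where to look."""
    mean, stdev = baseline
    return [
        (location, value)
        for location, value in samples
        if abs(value - mean) > threshold * stdev
    ]

# Illustrative temperature readings (degrees C) from a quiet week.
baseline = build_baseline([21.0, 21.5, 20.8, 21.2, 21.1, 20.9])
live = [("rack-12/psu-2", 21.3), ("rack-07/psu-1", 35.0)]
print(find_anomalies(live, baseline))  # [('rack-07/psu-1', 35.0)]
```

Carrying the location tag through the pipeline is what turns “there’s a problem somewhere” into “check the PSU in rack 7”.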
But once analytics engines are trained, localising problems to distinct areas can pay dividends, not only in pinpointing problems, but also in predicting future issues.
Introducing virtual canaries
Big data analytics doesn’t just have to analyse real data - it can look at simulated data as well. Once you have “clean” data and an established baseline for what “normal” customer activity looks like in the data centre, you can apply this approach to the future in two significant ways.
Firstly, once an organisation has solved an issue, it can carry out data forensics, looking at the factors which led up to the problem or outage. By identifying the precursors in the seconds, minutes or hours before an incident, these can be flagged next time and problems avoided before they occur.
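One simple way to mine these precursors is to count which event types recur in the window before each incident. The sketch below assumes an event log of `(timestamp_seconds, event_type)` pairs; the event names and timings are invented for illustration:

```python
from collections import Counter

def precursor_counts(events, incident_times, window=300):
    """Count which event types appear in the `window` seconds
    before each incident - recurring ones are candidates for
    early-warning flags next time."""
    counts = Counter()
    for incident_t in incident_times:
        for t, event_type in events:
            if incident_t - window <= t < incident_t:
                counts[event_type] += 1
    return counts

# Illustrative log: two incidents, both preceded by fan-speed warnings.
events = [
    (100, "fan_speed_warning"),
    (150, "login"),
    (900, "fan_speed_warning"),
    (950, "disk_write_retry"),
]
incidents = [300, 1100]
print(precursor_counts(events, incidents).most_common(1))
# [('fan_speed_warning', 2)]
```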
Secondly, organisations can establish virtual “canary” machines which simulate normal activity and are then configured to detect common precursors. This enables companies to pre-empt problems on live servers without putting any additional load or software on a customer server. Ultimately, the aim of a canary is to spot issues before they arise, applying fixes to live customer servers based on canary information before they turn into “real” issues. This is a hundred miles from the age-old IT call letting customers know that problems are occurring, but are being solved - or worse, a customer calling because they noticed a problem.
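The canary’s job can then be reduced to a periodic check of its own event counts against known precursor signatures. The signatures and thresholds below are hypothetical, standing in for whatever an organisation’s own forensics have revealed:

```python
# Hypothetical precursor signatures: event type -> count that has
# historically preceded an incident within one check interval.
PRECURSOR_SIGNATURES = {
    "fan_speed_warning": 2,
    "disk_write_retry": 3,
}

def canary_check(observed_counts):
    """Compare event counts observed on the canary against known
    precursor thresholds; return the alerts that should trigger
    pre-emptive fixes on live customer servers."""
    return [
        event for event, threshold in PRECURSOR_SIGNATURES.items()
        if observed_counts.get(event, 0) >= threshold
    ]

print(canary_check({"fan_speed_warning": 2, "login": 5}))
# ['fan_speed_warning']
```

Because the check runs entirely on the canary, no extra load or monitoring software touches the customer’s own servers - which is the point of the approach.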
This approach differs from Netflix’s Chaos Monkey methodology, in which software tools deliberately cause failures in the company’s cloud infrastructure to test the overall resilience of the set-up. However, both have merit in terms of automation, machine learning and improving customer experience.
A human future
Whether your organisation uses canaries, monkeys, or both, one thing is certain: there will always be a place for humans in the data centre. While machine learning is improving in leaps and bounds, there is simply no substitute for human experience and skills. Data centre professionals with years of experience can detect, analyse and diagnose information in a more selective and educated fashion than an automated assistant, even one which analyses 700 million pieces of information a day.
Running a big data analytics project in an environment as complex as the data centre can be a daunting process. However, by taking a logical and considered approach to it, and understanding where machine learning and automation can take the strain, IT professionals can not only help to solve problems more quickly, but, at some stage, tackle them before they start to impact customers. When this happens, analytics staff will not only have improved the technology within the business, but will also have gone a significant way to improving the very nature of the business itself.
Top tips for data centre big data analytics projects
1. Keep it clean: Data cleansing is a time-consuming, but vital part of the project - it’s essential to harmonise how the data is presented to the analytics system. Small differences in formatting and presentation can make a world of difference.
2. Establish a baseline: Monitor the data “exhaust” at a time when nothing is going wrong in the data centre. This will provide a sound basis for future evaluation and monitoring.
3. Look out for anomalies: Items falling outside of the baseline should be investigated. They may represent faulty or poor-performing hardware or software.
4. Practise data forensics: After an issue has occurred and been solved, track the precursors to the incident - they can provide vital indicators which will help you to avoid the next one.
5. Consider canaries: Running virtual machines simulating standard customer activity can help to model new situations or gain an early warning of forthcoming trouble.
6. Don’t abandon the people: Automation and machine learning are advancing in leaps and bounds, but there’s no substitute for human experience and knowledge.