If you are attending today’s Cloudera Sessions at the Old Truman Brewery on Brick Lane in London, it could be because you want to understand how the Office for National Statistics is building its data science capabilities or the journey BT is on towards the democratisation of data and self-service analytics.
Alternatively, you may be in an organisation that has just completed a digital transformation and is now thinking about datafication as the next step. If so, you could be on the brink of the sixth wave of automation of decision making, a pet theory of Amr Awadallah, co-founder and CTO, Cloudera.
During an interview at Big Data LDN, Awadallah summarised the progression he sees. The first wave was knowledge transfer through language 100,000 years ago, followed by agriculture which helped to automate decisions on food, thereby releasing more time for other human activities. Some 3,000 years ago, the invention of maths and geometry helped to automate discovery and the scientific method.
From there, it is almost a direct line from the automation of manufacturing in the industrial revolution (wave four) to the automation of processes in the ongoing IT revolution (wave five). For Awadallah, it is the application of artificial intelligence and machine learning to big data sources that is unleashing the sixth wave. “There is strong interest in machine learning now with more organisations using it to leverage value from big data,” he said.
If humankind and society find themselves in the midst of this sixth wave, then it is a direct result of the last two decades of fifth-wave technology and digital transformation. Many organisations are still struggling to complete this and Awadallah believes time is running out for them. “Organisations need to move quicker or they will lose ground,” he said.
Crucially, Awadallah says that, “you can’t jump a step. You have to do step five - digital transformation - in order to do step six, otherwise the business will not be ready and you will not have the data.”
As an example of this step-by-step approach, he cites JP Morgan, which has digitised 30 years’ worth of contracts and moved all of its ongoing contractual arrangements into digital platforms. These are now supported by machine learning which is able to automate many aspects of the contracting process and save 100,000 human hours as a result. He notes that JP Morgan had to undertake that digitisation first in order to have the data and the digital process for machine learning to discover and automate.
As a self-described “modern data platform” which grew out of Hadoop and the open source movement, Cloudera naturally sees itself as a core component of this sixth wave, enabling data to be ingested from almost any source and then analysed for the new purposes of machine learning and advanced analytics. Yet when the company was founded in 2008, predictive analytics and the internet of things were not in view as use cases. “That only started about two years ago,” acknowledged Awadallah.
Given the intelligence of its founders, who all hail from Silicon Valley - Awadallah from Yahoo!, Christophe Bisciglia from Google, Mike Olson from Oracle and Jeff Hammerbacher from Facebook - it is perhaps not surprising that it has been able to develop and flex its proposition as the fifth wave has continued to evolve. Backing from Intel and an IPO in April that has seen the business valued at around $2 billion have helped to put in place the resources necessary for its open source, open standards, open markets play.
But getting organisations to adopt this approach as part of their enterprise architecture is still no easy task. “Humans resist change - they want to stay on the systems they know. That is also about their sense of job security,” said Awadallah. “Our job is to get companies to think top-down, looking at the cost-benefit, as well as bottom-up, thinking about the skills they need, so it becomes a meeting of minds. That is not easy, but it is easier with companies that want to do step five and those that have already established their big data systems.”
A number of factors come into play to help make these migrations happen. One of them is an absence of monotheistic zeal about the right data architecture for wave six. Cloudera is a business partner with Oracle and Teradata and has many clients who use both architectures, although it competes against IBM Watson and Hortonworks.
“We don’t tell clients to change their legacy systems. Those databases are good for more transactional activities and are the right tools for the job. We’re a multi-programme system so, by definition, we are not as good at single programme processing,” he explained. “We are right for 70%, but 30% still need Oracle, DB2, Netezza or SQL.”
A second factor is finding the right entry point in organisations. Two communities are picking up on the vendor: data scientists who find that legacy systems are not scaling well as they try to deliver more predictive business insights, and those who need to query multiple data types in order to deliver better insights, such as understanding a customer journey across all channels. “We usually start with the first type and then develop as the organisation builds its competency,” he says of Cloudera’s “land and expand” approach.
Building those skills is the third part of the vendor’s proposition. Said Awadallah: “If you are thinking about doing machine learning and data science, those are new skills. They can be hard to build or hire, which is why we offer them - we have people who can do the data engineering in Java and Python or statistical modelling using SAS. We also run the Cloudera University to upskill in-house analysts.”
While Awadallah is a technologist at heart, he is just as open to discussing the big picture of where it is taking the economy, society and humankind. He made a point of explaining that Cloudera has a strong ethical stance and spent some time arguing the issue of AI’s impact on employment and the possible need for a universal basic income. Ultimately, though, his position is that of a scientist: “It is like building any tool - it can be for good or bad, it is down to how the customer uses it.”