Lessons learnt from AutoTrader algorithms


Dr Peter Appleby is the head of data science at AutoTrader, having worked his way up from data scientist and lead data scientist positions. By going through the process of productionising algorithms, he learnt the important lessons of playing to one’s strengths and having one person take ownership of the entire pipeline. This is how he made those discoveries.

AutoTrader is a mid-size organisation of 800 people, which Appleby described as agile because its product teams have delivery, tech and product leads within them.

He and his team create models for internal and external audiences. However, those that are public-facing need to be very robust because they generate a large number of queries, ranging from hundreds to millions per day. “If we make changes to the model, either in retraining or interrogating the algorithm, we have to make that change as soon as possible. If it suddenly changes from one day to the next, people mistrust the outputs,” he said.

Appleby gave an example from AutoTrader of an algorithm which adjusted the price valuation of vehicles depending on their specifications and would subsequently award the motor with a ‘good’ or ‘great’ price sticker. “We’re looking at a car and we’re saying ‘OK, even if it is £1,000 more expensive than a similar car, as a package it is actually a better price because of the optional spec items that are on it’,” said Appleby.
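As a rough illustration of the idea Appleby describes, the sketch below adds a value for each optional spec item to a base valuation and compares the asking price with the result. The spec values, sticker thresholds and function names are all hypothetical; they are assumptions made for the example, not AutoTrader's actual figures or method.

```python
from typing import Dict, Iterable, Optional

# Hypothetical values (in pounds) that each optional spec item adds to a car.
SPEC_VALUES: Dict[str, float] = {
    "panoramic_roof": 600.0,
    "leather_seats": 500.0,
    "sat_nav": 300.0,
}


def spec_adjusted_valuation(base_valuation: float, spec_items: Iterable[str]) -> float:
    """Add the value of each recognised spec item to the base valuation."""
    return base_valuation + sum(SPEC_VALUES.get(item, 0.0) for item in spec_items)


def price_sticker(
    asking_price: float,
    base_valuation: float,
    spec_items: Iterable[str],
    good_margin: float = 0.02,   # assumed threshold: 2% below adjusted valuation
    great_margin: float = 0.05,  # assumed threshold: 5% below adjusted valuation
) -> Optional[str]:
    """Return 'great', 'good' or None by comparing price with the adjusted valuation."""
    adjusted = spec_adjusted_valuation(base_valuation, spec_items)
    if adjusted <= 0:
        raise ValueError("adjusted valuation must be positive")
    discount = (adjusted - asking_price) / adjusted
    if discount >= great_margin:
        return "great"
    if discount >= good_margin:
        return "good"
    return None


# Two similar cars with the same base valuation of 10,000 pounds.
plain_car = price_sticker(10_000, 10_000, [])
well_specced_car = price_sticker(
    11_000, 10_000, ["panoramic_roof", "leather_seats", "sat_nav"]
)
```

Under these assumed numbers, the car that costs £1,000 more earns a sticker while the cheaper car with no options does not, because its optional items lift the adjusted valuation above its asking price.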

He described the original spec-adjusted valuation engine in the form of a flow chart, with car data going in, being processed into data labelled in terms of spec items and the valuation coming out at the other end. Appleby explained that the data engineers were responsible for extracting data from the car adverts and marking it up with the specifications.

The data scientists were in charge of training the model that valued the spec items. The data engineers were then involved again in writing the interrogation model, which altered the coefficients and was queried by the API. The product people were at the end of the process, in charge of writing the API and serving that result as a product.

The creation and deployment of this model was not entirely smooth. “There were obvious gaps in responsibility that things could fall into and did. There’s no end-to-end ownership of the whole chain and this led to a number of problems, particularly with changes in the spec extraction model,” said Appleby.

As there was no clear end-to-end ownership, Appleby said they tended to focus on individual components instead of the whole chain, and changes that were made had downstream consequences that they couldn’t handle. As a result of all of these factors, Appleby said this was “the most complicated product that AutoTrader has ever produced.”

And so, they came up with some principles to solve those problems. The first was to find someone to take end-to-end ownership of the whole pipeline. “We need to have clear responsibilities of the ownership of the different sections. That can be shared ownership. That’s fine, as long as it’s clear who is responsible for what.”

Another was to play to their strengths. This meant everyone in the team doing what they are best at. “We don’t want data scientists writing performant and productionised code that’s interrogated by an API. We want them to do what they’re good at which is the discovery, model selection, and training,” he said. They also decided to scrap the translation layer between the training and interrogation code altogether.

Now there is just one code base and it is looked after by one person. “We are more confident that we are not going to get discrepancies between the training and the interrogation code.” The data engineers are still involved in loading and shifting data into the data lake or data warehouse, while the product team still does the API but now they are taking shared ownership of the actual model.
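One way to picture the benefit of a single code base is a model in which training and interrogation both call the same feature-extraction function, so the two paths cannot drift apart. The sketch below is a toy under stated assumptions: the advert fields, spec names, class name and the deliberately naive estimator are hypothetical and do not represent AutoTrader's code.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, Sequence, Set

# Hypothetical list of spec items the model knows about.
KNOWN_SPECS = ("sat_nav", "leather_seats")


def extract_features(advert: dict) -> Set[str]:
    """Single feature-extraction step shared by training and interrogation."""
    raw = advert.get("spec", [])
    normalised = {item.strip().lower().replace(" ", "_") for item in raw}
    return normalised & set(KNOWN_SPECS)


@dataclass
class SpecModel:
    base: float = 0.0
    coefficients: Dict[str, float] = field(default_factory=dict)

    def train(self, adverts: Sequence[dict]) -> None:
        """Naive estimator: a spec's value is the mean price of cars with only
        that spec minus the mean price of cars with no recognised spec.
        Assumes at least one advert has no recognised spec items."""
        rows = [(extract_features(a), a["price"]) for a in adverts]
        self.base = mean(price for feats, price in rows if not feats)
        for spec in KNOWN_SPECS:
            only = [price for feats, price in rows if feats == {spec}]
            if only:
                self.coefficients[spec] = mean(only) - self.base

    def interrogate(self, advert: dict) -> float:
        """Value an advert using the same extract_features() as training."""
        feats = extract_features(advert)
        return self.base + sum(self.coefficients.get(s, 0.0) for s in feats)


model = SpecModel()
model.train([
    {"price": 10_000, "spec": []},
    {"price": 10_200, "spec": []},
    {"price": 10_500, "spec": ["Sat Nav"]},
    {"price": 10_700, "spec": ["leather seats"]},
])
valuation = model.interrogate({"spec": ["SAT NAV", "Leather Seats"]})
```

Because the normalisation of spec names lives in one place, a change to how adverts are marked up is picked up by training and interrogation at the same time rather than by only one of them.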

He said that data science sits in the middle like glue holding things together. “Having done the discovery, they are best placed to evaluate the output of the model on an ongoing basis and make decisions as to whether it’s still doing what we expected it to do.”
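The kind of ongoing evaluation described here could, for instance, include a check on how far valuations move from one day to the next, given the earlier point that sudden changes make people mistrust the outputs. The following is a minimal sketch; the function names, the use of a median and the 2% tolerance are assumptions made for illustration, not details from the talk.

```python
from statistics import median
from typing import Dict


def valuation_shift(previous: Dict[str, float], current: Dict[str, float]) -> float:
    """Median absolute relative change for vehicles valued on both days."""
    shared = previous.keys() & current.keys()
    if not shared:
        raise ValueError("no vehicles in common between the two days")
    return median(abs(current[k] - previous[k]) / previous[k] for k in shared)


def needs_review(
    previous: Dict[str, float],
    current: Dict[str, float],
    tolerance: float = 0.02,  # assumed: flag if the typical move exceeds 2%
) -> bool:
    """Flag the model for a data scientist to look at if valuations jumped."""
    return valuation_shift(previous, current) > tolerance


# Hypothetical valuations keyed by vehicle identifier.
yesterday = {"car_a": 10_000.0, "car_b": 20_000.0, "car_c": 15_000.0}
steady_day = {"car_a": 10_100.0, "car_b": 20_200.0, "car_c": 15_150.0}
jumpy_day = {"car_a": 11_000.0, "car_b": 22_000.0, "car_c": 15_150.0}

steady_flag = needs_review(yesterday, steady_day)
jumpy_flag = needs_review(yesterday, jumpy_day)
```

A median is used here so that a single unusual vehicle does not trigger the flag on its own; a real check would likely track several measures.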

According to Appleby, this simpler model gives them a much better approach: the gaps have been eliminated and they have a much more joined-up view of the ecosystem.

Dr Peter Appleby was speaking at AI Congress London.

DataIQ is a trading name of IQ Data Group Limited
10 York Road, London, SE1 7ND
