As head of business intelligence at IAC Publishing Labs, Erika Bakse is responsible for making sure that data and analytics are available to all business users and data users - “basically, the whole company,” in her words. She told DataIQ in an interview that she enjoys working in a small team, which is ideal as there are just three people in hers - herself as manager, a data engineer and a Dublin-based data analyst. There are also two offshore contractors in India who help with day-to-day operations.
To understand what changes are coming to the website and properly integrate log-in data into the database, Bakse and her team work closely with the front-end engineering teams. They then talk to the product side and make sure that any questions they foresee as they are building the product can be answered from the data warehouse. On any given day, she will have many meetings with a lot of status updates - essentially, a lot of sitting down and understanding what the data looks like and how they can make sure that it is well defined and understandable for everybody.
Bakse is an enthusiastic champion of the Snowflake cloud-based data warehouse, telling the audience at the Cloud Analytics Conference that it has made a significant difference to her own efficiency and productivity and to that of the rest of the data team. When she first joined IAC, the data warehouse had only been implemented a few months prior. It was an on-premises solution and Bakse described it as being at the end of its lifecycle, so they were running into a lot of performance issues.
“The number one issue was concurrency. We would have 20 analysts running queries and, next thing you know, I was getting pings that the whole data warehouse wasn’t working,” she recalled. As a result, queries that would usually take two minutes were taking 30 minutes. In addition, Bakse and her team were unable to compartmentalise their processing. This meant that they had to run very large batch processes daily outside of business hours so that users would not be affected.
It took three months to migrate from the on-premises data warehouse to Snowflake and, as the licence for the old warehouse was expiring, there was no leeway on the deadline. Since completing the migration in early 2016, Bakse and her team have been able to process data every 15 minutes, instead of once a day. The department head also said that she and her team can now get data more rapidly, which in turn speeds up the proofs of concept and other tests that they run on the website.
“I coined it ‘cleaner data, faster, cheaper,’ which is something that simply wasn’t possible with our on-premises data warehouse,” said Bakse. Another advantage Bakse identified was the vast amount of storage. In the past, they were only able to retain six months of raw data, even though the old data warehouse was a 26-node cluster. She said that now they can store 18 months of data and it is only due to legal compliance reasons that they are not able to keep it any longer without anonymising it. “Snowflake has the mark of a really good platform. You don’t really know it’s there. It’s just solid,” she said.
Since becoming a manager, she misses “being in the guts of the technology,” but she does enjoy having a holistic view of the company, as well as learning how to strategise and build a road map. “I’ve had some successes, so I feel a little bit more confident about the role,” she said.