Pancakes, bicycles and Apache Spark
by Max Smolaks 8 October 2019
American startup Databricks, established by the original authors of the Apache Spark framework, is planning to spend €100 million ($109.65m) over the next three years to expand its AI lab in Amsterdam.
Databricks says it tripled the size of its engineering team in Amsterdam over the past two years, and its European Development Center will require many more data scientists to work on its Unified Data Analytics Platform.
“The company’s growth is a testament to its ability to attract a skilled workforce that wants to live and work in a vibrant city like Amsterdam, as well as Dutch infrastructure like our world-class broadband network,” said Henny Jacobs, executive director of the Americas for the Netherlands Foreign Investment Agency.
From Silicon Valley to Silicon Canals
Databricks was co-founded in 2013 by a team of academics who met at Berkeley, including computer scientist Matei Zaharia, who developed Spark as a PhD thesis in 2009 and later co-created the Apache Mesos cluster manager. Both projects were released under an open source license.
Spark is a cluster computing engine that relies on in-memory processing, and it has become the de facto standard for handling very large datasets. Although it wasn’t developed specifically for machine learning, Spark has been embraced by the AI community for its scalability, language compatibility, and speed.
The open source version of Spark, maintained by the Apache Software Foundation, is free to use; Databricks makes its money by selling a fully managed version of the software, hosted in the cloud. This is true open source, not the frequently maligned open core.
And this model certainly works: a few years ago, Databricks reached a valuation of more than $1 billion, which meant some people inevitably started calling the company a ‘unicorn.’ Today, the valuation stands at around $2.7bn, with Databricks securing $250 million in its most recent funding round in February.
In June, the company capitalized on the popularity of Spark among machine learning enthusiasts by releasing MLflow, a machine learning management engine designed to simplify AI projects.
MLflow enables data scientists to track and distribute experiments, package and share models across frameworks, and deploy them – whether the target environment is a personal laptop or a cloud data center. Just like Spark, MLflow is available for free, and Databricks sells a managed version hosted with either AWS or Azure.
The company brought its software to Europe in 2017, and the Amsterdam office is expected to total 200 staff by the end of 2019.
“Our investments in Amsterdam over the next three years will support our mission to help data teams solve the world’s toughest problems, and continuing to build a top notch engineering squad in Amsterdam is integral to our success,” said Ali Ghodsi, co-founder and CEO at Databricks.
Amsterdam is currently competing against London and Berlin for both tech talent and corporate investment. The city is one of Europe’s largest hubs for digital infrastructure: together with Frankfurt, London and Paris, it makes up what data center professionals sometimes call the FLAP markets.
AI Business will be reporting from the upcoming Spark+AI Summit in Amsterdam, taking place on 15-17 October.