The eponymous open source platform makes it easier for developers to manage their AI datasets
Data science platform Pachyderm has closed a $16 million funding round led by Microsoft’s M12 venture arm.
Other backers include existing investors Y Combinator and Benchmark, along with new investor Decibel Ventures, a firm with close ties to Cisco Systems.
Launched in 2014, San Francisco-based Pachyderm has made a name for itself with an open source platform that enables explainable, repeatable and scalable data science, making it easier for software engineers to manage the data they use in machine learning projects.
Likened to GitHub, the popular code hosting service, the platform provides centralized controls for managing data files, lets developers accurately recreate and repeat data-based experiments, and handles version control for the datasets that can quickly pile up during the development of AI projects.
Quicker data processing
Before being fed to a neural network, training data is usually put through a series of transformations to remove erroneous records and convert it into a format that’s easier to analyze. This is often a time-consuming process, but with Pachyderm, developers are able to run thousands of instances of pre-processing simultaneously thanks to integration with another open source project, Kubernetes.
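As a rough illustration of the idea, the sketch below cleans several shards of records in parallel on a single machine. It is not Pachyderm's API and does not involve Kubernetes: the record layout, the `clean_shard` and `preprocess` names, and the rule that a non-numeric value counts as an erroneous record are all assumptions made for the example, with a local thread pool standing in for the many workers a cluster would provide.

```python
from concurrent.futures import ThreadPoolExecutor


def clean_shard(records):
    """Drop erroneous records and convert the rest to a uniform format."""
    cleaned = []
    for rec in records:
        # Assumption: a missing or non-numeric "value" marks a bad record.
        try:
            number = float(rec.get("value"))
        except (TypeError, ValueError):
            continue
        cleaned.append({"id": str(rec.get("id")), "value": number})
    return cleaned


def preprocess(shards, workers=4):
    """Clean each shard independently; results keep the input shard order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(clean_shard, shards))


if __name__ == "__main__":
    shards = [
        [{"id": 1, "value": "3.5"}, {"id": 2, "value": "n/a"}],
        [{"id": 3, "value": 7}, {"id": 4, "value": None}],
    ]
    print(preprocess(shards))
```

Because each shard is cleaned without reference to any other, the same work could in principle be spread across as many workers as there are shards, which is the property that makes this kind of pre-processing a good fit for a container scheduler.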
Pachyderm has attracted a range of enterprise customers, including Shell, LogMeIn, Battelle Ecology and AgBiome, as well as government agencies, banks, pharmaceutical and bioinformatics companies, and others within the Fortune 500.
“Value from AI is not just about the first great insight. It’s ensuring that every insight only gets more accurate with time,” said Vaibhav Kumar, software engineering and fullstack development manager at Shell. “Pachyderm’s ability to deliver data lineage to data scientists is a great leap towards explainable AI.”
The latest round of funding, which takes the company’s investment total to $28 million, comes as Pachyderm is introducing Pachyderm Hub, the company’s fully managed service that has been operating in public beta since November. With Pachyderm Hub, individuals and enterprise teams can get a Pachyderm cluster on demand, without having to take on the operational burden of managing their own infrastructure.
Fun fact: the platform is named after a term for a large animal with thick skin, like an elephant, rhinoceros, or hippopotamus.