AI Business is part of the Informa Tech Division of Informa PLC

AI Practitioner pitches open source alternative to AWS SageMaker and Azure ML Engineer

by Max Smolaks

A Silicon Valley startup has launched major updates to its Data Version Control (DVC) and Continuous Machine Learning (CML) software projects, as it hopes to create open source alternatives to machine learning toolsets from major cloud providers like AWS and Microsoft.

The company enables ML engineers and data scientists to more easily work with standard development tools like Git and popular CI/CD stacks.

“AI Platforms are siloed and require everything to go into their own systems, creating vendor lock-in,” said Dmitry Petrov, founder and CEO. “[The company] allows users to stay within their application development space and effectively extend the familiar dev environments with tools to support Machine Learning Engineers and Data Scientists.”

DVC and CML fit into the emerging MLOps software category, which is concerned with moving machine learning models from development into production, and running them at scale.

The company was founded in 2018 to develop open source tools that streamline the workflow of data scientists. Today, its projects have more than 200 contributors, and are used by more than 400 companies.

The startup posits that instead of creating separate AI platforms, the industry should integrate ML workflows into current practices for software development.

DVC, for example, is built to make ML models shareable and reproducible, providing users with a Git-like interface for version control – across models, datasets, and intermediate files. It works with remote storage for large files, either in the cloud or on on-premises network storage.
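The general idea of versioning large files alongside Git can be illustrated with a small Python sketch. This is not DVC's implementation or API: the function names, the `.ptr` file suffix, the pointer format, and the choice of SHA-256 are all assumptions made for illustration. The sketch stores a large file in a content-addressed cache and writes a tiny text pointer that would be cheap to commit to Git.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files never need to fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def track_file(data_path: Path, cache_dir: Path) -> Path:
    """Copy a file into the cache and write a pointer file next to it.

    Hypothetical helper: the pointer format here is invented for this sketch.
    """
    digest = file_digest(data_path)
    # The cache path is derived from the content hash, so identical
    # content is stored only once.
    cached = cache_dir / digest[:2] / digest[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():
        shutil.copyfile(data_path, cached)
    # The pointer is small and text-based, which suits Git.
    pointer = data_path.with_name(data_path.name + ".ptr")
    pointer.write_text(f"sha256: {digest}\npath: {data_path.name}\n")
    return pointer


def restore_file(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the original file from the cache using its pointer."""
    fields = dict(
        line.split(": ", 1) for line in pointer.read_text().splitlines()
    )
    digest = fields["sha256"]
    target = pointer.with_name(fields["path"])
    shutil.copyfile(cache_dir / digest[:2] / digest[2:], target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp) / "project"
        cache = Path(tmp) / "cache"
        workspace.mkdir()
        dataset = workspace / "train.csv"
        dataset.write_text("x,y\n1,2\n")
        pointer = track_file(dataset, cache)
        dataset.unlink()  # simulate a fresh checkout that has only the pointer
        restored = restore_file(pointer, cache)
        print(restored.read_text())
```

In this sketch the cache directory stands in for remote storage. Only the pointer file would go into the Git repository, so branches and commits stay small while still identifying exactly which version of the data they refer to.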

“Harness the full power of Git branches to try different ideas instead of sloppy file suffixes and comments in code,” advertises the DVC project website. “Use automatic metric-tracking to navigate instead of paper and pencil.”

The latest release, DVC 2.0, adds the ability to run lightweight ML experiments without committing any code to Git, versioning of ML model checkpoints, and better CPU/GPU resource allocation.

Meanwhile, CML claims to hide the complexity of clouds from data scientists and ML engineers. It offers an open source library for implementing continuous integration and delivery (CI/CD) – the backbone of modern DevOps – in machine learning projects. The project enables users to automate parts of their development workflow, including model training and evaluation, and auto-generate reports with metrics and plots.
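To make the idea of an auto-generated report concrete, here is a minimal Python sketch of a step that a CI job might run after training. It does not use CML itself: the function name `build_report`, the metric names, and the Markdown layout are assumptions chosen for illustration. It compares the current run's metrics against a baseline and renders a Markdown table that a CI system could attach to a pull request.

```python
from typing import Mapping


def build_report(
    metrics: Mapping[str, float], baseline: Mapping[str, float]
) -> str:
    """Render current metrics against a baseline as a Markdown table.

    Hypothetical helper for illustration; not part of CML.
    """
    lines = [
        "## Model evaluation",
        "",
        "| Metric | Baseline | Current | Change |",
        "|---|---|---|---|",
    ]
    for name in sorted(metrics):
        current = metrics[name]
        previous = baseline.get(name)
        if previous is None:
            # A metric with no baseline value cannot show a change.
            lines.append(f"| {name} | n/a | {current:.4f} | n/a |")
        else:
            change = current - previous
            lines.append(
                f"| {name} | {previous:.4f} | {current:.4f} | {change:+.4f} |"
            )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up numbers, used only to show the output format.
    report = build_report(
        metrics={"accuracy": 0.91, "f1": 0.88},
        baseline={"accuracy": 0.90},
    )
    print(report)
```

In a real pipeline, the metrics would come from the evaluation step and the resulting Markdown would be published by the CI tooling, so reviewers see the effect of a change on model quality next to the code diff.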

It’s still early days for CML, which has just reached version 0.3.
