KNIME removes barriers between model development and production

Max Smolaks

April 3, 2020

The new Integrated Deployment functionality saves time and effort

by Max Smolaks 2 April 2020

KNIME, the open source data analytics platform, has added functionality that enables data scientists to move models from development to production without having to alter any code.

Integrated Deployment identifies and packages not just the model, but all of its associated data preparation and post-processing steps so they can be automatically reused.

“This solves perhaps one of the biggest problems in data science today by completely eliminating the gap between the art of data science creation and moving the results into production,” said Michael Berthold, co-founder and CEO of KNIME.

Integrated Deployment was launched at the KNIME Spring Summit 2020, taking place this year as an online-only event.

KNIME creates a workflow to generate an optimal model

"Productionize"

The development of KNIME Analytics Platform (from Konstanz Information Miner) is led by KNIME the company, headquartered in Zurich. It is used for a variety of purposes including data mining, business intelligence and machine learning.

KNIME (the platform) started out in 2006 as a proprietary software product, but made a pivot to GPLv3 – the most ‘hardcore’ free and open source license – with the release of version 2.1 in 2009. This means it can be downloaded, shared and modified without any restrictions, and there’s even a special provision that enables other companies to develop new ‘nodes’ for KNIME and sell them.

The Integrated Deployment process aims to simplify the lives of data scientists who build their models on KNIME. Previously, moving a model into production meant manually replicating the exact data creation steps and model settings; now these can be maintained automatically.

Here’s how it works, according to the company: “Using open-source KNIME Analytics Platform, a workflow is created to generate an optimal model. Integrated Deployment allows a data scientist to mark the portions of the workflow that would be necessary for running in a production environment, including data creation and preparation as well as the model itself, and save them automatically as workflows with all appropriate settings and transformations saved. There is no limitation in this identification process – it can be simple or as advanced (and complex) as required.

“With KNIME Server in production, these captured workflows are then referenced and reused. There is no need to rewrite or recode any of the process.”
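
KNIME workflows are built visually rather than in code, so the following is only a conceptual sketch of the idea the company describes, written in plain Python. It is not KNIME's API: the `Workflow` class, its `add` and `capture` methods, and the step names are all hypothetical, chosen to show how steps marked during development could be carried into production unchanged while development-only steps are left behind.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical illustration only: none of these names come from KNIME.
Step = Tuple[str, Callable[[list], list], bool]  # (name, function, marked for production)


@dataclass
class Workflow:
    """A toy workflow: an ordered list of named processing steps."""
    steps: List[Step] = field(default_factory=list)

    def add(self, name: str, fn: Callable[[list], list], capture: bool = False) -> "Workflow":
        """Append a step; capture=True marks it as needed in production."""
        self.steps.append((name, fn, capture))
        return self

    def run(self, rows: list) -> list:
        """Apply every step in order to the input rows."""
        for _, fn, _ in self.steps:
            rows = fn(rows)
        return rows

    def capture(self) -> "Workflow":
        """Return a new workflow containing only the marked steps, in order."""
        return Workflow([step for step in self.steps if step[2]])


# Development workflow: preparation, a dev-only sampling step, model, post-processing.
dev = (
    Workflow()
    .add("drop_missing", lambda rows: [r for r in rows if r is not None], capture=True)
    .add("training_sample", lambda rows: rows[:2])  # development only, not captured
    .add("scale", lambda rows: [r / 10 for r in rows], capture=True)
    .add("model", lambda rows: [2 * r + 1 for r in rows], capture=True)
    .add("round", lambda rows: [round(r, 1) for r in rows], capture=True)
)

# The production workflow reuses the marked steps as they are; nothing is rewritten.
production = dev.capture()
production_step_names = [name for name, _, _ in production.steps]
scores = production.run([1.0, None, 3.0, 5.0])
```

In this toy version, `production` applies exactly the same preparation, model and post-processing functions that were defined during development, which is the property the company attributes to captured workflows.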
