By Jelani Harper, 23 September 2019
The massive pattern recognition capabilities of artificial intelligence technologies like machine learning are becoming ubiquitous across the enterprise partly because of the way data is distributed. With data dispersed throughout the cloud, on-premises, and edge locations, advanced analytics is frequently required to understand information in relation to business objectives.
Essential to the prevalence of AI technologies is the deployment of containers and container orchestration platforms such as Kubernetes. With containers, organizations can dynamically position applications as needed to account for the extreme distribution of modern data assets—and their scale.
According to Lucas Roberts, chief solutions architect at Vizru, the main benefit of containers and Kubernetes is “scale and portability. I’m really referring to the ability to take the entire platform, lift it, and put it somewhere else so it can run where it needs to run.”
When that destination is in the cloud, organizations are suddenly empowered by the capacity to do “what we call burst computing, using Kubernetes as the way to spin up and spin down compute on demand very, very rapidly,” noted Cambridge Semantics CTO Sean Martin.
Burst computing refers to the capability to accommodate surging demand for cloud resources, termed ‘bursting’, as it occurs. Kubernetes and containers are critical for burst computing because, as Roberts detailed, they can provision computational resources at the scale required for workloads such as data-intensive AI jobs. Thanks to the elastic nature of cloud computing, burst computing “expands and contracts,” Martin said. “It bursts out when you need it, and then it contracts.”
For example, when certain popular online shows are broadcast, “all of a sudden Netflix’s traffic explodes,” explained Scott Prutton, engineer at Hyland DevOps. “80 percent of the Internet used that day is to watch [the show]. Containers is how they do this. They basically have millions of containers running at any given time.” When the demand for these resources decreases (after the show has aired, for example), engineers can shut containers off. The flexibility and value of distributing data resources this way are essential in the AI age. According to Martin, “burst [computing] is like a sprinter. When you need extra capacity you ask the cloud to get it or maybe you shop around automatically, and figure out where’s cheapest for the next couple hours and you use it.”
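As a rough illustration of how burst capacity expands and contracts, the Python sketch below sizes a pool of containers to match demand and picks the cheapest place to run it. The function names, the per-container capacity, and the prices are hypothetical; none of them come from Netflix, Kubernetes, or any cloud provider's API.

```python
import math


def desired_replicas(total_load, capacity_per_replica,
                     min_replicas=1, max_replicas=100):
    """Return how many containers are needed to serve total_load.

    Assumption: each container handles a fixed capacity_per_replica
    (for example, requests per second). The result is clamped to
    [min_replicas, max_replicas].
    """
    if total_load < 0 or capacity_per_replica <= 0:
        raise ValueError("load must be >= 0 and capacity must be > 0")
    needed = math.ceil(total_load / capacity_per_replica)
    return max(min_replicas, min(max_replicas, needed))


def cheapest_location(hourly_prices):
    """Return the location with the lowest hourly price.

    hourly_prices maps a location name to a price per hour.
    """
    if not hourly_prices:
        raise ValueError("no locations to choose from")
    return min(hourly_prices, key=hourly_prices.get)


# Hypothetical demand before, during, and after a popular broadcast.
load_trace = [200, 950, 4200, 300]
replica_plan = [desired_replicas(load, 100) for load in load_trace]

# Hypothetical prices used to "shop around" for the burst.
prices = {"cloud-a": 0.12, "cloud-b": 0.09, "on-prem": 0.15}
burst_target = cheapest_location(prices)
```

The replica count rises while demand peaks and falls back once it subsides, which is the expand-and-contract behavior Martin describes. A production autoscaler would also smooth out short spikes and respect startup delays, which this sketch ignores.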
Kubernetes plays a central role in burst computing, as it’s the most common mechanism for managing containers. “Kubernetes is a cloud-native container scheduler and orchestration platform,” Prutton explained. “When I say container scheduler, I mean I can tell Kubernetes exactly how my application runs and what it needs to run, and Kubernetes will figure out where in the data center to run that.”
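The scheduling idea Prutton describes can be sketched in a few lines of Python: a workload declares what it needs, and a scheduler finds a machine with enough room. This is a toy model under simplifying assumptions, not Kubernetes's actual scheduling algorithm; the `Node` and `Workload` classes and the `schedule` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu: float      # CPU cores still available
    free_mem_gb: float   # memory still available, in GB


@dataclass
class Workload:
    name: str
    cpu: float           # CPU cores requested
    mem_gb: float        # memory requested, in GB


def schedule(workload, nodes):
    """Place workload on a node and return that node's name, or None.

    Step 1 (filter): keep only nodes with enough free CPU and memory.
    Step 2 (score): prefer the node with the most free CPU, breaking
    ties by free memory. This scoring rule is an assumption chosen
    for simplicity.
    """
    candidates = [
        n for n in nodes
        if n.free_cpu >= workload.cpu and n.free_mem_gb >= workload.mem_gb
    ]
    if not candidates:
        return None
    best = max(candidates, key=lambda n: (n.free_cpu, n.free_mem_gb))
    # Reserve the resources so later workloads see the updated capacity.
    best.free_cpu -= workload.cpu
    best.free_mem_gb -= workload.mem_gb
    return best.name


# Hypothetical three-machine cluster.
cluster = [
    Node("node-a", free_cpu=2, free_mem_gb=4),
    Node("node-b", free_cpu=8, free_mem_gb=16),
    Node("node-c", free_cpu=4, free_mem_gb=32),
]
first = schedule(Workload("api", cpu=3, mem_gb=8), cluster)
second = schedule(Workload("training-job", cpu=6, mem_gb=8), cluster)
```

Here the first workload lands on the machine with the most spare CPU, while the second finds no machine with six free cores and stays unscheduled. In a burst scenario, that unscheduled state is the kind of signal that could prompt adding more machines.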
Besides taking the pain out of workload management, Kubernetes enables users to deploy containers at the scale required for contemporary computing. “A Kubernetes cluster could be any number of systems combined together that Kubernetes works with,” Roberts said. “You could have a different storage solution than another person, but if you’re running Kubernetes, it’s all Kubernetes to what you’re trying to run on top of it.”
The portability inherent in the containers Kubernetes orchestrates is also important for regulatory compliance, data governance, and pricing concerns. Certain organizations “need to have things running behind their own firewall,” Roberts noted, so not everything can be moved to the cloud. For example, in heavily regulated verticals like healthcare and the pharmaceutical industry, “these are companies that would most likely benefit from being able to run [AI] platforms on their own systems, as there’s a lot of compliance concerns.”
The Netflix use case Prutton referenced is pivotal because it illustrates the need to scale horizontally—which is a direct effect of the sprawling nature of the data sphere. With tools like containers and Kubernetes, scaling solutions horizontally for implementing AI becomes fairly simple. “Being able to really use the components within your IT teams is what really allows you to scale your own Kubernetes instances,” Roberts acknowledged. “Once we have that running, adding additional nodes for Mongo or for the application itself becomes straightforward.”
Most of all, Kubernetes handles the underlying complexity of implementing a variety of containers at scale for burst computing and other techniques. “It helps manage what we call the sea of compute,” Prutton said. “Hardware is becoming more and more computationally dense. It’s really, really hard for an administrator to keep in mind all of the servers in a data center and where do these applications fit best. It’s really nice to say, ‘Kubernetes, here is exactly how my application needs to run…you go figure out where in the datacenter to run these applications’.”
The value derived from burst computing, Kubernetes, and containers is likely to increase as AI deployments become more commonplace. Similarly, the need to scale horizontally will likely grow, placing an even greater premium on the capabilities of Kubernetes. By using it to position data resources anywhere, whether inside the enterprise, outside the enterprise, or in the cloud, organizations can capitalize on the benefits of AI and the distributed data ecosystem without the conventional constraints of IT.