If you are active in Machine Learning and know a thing or two about cloud services and technologies, you probably came across Kubernetes somewhere.
In this post, I will explain the core concepts of Kubernetes and how this can be helpful for running machine learning (ML) workloads.
Table of contents
- Microservice applications
- What is Kubernetes?
- Kubernetes and Machine Learning
- When (not) to use Kubernetes
- Complexity of Kubernetes
It’s all about microservices now
Current software architecture patterns have a strong focus on modularity and distribution. We aim to break large applications (‘monoliths’) up into a set of smaller services that can be developed and maintained individually. These services are connected in the same network and communicate by using APIs, pub/sub channels or similar technologies.
Breaking a large application into multiple parts (microservices) has several benefits. First of all, developing an individual service with a well-defined API is easier than developing and maintaining a single large application with many internal methods and dependencies. Second, because you can replicate individual services and scale them horizontally, a large application can be run much more efficiently, especially with the help of cloud services and technologies that enable rapid assignment and scaling of compute resources. Third, because the microservices are decoupled from each other, there is more redundancy and, when set up correctly, no longer a single point of failure.
Microservice architectures lend themselves well to cloud environments where rapid resource allocation and deallocation is possible. There are always parts of an application that require more compute resources than others, and when necessary these parts can be scaled up or down in a much more granular way than with a monolithic architecture, where you have to scale the entire application.
Container technologies like Docker help a lot in making this possible. Each microservice can be packaged with its necessary dependencies and run on any hardware without compatibility issues. Also, it makes the maintenance of individual services a lot easier.
What is Kubernetes?
Originally developed by Google, Kubernetes is a platform that helps you deploy, run and maintain large microservice-based applications efficiently. It automates compute resource allocation, reduces risk with failover capabilities and deployment strategies, and takes care of networking and many other things.
To get a better idea of what we are talking about, here is an overview of the core Kubernetes concepts:
Kubernetes Cluster. A cluster consists of a master node, which runs the main Kubernetes services, and several worker nodes that are available for processing within the cluster. You can assign as many worker nodes to a cluster as you need. The master node ensures that the user-defined configuration of ‘pods’ is maintained at all times, continuously finding the most efficient way to do so.
Kubernetes Pods. These are the smallest deployable units in Kubernetes. A pod wraps one or more containers, often Docker-based, that can be spun up or down. If a service is not needed, Kubernetes can scale its pods down so they don’t consume any resources.
Basically, you tell the master node with a configuration file which services need to be live and how many replicas of each. Kubernetes then figures out by itself how to spin up and run these pods (containers) in the most efficient way on the available workers.
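As a minimal sketch of what such a configuration file looks like (the names and container image here are hypothetical), a Deployment tells Kubernetes which container to run and how many replicas to keep alive:

```yaml
# Hypothetical Deployment: keep three replicas of a
# model-serving container running on the cluster at all times.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: model-api
  template:
    metadata:
      labels:
        app: model-api
    spec:
      containers:
        - name: model-api
          image: registry.example.com/model-api:1.0
          ports:
            - containerPort: 8080
```

Applying this file (for example with `kubectl apply -f deployment.yaml`) is all it takes; Kubernetes decides which workers the three pods land on.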
So if demand increases, or a queue forms, Kubernetes can scale individual pods accordingly to keep everything running smoothly. And, above all, keep your application quick and available for all users.
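This demand-driven scaling can itself be declared in configuration. A HorizontalPodAutoscaler, sketched below with hypothetical names for a Deployment called `model-api`, grows or shrinks the replica count based on average CPU utilization:

```yaml
# Hypothetical autoscaler: scale the model-api Deployment
# between 2 and 10 replicas, targeting ~70% average CPU usage.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: model-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: model-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```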
How does this help with Machine Learning?
Machine Learning models tend to be pieces of software that don’t fit in well with the rest of an IT architecture (read more about MLOps). They depend on many different libraries, require parameter files and other artifacts, and can have very different resource requirements. A TensorFlow-based NLP algorithm needs a much larger runtime and different dependencies than a Scikit-Learn regression model. Some need GPUs, others don’t.
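Kubernetes lets you express these differing requirements per container. As a sketch (the pod name and image are hypothetical), a GPU-hungry NLP model can request an NVIDIA GPU through the device-plugin resource alongside its CPU and memory needs:

```yaml
# Hypothetical pod spec: per-container resource requirements.
apiVersion: v1
kind: Pod
metadata:
  name: nlp-model
spec:
  containers:
    - name: nlp-model
      image: registry.example.com/nlp-model:1.0
      resources:
        requests:
          cpu: "2"
          memory: 8Gi
        limits:
          nvidia.com/gpu: 1  # requires the NVIDIA device plugin on the node
```

A lightweight Scikit-Learn model would simply omit the GPU limit and request a much smaller CPU/memory slice; the scheduler places each pod on a node that can satisfy its requests.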
As I described above, microservice architectures are well suited for running and maintaining applications consisting of services with different needs. Therefore, they also lend themselves well to running ML algorithms.
There are several open-source extensions to Kubernetes that help you with running Machine learning workloads, like Kubeflow. But they are not the easiest to set up and master.
When (not) to use Kubernetes?
Kubernetes was originally designed for orchestrating microservice applications at scale. And by scale, I mean Google-like scale: Google developed it in-house for its own use before open sourcing it. So if you are planning to run many services and models simultaneously in a high-demand environment, Kubernetes might be for you. If you only need to run a single service or app in a redundant way, Docker Swarm or a cloud service like Cloud Run from Google might be a better choice.
Downsides of using Kubernetes
– Architecture becomes more complex
– You need to take care of networking, service discovery, load balancing and monitoring.
– Security becomes more challenging. (more complexity means a larger challenge to keep things secure)
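To give one concrete example of that extra work: in Kubernetes, service discovery and basic load balancing are things you declare yourself, typically with a Service object. A minimal sketch (hypothetical names) that gives a set of pods a stable network identity:

```yaml
# Hypothetical Service: a stable cluster-internal name that
# load-balances traffic across all pods labelled app: model-api.
apiVersion: v1
kind: Service
metadata:
  name: model-api
spec:
  selector:
    app: model-api
  ports:
    - port: 80
      targetPort: 8080
```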
Beware, Kubernetes can get complicated.
Based on the above, it might sound like a no-brainer to give Kubernetes a try. And I encourage you to play around with it to get a feel for what it can do. However, running Kubernetes in production, tailoring it to your wishes and maintaining the clusters and setups is a job on its own. You won’t be the first one to set off on a months-long journey tailoring and tweaking Kubernetes until it really works for you and your team. Managed Kubernetes services like Amazon EKS, Google Kubernetes Engine (GKE) and Azure Kubernetes Service (AKS) already make this easier. But it’s not a plug-and-play technology yet.
So, is Kubernetes a good fit for Machine Learning (ML)? Definitely, as it helps run, orchestrate and scale models efficiently, independent of their dependencies, how often they need to be active and how much data they need to process. At UbiOps for instance, we use Kubernetes to orchestrate and run all the analytics workloads of our users. This way, we make it easy for users to benefit from the powerful auto-scaling capabilities of Kubernetes, without them needing to understand this software and how to implement it.
There is an increasing number of open source extensions to Kubernetes like Knative, Istio, Kubeflow and KFServing that help you with running serverless workloads, machine learning pipelines and other services on top of Kubernetes, abstracting away some of the complexity (though sometimes adding new complexity of their own). This will continue to evolve, making Kubernetes more accessible for everyone to deploy and manage data-driven applications at scale.
Also, many serving and orchestration frameworks use Kubernetes in the background, providing you with the power of the tool, without requiring the knowledge of setting up and maintaining this quite complicated piece of infrastructure.
UbiOps is an easy-to-use deployment and serving platform. It helps you turn your Python & R models and scripts into live web services, allowing you to use them from anywhere at any time. So you can embed them in your own applications, website or data infrastructure. Without having to worry about security, reliability or scalability.