3. CONTAINER ORCHESTRATION

3.3 Comparison

Container orchestrators have been previously compared by Truyen et al. [12], Jawarneh et al. [24] and Pan et al. [25], all published in 2019. The study by Truyen et al. presents “a descriptive feature comparison study of the three most prominent orchestration frameworks”, that is, of Apache Mesos along with Docker Swarm and Kubernetes [12]. The research by Jawarneh et al. compares Docker Swarm, Kubernetes, Apache Mesos and Cattle in terms of their functionality and performance [24]. Pan et al. describe “a thorough comparison and performance benchmark of the two most popular container clustering and orchestration tools”, Docker Swarm and Kubernetes [25].

This thesis takes its approach for functional comparison from Jawarneh et al. [24], as they provide a simple yet powerful framework for comparing the orchestrators. As in their work, the comparison is done by splitting the functionality of the orchestrators into three layers: resource management, scheduling, and service management [24]. The comparison is based on the documentation of the most recent versions of the tools to ensure up-to-date results. The most up-to-date versions of Docker, which the Swarm mode is a part of, and Kubernetes at the time of writing this thesis are 20.10.5 [26] and v1.20.0 [27], respectively. The following subsections compare the functionality of Docker Swarm and Kubernetes in these three layers, followed by a performance comparison. The performance comparison consolidates the results of Jawarneh et al. [24] and Pan et al. [25].

After the performance comparison, Docker Swarm and Kubernetes are compared in terms of cluster deployment and management. Results of the comparisons are summarised and compared to existing research at the end of the chapter.

3.3.1 Resource Management

The resource management layer abstracts the underlying resources to ensure high utilisation levels and low interference between containers. Managed resources include computational resources, volumes, and networking. Computational resources include CPU, memory, and sometimes even graphics processing [24]. Volumes are used to store data created by containers [8]. Volumes grant the containers access to the file system of the host machine or to a remote file system. Networking refers to the TCP and UDP ports and IP addresses that are used within the container network to access the containers [24]. Both Docker Swarm and Kubernetes provide containers with access to computational resources, volumes, and networking resources. The main differences in resource management are in the volumes the orchestrators support and in the way they handle networking.

Docker Swarm supports two types of volumes: data volumes and bind mounts. Data volumes are storage that outlives services and exists independently of them. Volumes are specific to a host, meaning that services running on different nodes cannot access the same volumes. Bind mounts enable the services to access data on the host where the service is deployed. They work by mounting a file system path into the container. Docker advises designing Swarm applications in a way that bind mounts are not needed. Bind mounts require the mounted path to exist on each node, which is why the automatic rescheduling of services on different nodes could lead to unexpected behaviour [28].
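The two volume types can be sketched with the docker service create command; the service and path names below are hypothetical:

```shell
# Data volume: Docker creates and manages "web-data" on whichever
# node the task is scheduled on; it outlives the service.
docker service create --name web \
  --mount type=volume,source=web-data,target=/var/lib/app \
  nginx:alpine

# Bind mount: exposes a host path to the container; /var/log must
# exist on every node the task may be rescheduled to.
docker service create --name log-reader \
  --mount type=bind,source=/var/log,target=/host-logs,readonly \
  busybox sleep 1d
```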

Accessing an external file system such as NFS is made possible by using volume drivers [8].

Kubernetes extends the limited functionality of Docker Swarm. In addition to local volumes and bind mounts (hostPath volumes in Kubernetes terms), Kubernetes supports third-party volumes provided by Amazon Web Services, Microsoft Azure and Google Compute Engine. Using open solutions, such as Glusterfs or NFS, is also possible [29].

Docker Swarm supports overlay networks to enable secure network communication between the services in the cluster [30]. Each service is assigned a unique DNS name, which enables querying the service through a DNS server included in Docker Swarm [15]. A service can be made accessible from the outside via a published port. The service can then be accessed on the published port on any of the nodes in the cluster [31].
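As a brief sketch, the overlay network and published port described above could be set up as follows (network, service, and port values are illustrative, not taken from the cited documentation):

```shell
# Create an overlay network for service-to-service traffic
# (requires Swarm mode to be active).
docker network create --driver overlay app-net

# Publish port 8080 on every node; Swarm's routing mesh forwards
# the request to a service task regardless of which node receives it.
docker service create --name api --network app-net \
  --publish published=8080,target=80 \
  nginx:alpine

# Other services attached to app-net can reach this service
# by its DNS name "api".
```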

Kubernetes creates a virtual network for the cluster, which enables connectivity between Pods. Pods are exposed to other Pods via Services1. Pods can access each other via Services’ IPs and ports, which can be read from environment variables or resolved by using a DNS add-on [32].
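A minimal sketch of such a Service, assuming a hypothetical set of Pods labelled app=api that listen on port 8080:

```shell
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api          # selects Pods labelled app=api
  ports:
  - port: 80          # port other Pods connect to
    targetPort: 8080  # container port inside the selected Pods
EOF
```

With a DNS add-on, other Pods can resolve the Service simply as api; Kubernetes also injects API_SERVICE_HOST and API_SERVICE_PORT environment variables into Pods started after the Service was created.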

3.3.2 Scheduling

The efficient use of resources is enabled by the scheduling layer. The scheduling layer places services in the cluster based on user input. Possible capabilities include configuring the placement of a service within the cluster, scaling the number of service instances, checking the readiness of services to only include services that are ready, restarting failed services, rolling deployment to update the application version without downtime and co-locating services [24]. The scheduling layer is also responsible for automatic scaling, allowing the rapid elasticity expected from cloud applications.

Docker Swarm and Kubernetes both use a declarative service model for scheduling services. This means that the user can define the desired state of services in their application stack, and the orchestrator components try to retain the desired state in the cluster.

The declarative service model ensures the high availability of services. For example, if a cluster node crashes, the services that were running on it are started on the remaining nodes [15]. In Docker Swarm, the desired state is compared to the actual state of the cluster by the cluster manager. If there are differences between the states, the cluster manager tries to reconcile them. In Kubernetes, the state of the cluster is defined by using different types of workloads, such as Deployments and ReplicaSets, and the Kubernetes components maintain the state and ensure high availability [33].

1 kept in italics to distinguish from other services
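The declarative model can be illustrated with a Deployment; the names and replica count here are hypothetical:

```shell
# Declare the desired state: three replicas of a container.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: nginx:alpine
EOF

# The Deployment's ReplicaSet keeps three Pods running; if a node
# fails, replacement Pods are scheduled on the remaining nodes.
```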

Docker Swarm’s state configuration enables the user to define the placement of a service, scale services, and restart failed services [28]. Rolling updates are supported by controlling the delay between deployments to different nodes [15]. There is no built-in capability for readiness checking or the co-location of services [24]. Using Kubernetes workloads enables the user to define the placement of Pods, scale them, and restart failed Pods [34, 21]. The state of the cluster can be changed at a controlled rate, which enables rolling updates without downtime [17]. Containers can be co-located by using Pods, and Pod readiness checking is supported natively [35].
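On the Docker Swarm side, the placement, restart, and rolling-update controls mentioned above might be used as follows (service name, image, and timings are illustrative):

```shell
# Constrain placement to worker nodes, run three replicas, and
# restart tasks only when they fail.
docker service create --name api \
  --constraint node.role==worker \
  --replicas 3 \
  --restart-condition on-failure \
  nginx:alpine

# Rolling update: replace one task at a time, waiting 10 s between tasks.
docker service update \
  --update-parallelism 1 --update-delay 10s \
  --image nginx:1.21-alpine api
```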

Docker Swarm is also missing a built-in way to scale services automatically based on resource usage. There are ways to achieve automatic scaling in Docker Swarm, but third-party software and additional configuration are needed [36, 37, 38]. Kubernetes has a component called Horizontal Pod Autoscaler that is used to automatically scale Pods based on CPU utilization level or custom metrics [39]. Achieving automatic scaling in Docker Swarm and Kubernetes is discussed in more detail in Chapter 5.
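In its simplest form, a CPU-based Horizontal Pod Autoscaler can be attached to an existing Deployment with a single command; the target name and thresholds below are hypothetical, and a metrics source such as metrics-server must be running in the cluster:

```shell
# Keep the "api" Deployment between 2 and 10 replicas, scaling to
# hold average CPU utilisation around 80%.
kubectl autoscale deployment api --cpu-percent=80 --min=2 --max=10
```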

3.3.3 Service Management

The service management layer is used to deploy complex enterprise applications. It provides functionality for the higher-level management of an application, such as attaching metadata to containers with labels, isolating containers with groups or namespaces, and dividing the incoming load by using load-balancing [24].

The service management features of Docker Swarm are limited to basics. It supports labels for adding metadata to containers [28]. Load-balancing is partially implemented in Docker Swarm. Exposed services can be accessed on each node and the requests are internally balanced among the service instances by using DNS. To ensure that the requests to the cluster are made via a healthy node, an external load balancer is needed to balance the incoming requests between the nodes [15]. It is not possible to isolate containers with groups or namespaces in Docker Swarm [24]. On the other hand, Kubernetes provides a wide range of service management features. Objects like Pods can be labelled to add metadata to them, and labelled objects can be queried with selectors [40].
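The labelling mechanisms of the two orchestrators can be sketched as follows; the service, Pod, and label names are hypothetical:

```shell
# Docker Swarm: attach a label to the service's containers at creation.
docker service create --name api \
  --container-label tier=backend \
  nginx:alpine

# Kubernetes: label an existing Pod, then query by selector.
kubectl label pod api-abc123 tier=backend
kubectl get pods -l tier=backend
```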

Services can be used as load-balancers within the cluster. External load-balancing is done by using an Ingress that exposes routes to the Services from outside the cluster [41]. Pods can be isolated in Kubernetes with namespaces. Namespaces are essentially virtual clusters running on the same physical cluster [42].
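A namespace and an Ingress route of the kind described above might look like this; the namespace, host name, and backing Service are illustrative, and an Ingress controller must be deployed for the rule to take effect:

```shell
# Create an isolated namespace, essentially a virtual cluster.
kubectl create namespace staging

# Route external HTTP traffic for api.example.com to the "api"
# Service inside the namespace.
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api
  namespace: staging
spec:
  rules:
  - host: api.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api
            port:
              number: 80
EOF
```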

3.3.4 Performance

Container orchestration platforms have two main aspects that affect the performance of a containerised application: container runtime and orchestration overhead. Since both Docker Swarm and Kubernetes are often used with the same container runtime – Docker, or in the case of the latest Kubernetes releases containerd directly – this thesis focuses on the overhead created by the orchestration layer. However, it should be noted that Kubernetes supports a wider variety of container runtimes. This could enable the user to opt for a more efficient container runtime, should one exist. Performance differences of Docker Swarm and Kubernetes have been studied and compared comprehensively by Jawarneh et al. [24] and Pan et al. [25], both in 2019. This subsection summarises the results of these studies.

The study by Jawarneh et al. measures the performance of the orchestrators by analysing the time they take to provision a cluster, provision applications with different complexities, provision a web application with a high number of replicas and recover from failures. The time it takes to provision an application was measured for images available locally and images retrieved from a Docker registry. The orchestrators were set up for the comparison by using Rancher, which can be used to automate the setup of container orchestrators. Cluster provisioning with Kubernetes took over twice as much time as it took with Docker Swarm. Jawarneh et al. believe that this is due to Kubernetes’ complex architecture. Provisioning times for applications with different complexities (applications consisting of 1 to 4 containers) were similar between the orchestrators independent of how the image was provided. For provisioning a web application with a high number of replicas, Kubernetes was faster when the image was available locally, but slower when it had to be retrieved from the Docker registry. The researchers believe that this is due to the overhead generated by the communication between the Kubernetes agent and the Docker registry. Failover time was measured in two cases: container failure and node failure. Kubernetes responded to container failure faster, but Docker Swarm was over 3 times faster to recover from node failure. Jawarneh et al. indicate that this results from the architectures of the systems: Kubernetes can handle failed containers with a local agent, while its response to node failure is handled by an event-based system that causes a chain effect that in turn induces delay. Docker Swarm handles node failures by using a heartbeat system, which provides a faster recovery [24].

In their research, Pan et al. compare the overheads caused by Docker Swarm and Kubernetes. The comparison was done by running 36 different benchmarks that test processor performance, memory speed and memory bandwidth for running single-container applications. In addition, one experiment was done to test the execution of a multi-container application. The experiments were run on systems that use Docker Swarm and Kubernetes as well as on a baseline system using only the underlying container engine.

The orchestrators were then compared to the baseline with the null hypothesis of orchestrators having no performance effect and the alternative hypothesis of orchestrators incurring performance overhead. According to the results, there were only 2 benchmarks with meaningful overhead associated with Docker Swarm, and even in those cases the overheads were low: 1.8% and 3.5%. Meaningful overheads associated with Kubernetes occurred in a third of the benchmarks. The maximum overhead was 12.3%, the mean 5.2% and the median 4.5%. In the multi-container experiment, Docker Swarm showed no meaningful overhead, while the overhead associated with Kubernetes was 8.3% [25].

3.3.5 Cluster Deployment

The amount of work needed to deploy and manage the system depends on the platform chosen and is thus an important factor to consider when deciding which orchestrator to use. In this section, Docker Swarm and Kubernetes are compared in terms of cluster deployment and management.

A Docker Swarm cluster consists of one or more instances of Docker Engine that can run on a single physical machine or cloud server or that can be distributed across multiple machines [43]. As Docker Swarm’s architecture is simple and cluster deployment easy, the major cloud providers do not offer Docker Swarm as a service. Instead, a Docker Swarm cluster is often created from virtual machines provisioned by using an IaaS platform. Docker Swarm nodes must have Docker installed on them, and they must be able to communicate with each other over a network [44]. A swarm is created by initialising the cluster on one of the manager nodes with a single docker swarm init command, which provides the user with a command that can be copy-pasted and executed on other nodes to join them to the cluster. The init command also outputs a token, which is used to authenticate joining nodes. Once all the nodes have been joined to the swarm, it is ready to accept service deployments [45].
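The initialisation flow might look like the following; the manager address is hypothetical and the token placeholders are printed by Docker itself:

```shell
# On the first manager node:
docker swarm init --advertise-addr 10.0.0.1
# The command prints a ready-made join command containing the
# worker token, e.g.:
#   docker swarm join --token <worker-token> 10.0.0.1:2377

# On each remaining node, paste the printed command:
docker swarm join --token <worker-token> 10.0.0.1:2377
```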

A Kubernetes cluster can be deployed on a local machine or in the cloud. Alternatively, a managed cluster can be acquired as a service [46]. As Kubernetes is a relatively complex orchestrator, many companies choose to buy it as a service. Google Kubernetes Engine (GKE) [47], Amazon Elastic Kubernetes Service (EKS) [48] and Azure Kubernetes Service (AKS) [49] are examples of Kubernetes as a Service (KaaS) that companies use when they want to outsource cluster deployment and management.

A minimum viable Kubernetes cluster can be deployed in a private cloud with a toolbox called kubeadm. To use kubeadm to create a cluster, each node must have kubelet and kubectl installed in addition to kubeadm. Once these tools have been installed and kubelet has been started, creating a cluster is very similar to creating one with Docker Swarm: the cluster is initialised on the master node and the worker nodes are joined to it. The major difference is that Kubernetes Pods communicate with each other via a Pod network, which must be deployed manually. Kubernetes supports Pod networks provided by several separate projects that use the Container Network Interface (CNI). A CNI-based network can be deployed to Kubernetes like any other Kubernetes resource: by applying a YAML definition with kubectl [50].
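A sketch of the kubeadm flow, assuming kubeadm, kubelet and kubectl are already installed on every node; the CIDR value depends on the chosen CNI plugin and the placeholders are filled in by kubeadm's own output:

```shell
# On the control-plane node:
sudo kubeadm init --pod-network-cidr=10.244.0.0/16

# On each worker node, run the join command printed by "kubeadm init":
sudo kubeadm join <control-plane-ip>:6443 --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>

# Deploy a CNI-based Pod network like any other Kubernetes resource
# (the manifest depends on the chosen network project):
kubectl apply -f <pod-network-manifest.yaml>
```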

3.3.6 Summary

The differences between Docker Swarm and Kubernetes are explained by their architecture and purpose. As an all-around container orchestrator, Kubernetes has more features, with the downside of a more complex architecture required to support them. Docker Swarm is a more streamlined orchestrator intended as an easy and fast solution, but it lacks many of the features of Kubernetes in order to keep the architecture simple. Advanced features such as automatic scaling are left for the user to implement, as is described later in this thesis. This subsection summarises the differences between Docker Swarm and Kubernetes and compares the results with existing research.

Table 1. Functional differences between Docker Swarm and Kubernetes.

Layer                  Feature   Docker Swarm   Kubernetes
Resource Management    Memory    +              +

+ basic functionality, ++ advanced functionality

The functional differences between Docker Swarm and Kubernetes are summarised in Table 1. Advanced functionality is indicated in cases where both orchestrators support the functionality, but one does so to a significantly greater extent. As can be seen, Kubernetes’ complex architecture enables it to offer a more comprehensive set of features, making it a more complete solution for enterprise applications. While Docker Swarm offers a narrower set of features, it includes enough functionality for many basic applications. The results of the performance comparison presented in Table 2 indicate that the wide variety of features and complex architecture of Kubernetes come with the cost of performance overhead. The simple architecture of Docker Swarm adds little to no overhead compared to running containers on a single Docker host. Kubernetes performs better only when provisioning applications from local images and reacting to container failures, but even these advantages are minor.

Table 2. Performance differences between Docker Swarm and Kubernetes.

Study                  Measure                                        Docker Swarm   Kubernetes
Jawarneh et al. [24]   Cluster provision time                         ++
                       Application provision time (local image)                      +
                       Application provision time (Docker registry)   +
                       Container failover time                                       +
                       Node failover time                             ++
Pan et al. [25]        Single-container application performance       +
                       Multi-container application performance        +

+ slight advantage, ++ significant advantage

Kubernetes is also more difficult to deploy. While kubeadm automates many of the steps, installing a Kubernetes cluster requires installing several dependencies, which in turn requires configuring package repositories, and the correct versions of the dependencies must be installed to ensure compatibility. Manual network configuration is needed as well. The installation process is thus significantly more complex than creating a Docker Swarm, which only requires installing Docker and running a single command on each networked node. On the other hand, Kubernetes’ complexity can be overcome by acquiring it as a service.

Similar results appear in existing research. Jawarneh et al. conclude that “[i]n terms of functional comparison […] Kubernetes is one of the most complete orchestrators nowadays on the market” but that “its complex architecture introduces, in some cases, a significant overhead that may hinder its performances” [24]. This is in line with the results of this thesis, although their study indicates that Docker Swarm does not support remote file system volumes, rolling updates or load balancing [24], which contradicts Docker Swarm’s up-to-date documentation. This is likely due to updates in Docker Swarm since 2019. Truyen et al. state that while Docker Swarm has fewer features and “is expected to be used and customized for specific technology segments […]”, “[t]he large number of unique features […] is a strong asset of Kubernetes […]” [12]. This also coincides with the results of this thesis, indicating that Kubernetes offers a more complete set of features, while the functionality of Docker Swarm is intended to be extended by the user.