Does Kubernetes win the award for the most complicated simplification ever? And I ask that without even considering the ever-expanding ecosystem that you will need to learn too.
The title of this article encapsulates the paradoxical nature of Kubernetes, one of the leading container orchestration systems, which almost singlehandedly revolutionized the way that applications are deployed by orchestrating Docker's simple-to-deploy but hard-to-manage containerisation technology. For the application developer, containerisation and Kubernetes have simplified the deployment and scaling of the applications they write, and provide a standard, consistent API to work against, meaning that they do not have to worry about the underlying infrastructure that their application will run on, be that compute, storage or networking. From the perspective of the developer, deployment of their application is as simple as ABC:
To deploy an app on Kubernetes with the kubectl command (the command-line interface into the Kubernetes control plane), it is as simple as providing a Deployment name and the application image location:
kubectl create deployment my-secret-app --image=docker-repository/application-folder/my-secret-app:v1
As an abstraction layer it is perfect for the developer, as it allows them to concentrate on the code needed to run the application without having to worry about infrastructure considerations such as storage locations and networking realities, while providing a simple, consistent interface to deploy against in the form of a standardised API.
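That consistency runs through the rest of the workflow too: scaling the deployment out or exposing it to traffic is just more of the same interface. A minimal sketch, reusing the hypothetical names from the example above (the port numbers are assumptions about that hypothetical application):

kubectl scale deployment my-secret-app --replicas=3                                    # ask for three copies
kubectl expose deployment my-secret-app --port=80 --target-port=8080 --type=LoadBalancer   # put a service in front of them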
To the development community, Kubernetes is very much cattle. For a very good explanation of the cattle v pets analogy, read this article from Randy Bias, a person I respect, written in 2016, where he succinctly explains the concept. Application containers are destroyed, created and moved as needed, without any interaction from the application, and with little or no interruption to application or service availability. The containers running the application or service are there for a purpose: they do not need care, they are not nurtured; if they die or are sick (under-performing), they are culled and a new instance instantiated.
However, this simplicity is not replicated when looking at Kubernetes from the perspective of the infrastructure team. Those that have to manage the cluster, or even deploy one in the first place (that said, the second task is mainly hidden when you are using a Kubernetes platform provided by a hyper-scaler, for example AKS from Azure or EKS from AWS), find that Kubernetes is one of the most complex pieces of infrastructure in their environment. Many find that Kubernetes is far removed from the concept of cattle and is rapidly approaching the pets we had prior to the birth of the virtualisation revolution: having to manage worker nodes and control nodes, monitor performance, deal with worker-node failures, and manage pods, namespaces and so on. Creating a Kubernetes cluster from scratch, with the necessary resilience and performance, is a mystic art worthy of the shamans of old. There are many working parts.
Let's look at the infrastructure parts. At the high level we have a controller node and one or more worker nodes; as scale increases we have controller clusters, with the associated complications that adds to a deployment, and a vast increase in worker nodes. Kubernetes has to handle networking, node and container naming, traffic flow, and load-balancing workloads across worker nodes, whilst still keeping track of service alignment. This leads to an additional layer of objects: namespaces, Volumes, Pods, ReplicationControllers and ReplicaSets, DaemonSets and StatefulSets, and so on. And from the infrastructure/operational perspective we have not even started considering the ecosystem that has built up to aid in supporting Kubernetes environments: Helm charts, which use YAML to help automate the installation or upgrade of applications on a cluster, monitoring add-ons, and add-ons to scan your environment for vulnerabilities.
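A quick way to get a feel for that object layer on a live cluster is kubectl itself. The queries below are read-only and simply list the moving parts described above (they assume you have a kubeconfig pointing at a cluster you can access):

kubectl get nodes                                                       # the controller and worker nodes
kubectl get namespaces                                                  # the namespace layer
kubectl get pods,replicasets,daemonsets,statefulsets --all-namespaces   # the workload objects, cluster-wide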
So we can see that on the one hand, Kubernetes promises to simplify the process of deploying and scaling applications, abstracting away the underlying infrastructure and providing a consistent API for developers and operators alike. On the other hand, we can see that Kubernetes is horrendously complex, with a steep learning curve and a sprawling ecosystem of tools and services that can be overwhelming even for experienced practitioners.
So, what is Kubernetes, and why has it become such a dominant force in the world of cloud-native computing? At its core, Kubernetes is a platform for managing containerized workloads, allowing developers to deploy and scale applications across a distributed infrastructure. It abstracts away the underlying infrastructure so that developers can focus on building applications rather than on the machines they run on. This abstraction is achieved through a combination of declarative configuration files, which describe the desired state of the system, and a set of controllers that continuously monitor the system and ensure that the desired state is maintained. So far so good, but it is still complicated 😊
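To make the declarative model concrete, here is a minimal sketch of a Deployment manifest (the name and image path are the hypothetical ones from earlier) applied through kubectl. Once applied, the Deployment controller continuously reconciles the cluster towards the three replicas declared below:

cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-secret-app              # hypothetical name from the earlier example
spec:
  replicas: 3                      # the desired state: three copies of the pod
  selector:
    matchLabels:
      app: my-secret-app
  template:
    metadata:
      labels:
        app: my-secret-app
    spec:
      containers:
        - name: my-secret-app
          image: docker-repository/application-folder/my-secret-app:v1   # hypothetical image location
EOF

Delete one of the resulting pods and the controller simply creates a replacement; that reconciliation loop is the whole trick, and also a large part of the machinery that has to be operated.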
So why has it become so prevalent in the cloud?
The benefits of Kubernetes are clear: by abstracting away the underlying infrastructure, Kubernetes allows developers to focus on building applications rather than managing infrastructure. This in turn allows organizations to move faster and be more agile, responding to changing market conditions and customer demands with greater speed and flexibility. However, the complexity of installing, upgrading, monitoring, and scaling Kubernetes can be a barrier to adoption, particularly for smaller organizations or those with limited resources.
Another key challenge with Kubernetes is its ecosystem. While Kubernetes itself provides a powerful set of abstractions for managing containers, and does a very good job if it is properly installed and configured, it is only one piece of a larger puzzle. To truly harness the power of Kubernetes, developers and operators need to learn a whole new set of tools and services, from service meshes (Kong Mesh, Consul, Istio, Linkerd) to monitoring and logging frameworks (Prometheus and Grafana, Splunk, Nagios). The depth and breadth of this ecosystem can easily be overwhelming, particularly for those new to cloud-native computing.
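As one concrete, hedged example of what "yet another tool" means in practice: installing the widely used community monitoring stack (Prometheus plus Grafana) is typically done through Helm, so the operator now needs Helm as well. The release and namespace names below are arbitrary choices:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack --namespace monitoring --create-namespace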
So if Kubernetes is so complicated, how exactly has it managed to gain such a dominant position in container orchestration?
The simplest explanation is first-mover advantage and the massive community that has built up around it since it was birthed out of Google's internal Borg system. It is open source and flexible, and it has a common interface. But that is only part of the answer; for me, the reason Kubernetes has managed to so easily capture the market is the lack of a viable competitor.
OK, who are the competitors and why are they not gaining traction?
There are several potential competitors. Legacy container services like AWS ECS, AWS's native container solution, do have some orchestration built in, but ECS is native to AWS and can only manage AWS services; another is VMware Container Service. In Azure we have Container Apps, but that, and the associated AKS resource, is just a Kubernetes provider under the hood, so it does not really count. One possible competitor is Docker Swarm. It had potential, but being two years too late to the party it failed to garner any real support; further, it was seen as a knee-jerk reaction to Kubernetes rather than a viable alternative, and the product lacked the ability to auto-scale and could not do rolling updates, meaning that a cluster and the associated application had to be stopped for an update. This was a massive gap. Next up is Nomad from HashiCorp, which differs in terms of features. Some of the features that Kubernetes has, and Nomad does not, which have hindered its adoption, include:
- Native support for Windows containers
- Horizontal Pod Autoscaling (HPA) based on metrics (a short example follows this list)
- Built-in support for StatefulSets
- Support for Custom Resource Definitions (CRDs)
- A more extensive set of built-in APIs and resources
- A more established ecosystem of third-party tools and plugins
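To pick one item from that list, Horizontal Pod Autoscaling can be driven entirely from the standard CLI. A minimal sketch, assuming the hypothetical deployment from earlier and a cluster that already has the metrics-server add-on installed (HPA needs a metrics source):

kubectl autoscale deployment my-secret-app --cpu-percent=70 --min=2 --max=10   # scale between 2 and 10 replicas on CPU load
kubectl get hpa                                                                # watch current versus target utilisation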
That being said, Nomad has some unique features of its own, such as its ability to handle a wider range of workloads beyond just containers; it also has a simpler architecture and more flexible scheduling policies.
The next two competitors are commercial products from VMware and Red Hat respectively: VMware Tanzu and Red Hat OpenShift. There is a common thread with both products: they are fully featured Kubernetes environments, kept in lockstep with the latest published Kubernetes version, but both wrap an additional layer of enterprise features on top. Tanzu, for example, gives you the ability to support multi-cloud infrastructure, while OpenShift adds an integrated CI/CD pipeline and integrated monitoring and logging. For the vast majority of environments there is basically no need for the additional layer of features these products offer, and unless that additional layer delivers a reduction in operational overhead (arguably Nomad, Tanzu Grid and OpenShift do) that justifies the additional capex and opex costs of a commercial package, these products will continue to struggle for traction.
This still leaves the complexity issue of Kubernetes
The major problem with Kubernetes is that its architecture is designed for scale; it was originally built by Google to manage large clusters. It is highly distributed by design, with microservices at its core. This complexity is not an issue at large scale, where the benefits outweigh the complexity, and at small scale, i.e. a developer's local machine or a small test environment, the complexity is mitigated by the lack of distribution. It is as we scale into the medium size, as applications and services grow, that the complexity starts to affect performance and resilience. Operations teams are often unskilled at scaling clusters; they are worried about the skills gap and will push back against implementation.
So, what can be done to simplify Kubernetes and make it more accessible to a wider audience? One approach is to focus on providing simpler abstractions and reducing the complexity of the system. This could involve building higher-level abstractions on top of Kubernetes, such as Helm charts, operators, or service meshes, that encapsulate common patterns and best practices for deploying and managing applications. So basically, further complicate the environment with additional tooling to simplify Kubernetes, thereby increasing the training requirements of already overloaded operational staff. Alternatively, it could involve providing more intuitive interfaces for common tasks such as deploying applications or scaling resources, such as kubectl, Kustomize, or Skaffold, that simplify the configuration and deployment process; again, this just repeats the "more tools to simplify Kubernetes" issue.
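To illustrate the point (and the irony), here is a minimal, hypothetical Kustomize sketch. It assumes a deployment.yaml manifest already sits in the current directory, and uses the -k support built into kubectl:

cat <<'EOF' > kustomization.yaml
# hypothetical overlay: reuse the existing manifest and bump the image tag
resources:
  - deployment.yaml
images:
  - name: docker-repository/application-folder/my-secret-app
    newTag: v2
EOF
kubectl apply -k .

It is genuinely convenient, but it is also one more file format and one more mental model for an already stretched operations team to absorb.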
For me, a better approach is to provide better documentation and training resources. I consider myself a reasonably technically adept person, but I find the Kubernetes documentation opaque and confusing; like much open-source documentation, it leans heavily on assumptions of prior knowledge, which makes the barrier to entry very steep. By making it easier for developers and operators to learn how to use Kubernetes effectively, by not gating documentation behind massive assumed knowledge, by providing more comprehensive documentation (written for the proverbial old lady on the bus) and properly keeping the documentation, how-to guides, tutorials and examples up to date, and by offering more interactive and hands-on learning opportunities such as online courses, workshops or labs, the implementation and operation of Kubernetes could become more user-friendly and appealing to a broader range of users.
Summary
I fully expect that this article will be divisive; Kubernetes has followers that are almost fanatical, but I am writing this from the perspective of implementors and operators, not those that have been using and growing with Kubernetes. This technology's ongoing success will be determined by its ability to strike the balance between complexity and simplicity. While Kubernetes has clearly simplified many aspects of cloud-native computing, it remains a complex system that necessitates significant time and resource investment. As a result, the Kubernetes community must continue to innovate and find new methods to simplify the system, while simultaneously recognising the value of its ecosystem and the broader cloud-native landscape.
In conclusion, Kubernetes is both a simplification and a complication. It has simplified many aspects of cloud-native computing, allowing developers to focus on building applications rather than managing infrastructure. However, it is also a complex system that requires significant investment in terms of time and resources. To truly harness the power of Kubernetes, we need to find ways to simplify the system and make it more accessible to a wider audience. Only then can we fully realize the promise of cloud-native computing and build the next generation of scalable and resilient applications.