What is a Service Mesh? I said Mesh, not mess.

One constant in IT is that change is the norm and buzzword bingo is a game we play in every meeting. The phrase of the day is Service Mesh, and, as you know, I love a definition. According to Gartner, a service mesh is:

A service mesh is distributed middleware that optimizes communications between application services, especially on microservices.

Well, that clears that up perfectly; it sounds like RabbitMQ. And at a very, very high level it is a fair analogy, but only at the sort of level that says Earth and Mars are alike because they are both planets revolving around a star.

Our purpose with this article is to take a closer look at Service Mesh, how it relates to Service Discovery and API gateways, and the problems that a service mesh is designed to help solve.

What is a Service Mesh?

OK, time for a better definition. As Gartner once again fails to deliver a coherent answer, it is time to remove that veil of obscurity.

A service mesh is a dedicated infrastructure layer that uses proxies to facilitate communications between services or microservices. It is fully configurable, low latency, and designed to handle a very high volume of traffic.

This dedicated communication layer can provide various benefits, including observability into communications flows, secure connectivity to services and resources, and providing resilience by automating retries and back-off for failed requests.
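
To make that resilience point concrete, here is a minimal sketch, in Go, of the kind of retry-with-exponential-backoff logic a mesh proxy can apply on a service's behalf; the URL and retry parameters are purely illustrative.

```go
package main

import (
	"fmt"
	"math/rand"
	"net/http"
	"time"
)

// getWithRetry illustrates the retry-and-backoff behaviour a service mesh
// proxy can apply automatically for a service. The URL and the retry
// parameters used here are illustrative only.
func getWithRetry(url string, maxAttempts int) (*http.Response, error) {
	backoff := 100 * time.Millisecond
	var lastErr error

	for attempt := 1; attempt <= maxAttempts; attempt++ {
		resp, err := http.Get(url)
		if err == nil && resp.StatusCode < 500 {
			return resp, nil // success (or a non-retryable client error)
		}
		if err != nil {
			lastErr = err
		} else {
			lastErr = fmt.Errorf("server error: %s", resp.Status)
			resp.Body.Close()
		}
		// Exponential backoff with a little jitter before the next attempt.
		time.Sleep(backoff + time.Duration(rand.Int63n(int64(backoff/2))))
		backoff *= 2
	}
	return nil, fmt.Errorf("all %d attempts failed: %w", maxAttempts, lastErr)
}

func main() {
	resp, err := getWithRetry("http://orders.service.internal/health", 4)
	if err != nil {
		fmt.Println(err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

In a mesh, this logic lives in the proxy rather than the application, so every service gets it without a single line of application code changing.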

The architecture is made up of network proxies that are paired with each service in an application, together with a set of task management processes. In a deployment these proxies are referred to as the data plane, and the management processes are referred to as the control plane. The data plane intercepts and “processes” calls between different services; the control plane is the mesh’s brain, coordinating the behaviour of proxies and providing APIs for operations and maintenance personnel to manipulate and observe the entire network.

High-level architectural diagram of a Service Mesh

We first investigated the concept in this article, where we introduced HashiCorp's Consul service on Azure. However, as that was an introductory article on how to deploy Consul and create a mesh for your environment, a little background on the what, why and when is in order. So, without further ado, let's look at why a Service Mesh is different from Service Discovery and an API Gateway.

So what is the difference between a Service Mesh, Service Discovery and an API Gateway?

Before we can highlight the differences, we need to define what exactly service discovery and API gateways are.

Service Discovery is the process of automatically determining which instances of a service satisfy a given query. A service discovery process, once invoked, will return a list of suitable servers.
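
As a rough illustration, here is a minimal Go sketch that asks a registry for the healthy instances of a service, using Consul's HTTP health endpoint as the example; the local agent address, the service name and the handful of decoded fields are assumptions made for the sake of the sketch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// serviceEntry decodes only the fields we need from Consul's
// /v1/health/service response (assuming a local agent on port 8500).
type serviceEntry struct {
	Service struct {
		Address string
		Port    int
	}
	Node struct {
		Address string
	}
}

// discover returns the addresses of all healthy instances of a service.
func discover(name string) ([]string, error) {
	resp, err := http.Get("http://127.0.0.1:8500/v1/health/service/" + name + "?passing")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	var entries []serviceEntry
	if err := json.NewDecoder(resp.Body).Decode(&entries); err != nil {
		return nil, err
	}

	var addrs []string
	for _, e := range entries {
		addr := e.Service.Address
		if addr == "" { // fall back to the node address when unset
			addr = e.Node.Address
		}
		addrs = append(addrs, fmt.Sprintf("%s:%d", addr, e.Service.Port))
	}
	return addrs, nil
}

func main() {
	addrs, err := discover("billing")
	if err != nil {
		fmt.Println("discovery failed:", err)
		return
	}
	fmt.Println("healthy instances:", addrs)
}
```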

In more distributed environments, the task becomes more complex, and services that could previously rely on DNS lookups to find dependencies must now deal with client-side load-balancing, multiple different environments (e.g. staging vs. production), geographically distributed servers, and so on.

As a result, where a developer previously relied on only a single line of code to resolve a hostname, their services now require many lines to deal with the various edge cases introduced by these more distributed environments.
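
To give a flavour of what those extra lines look like, here is a small Go sketch of client-side load balancing: a round-robin picker over a list of discovered instances. The addresses are hard-coded stand-ins for whatever service discovery would return.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// picker is a minimal client-side, round-robin load balancer over a list of
// discovered instances; a hypothetical stand-in for what used to be a single
// DNS lookup before environments became this distributed.
type picker struct {
	mu    sync.Mutex
	addrs []string
	next  int
}

func (p *picker) pick() (string, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if len(p.addrs) == 0 {
		return "", errors.New("no healthy instances available")
	}
	addr := p.addrs[p.next%len(p.addrs)]
	p.next++
	return addr, nil
}

func main() {
	// In practice the address list would come from service discovery and be
	// refreshed as instances come and go; here it is hard-coded.
	p := &picker{addrs: []string{"10.0.0.11:8080", "10.0.0.12:8080", "10.0.0.13:8080"}}
	for i := 0; i < 5; i++ {
		addr, _ := p.pick()
		fmt.Println("routing request to", addr)
	}
}
```

A service mesh pulls exactly this sort of boilerplate out of the application and into the proxy sitting next to it.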

Now, looking at the API Gateway architecture, it appears very similar in design to the architecture of a Service Mesh. Like a Service Mesh, an API Gateway is effectively a reverse proxy coupled with a governance and policy wrapper, used to provide a consistent interface to underlying APIs, and it is often placed at the corporate edge to provide RESTful access to back-end services.
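
As a rough sketch of that reverse-proxy-plus-policy idea, the following Go program exposes a single endpoint, applies a trivial policy check, and forwards requests to an assumed back-end API; the upstream URL, path and header name are illustrative, and TLS termination at the edge is left out for brevity.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// The back-end API the gateway fronts; purely illustrative.
	upstream, err := url.Parse("http://orders-api.internal:8080")
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(upstream)

	http.HandleFunc("/api/orders/", func(w http.ResponseWriter, r *http.Request) {
		// Governance/policy wrapper: reject calls without an API key.
		if r.Header.Get("X-Api-Key") == "" {
			http.Error(w, "missing API key", http.StatusUnauthorized)
			return
		}
		proxy.ServeHTTP(w, r) // hand the request to the back-end service
	})

	// TLS termination omitted; a real edge gateway would listen on HTTPS.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```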

As can be seen, both API gateways and Service Discovery share some of the constructs of a Service Mesh, and it is because of this that the concept of a Service Mesh causes so much confusion for businesses. Simply put, it is effectively a merging of the API gateway and Service Discovery, coupled with a couple of other more nebulous services that depend on the chosen provider.

So that is a Service Mesh, but how does it work?

As previously stated, the core architectural components are the various constituent data planes and a control plane. To summarise, the data plane is responsible for tasks such as health checking, routing, authentication, and authorization. All network packets sent and received by service instances are translated, forwarded, and monitored. The control plane provides policy and configuration for all of the data planes in the mesh. Unlike the data planes, the control plane doesn’t interact with any packets or requests in the system. It can, however, combine all data planes into a single distributed system.
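
The split can be illustrated with a toy Go sketch: a control plane that never touches a request itself, but pushes routing policy out to its registered data-plane proxies. The policy fields and the channel-based "push" are invented purely for illustration; real meshes use their own configuration APIs.

```go
package main

import "fmt"

// routePolicy is a toy stand-in for the configuration a control plane pushes
// to every data-plane proxy: which upstream versions exist and how traffic
// should be split between them. The field names are made up for illustration.
type routePolicy struct {
	Service string
	Weights map[string]int // version -> percentage of traffic
}

// controlPlane broadcasts policy to its registered proxies. Note that it
// never sees a single request itself; it only reconfigures the proxies that do.
type controlPlane struct {
	proxies []chan routePolicy
}

func (cp *controlPlane) register() <-chan routePolicy {
	ch := make(chan routePolicy, 1)
	cp.proxies = append(cp.proxies, ch)
	return ch
}

func (cp *controlPlane) push(p routePolicy) {
	for _, ch := range cp.proxies {
		ch <- p
	}
}

func main() {
	cp := &controlPlane{}
	proxyA := cp.register()
	proxyB := cp.register()

	// An operator changes policy via the control plane; every proxy receives it.
	cp.push(routePolicy{Service: "checkout", Weights: map[string]int{"v1": 90, "v2": 10}})

	fmt.Println("proxy A got:", <-proxyA)
	fmt.Println("proxy B got:", <-proxyB)
}
```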

The control plane of a service mesh is typically operated by humans via a command-line interface (CLI), web portal, or other type of user interface.

What is a sidecar proxy?

Yes, a Service Mesh has a sidecar, just not this sort.

The service mesh is, at its core, a mesh of network proxies. Sidecar proxies are extra containers that proxy all connections to the containers where the services reside, for example in a container orchestrator like Kubernetes, and they are how app development teams implement the service mesh.

As the name implies, a sidecar proxy in a service mesh runs alongside a service or instance, such as a Kubernetes pod. Sidecar proxies enforce policies and collect telemetry in the data plane. They are capable of dealing with inter-microservice communication, monitoring, and security issues.
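
From the application's point of view, the sidecar pattern keeps the code boringly simple: the service talks plain HTTP to the proxy on localhost and lets it worry about mTLS, retries, routing and telemetry on the way out. The sketch below assumes a hypothetical local proxy port and upstream service name, not any particular mesh's defaults.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Talk to the local sidecar proxy, not the remote service directly.
	// Port 15001 and the upstream host name are assumptions for illustration.
	req, err := http.NewRequest("GET", "http://127.0.0.1:15001/inventory/items", nil)
	if err != nil {
		fmt.Println(err)
		return
	}
	// Tell the local proxy which upstream service we actually mean.
	req.Host = "inventory.service.mesh"

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("call via sidecar failed:", err)
		return
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status, string(body))
}
```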

What are the Sidecar benefits for developers?

Sidecar proxies enable developers to concentrate on developing, supporting, and maintaining microservice application code, while helping operations teams run the application and maintain the service mesh. To summarise, the service mesh enables these teams to decouple their work.

A container orchestration framework, such as Kubernetes, can be used to manage all of the sidecar proxies, and this orchestration becomes ever more important as the application’s infrastructure grows and the Service Mesh takes on more tasks.

So that is how it works, but what can we use it for?

Apart from the use cases associated with Service Discovery or API gateways, a Service Mesh opens up several other potentially very interesting options.  Traditional software programmes function in the following manner: a client transmits HTTP requests and receives the response from a web server. This server, in turn, communicates with another server that handles applications and possibly a database. However, if a company needs to update or improve any of the app’s functions, they are required to upgrade the entire application, resulting in heavy service interruptions.

When using a microservice architecture, each function of an application will run in its own separate and discrete container. Because the network serves in a similar capacity to an operating system, it is important that these containers be able to communicate with one another. When using microservices, it is possible to roll out new features or perform upgrades progressively and methodically, rather than having to change the entire application all at once. This is one of the advantages of using microservices.

When it comes to application developers, one of the most significant advantages of a service-mesh architecture is that it offers teams greater freedom in terms of how and when they test and release new functions or services. For instance, as explained below, canary deployments, A/B testing, and green-blue testing are all supported by a service mesh.

Canary releases and deployments

Application developers are able to put a new version of code into production by utilising a process known as a canary deployment or canary rollout. This allows them to transfer a proportion of users to the new version while the remaining users continue to use the previous version.

For instance, developers may start by routing only 10% of traffic to a new version of a service and then rely on the service mesh to validate that it is functioning as expected before extending the rollout further. Using this strategy, the developers can reliably bring additional components of the upgrade into the service mesh over time.
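
As a toy illustration of that 10% split, the weighted routing a mesh proxy applies per request can be sketched in a few lines of Go; the version names are made up.

```go
package main

import (
	"fmt"
	"math/rand"
)

// chooseVersion sketches the weighted routing behind a canary rollout:
// roughly canaryPercent of requests go to the new version, the rest stay on
// the stable one. In a mesh, a proxy applies this per request based on the
// policy it receives from the control plane.
func chooseVersion(canaryPercent int) string {
	if rand.Intn(100) < canaryPercent {
		return "checkout-v2" // the canary
	}
	return "checkout-v1" // the stable version
}

func main() {
	counts := map[string]int{}
	for i := 0; i < 1000; i++ {
		counts[chooseVersion(10)]++ // 10% canary, as in the example above
	}
	fmt.Println(counts) // roughly 900 v1 / 100 v2
}
```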

A/B testing

Additionally, enterprises are able to test out new concepts via the service mesh. Take, for instance, the case of a business that is interested in implementing a new web design for their Black Friday promotion. It is able to test the new design months in advance with the help of the service mesh in order to collect feedback from users. The store might give the new layout a try with five percent of their customer base first, and if it proves successful, they might offer it to the rest of their clientele on Black Friday.
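
A/B routing is often done deterministically, so the same customer keeps seeing the same variant; hashing a user ID is one simple way to bucket roughly five percent of users, as sketched below with made-up user IDs.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// inExperiment buckets users deterministically so that the same customer
// always sees the same variant during an A/B test; here roughly `percent` of
// users land in the new design.
func inExperiment(userID string, percent uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(userID))
	return h.Sum32()%100 < percent
}

func main() {
	for _, id := range []string{"alice", "bob", "carol", "dave"} {
		if inExperiment(id, 5) { // 5% of customers, as in the Black Friday example
			fmt.Println(id, "sees the new layout")
		} else {
			fmt.Println(id, "sees the current layout")
		}
	}
}
```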

Green-blue tests

The green-blue testing process, which is an engineering procedure for testing new versions of a service, can also be carried out by developers with the assistance of the service mesh. It requires running two production environments that are similar to one another, as well as monitoring for faults and unwelcome changes in the behaviour of users as more traffic is shifted from the older version of the service to the new one.
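
As a toy sketch of the idea, the two environments can be modelled as interchangeable targets with an atomically swappable "active" pointer; a real mesh expresses the cutover (and any rollback) as routing policy instead, and often shifts traffic gradually rather than all at once. The environment URLs are illustrative.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

func main() {
	// Two identical production environments.
	blue := "http://app-blue.internal"
	green := "http://app-green.internal"

	var active atomic.Value
	active.Store(blue)
	fmt.Println("serving from:", active.Load())

	// Once the green environment has been verified, cut traffic over to it;
	// if faults or unwanted changes in user behaviour appear, storing `blue`
	// again rolls everything back.
	active.Store(green)
	fmt.Println("serving from:", active.Load())
}
```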

What are the options for building a service mesh?

When it comes to constructing a service mesh, businesses can choose from a number of well-known offerings, many of which are open-source. Consul, Kong, AWS App Mesh, and Istio are a few examples of well-established options now available on the market.

Summary

You should now, hopefully, have a fundamental understanding of what a Service Mesh is, how it works, and which use cases might justify starting a Service Mesh transformation project. Although the architecture is still in its infancy, the community around it is very active and its capabilities are advancing at a rapid pace. In the next post, we will investigate the process of establishing a service mesh using HashiCorp’s Consul solution.

The final message is that this is a field worth monitoring closely, as there are many compelling use cases.

If you have questions related to this topic, feel free to book a meeting with one of our solutions experts, or email sales@amazic.com.
