The big three cloud vendors will have you believe that multi-cloud is too difficult, nay impossible. But to many, that stance is itself a form of vendor lock-in, similar in behavior to the practices of Microsoft, Oracle and other enterprise software giants of the pre-millennium IT space: one vendor to rule them all.
How is the behavior of AWS, GCP or Azure as cloud providers any different from this pre-millennium behavior? You could argue that Microsoft is still in the business of having one ring to rule them all with its Azure AD connectors for AWS and GCP. From the perspective of AWS and GCP, though, it makes excellent business sense for their identity management to support Active Directory, because it enables seamless, frictionless and secure RBAC-based access to their cloud resources.
However, identity management is not the core blocker for true multi-cloud enablement. The issue is much more complex than that. At its root, it is an issue with core infrastructure services: differing methods of deploying networks, differing methods of deploying virtual machines, differing templating methods, and differing underlying hardware technology that produces an inconsistent performance profile across deployments.
As with all projects, there are decisions to make, both technical and procedural.
Governance, security and compliance: All corporations, whether national or multinational, need to adhere to multiple regulatory requirements such as PCI, GDPR and data sovereignty rules. Non-US corporations also need to take into consideration US laws such as the Patriot Act and the CLOUD Act, which govern the behavior of the big three cloud providers because they are headquartered in the USA. From a governance perspective, each cloud provider also has different security and governance policies and different methods of securing its environment and demonstrating compliance, which leads neatly into our second consideration.
Siloed vendor tools: When engaging in a multi-cloud strategy, you will find that the common tools used to interact with each cloud are siloed within each cloud provider’s management console. Cloud monitoring tools are also local to each cloud provider, so your operational, deployment and customer support teams need broader knowledge. That leads to the next issue.
Shortage of skilled staff: The cloud landscape changes rapidly, with each of the big three deploying new services and constantly changing their console layouts, so suitably skilled staff are few and far between and in high demand.
Other issues include:
Lack of app management: You need to understand where your applications are located. This is important from the perspective of monitoring and management, but also regarding costs. Without tight management it is easy to end up with cloud sprawl. This is not a new phenomenon; it used to be called VM sprawl. Unfortunately, unlike local VM sprawl, where there was no massive bottom-line issue, cloud sprawl has a direct impact on ongoing costs because cloud is a pay-as-you-consume model.
It is spiralling costs that rapidly erode the benefits of cloud migration. We have all heard of the developer or operations engineer who has managed to cost a company thousands of euros, dollars, pounds or yen by not destroying or powering off a set of workloads when they were not being utilized.
All these concepts should be considered when managing a migration to a single cloud vendor, but when you are considering a multi-cloud consumption model these considerations are exacerbated.
And this is before you have to deal with the technical issues of secure inter-cloud network connectivity, identity management and inconsistent performance due to differing underlying technologies.
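To make the cost of forgotten workloads concrete, here is a minimal sketch that adds up what is paid for hours when a workload was running but not doing anything useful. The workload names, hourly rates and hour counts are invented for illustration and do not reflect any provider's pricing.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    hourly_rate: float    # hypothetical price per running hour
    hours_used: float     # hours of useful work in the month
    hours_running: float  # hours powered on (and billed) in the month


def idle_cost(w: Workload) -> float:
    """Spend on hours when the workload was running but unused."""
    idle_hours = max(w.hours_running - w.hours_used, 0.0)
    return idle_hours * w.hourly_rate


# Illustrative figures only.
workloads = [
    Workload("dev-cluster", hourly_rate=2.0, hours_used=160, hours_running=720),
    Workload("batch-test", hourly_rate=0.5, hours_used=40, hours_running=40),
]

total_idle = sum(idle_cost(w) for w in workloads)
print(f"Idle spend this month: {total_idle:.2f}")
```

In this made-up example the development cluster that was never powered off accounts for the whole of the idle spend, while the batch job that was destroyed after use costs nothing extra.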
You have decided to go multi-cloud anyway
Despite all the above-mentioned issues, you have decided to pursue a multi-cloud strategy because of concerns about vendor lock-in and/or sovereignty issues caused by a lack of available regions to provide resilience.
How can we make this easier?
This is where a good architect is worth their weight in gold. They will take these considerations and couple them with concepts like standardization, elevating management and monitoring to a new level above any individual cloud. I like to consider this the ring of Sauron: one ring to rule them all.
They will also introduce concepts such as Kubernetes, and standardization on the Open Virtualization Format (OVF) for moving templates from cloud to cloud and converting them to local formats such as AMIs, Azure images or GCP images. Cloud objects and resources will be built programmatically. Terraform your cloud and standardize on landing platforms in each cloud.
What is a common landing platform?
This is not a landing zone in the sense that the big three use the term, but a common platform of services, and by common I mean common across clouds. So, virtual machines, but be careful and use OVF to move common templates across clouds. Kubernetes, but again be careful to use only the highest common denominator of features across clouds; here GCP tends to lead the way, as Google is a major contributor to the Kubernetes platform. Use a common database platform. Things start to get a little murkier with serverless technologies or functions. The table below shows an example of a set of common services. It is not meant to be a complete list of commonality, but an example of what you are looking for when considering a multi-cloud deployment.
| Category | AWS | Azure | GCP |
| --- | --- | --- | --- |
| Compute Services | EC2, EKS | Azure Virtual Machine, AKS | Compute Engine, GKE |
| Storage Services | EBS, S3 (Object storage), S3 Glacier, EFS, AWS Transfer | Block Storage (Azure Disks), Azure Blobs, Archive Storage, Azure Files, Azure Migrate | Block Storage (Persistent Disk, Local SSD), Object Storage, Archival Storage, File Services (NFS), Data Transfer, Transfer Appliance |
| Database Services | Amazon RDS (PostgreSQL, MySQL, MSSQL) | Azure Database (PostgreSQL, MySQL, MSSQL) | GCP Database (PostgreSQL, MySQL, MSSQL) |
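One way to put the table above to work is to encode it as a lookup, so that tooling asks for a capability and gets the provider-native service back. The sketch below covers only a few rows; the capability keys (such as `virtual_machines`) and the function names are hypothetical labels chosen for this example, and the service names are taken from the table.

```python
# Capability -> provider-native service name, taken from the table above.
# The capability keys are hypothetical labels, not an established taxonomy.
COMMON_SERVICES = {
    "virtual_machines": {"aws": "EC2", "azure": "Azure Virtual Machine", "gcp": "Compute Engine"},
    "kubernetes": {"aws": "EKS", "azure": "AKS", "gcp": "GKE"},
    "block_storage": {"aws": "EBS", "azure": "Azure Disks", "gcp": "Persistent Disk"},
    "object_storage": {"aws": "S3", "azure": "Azure Blobs", "gcp": "Object Storage"},
    "managed_sql": {"aws": "Amazon RDS", "azure": "Azure Database", "gcp": "GCP Database"},
}


def native_name(capability: str, cloud: str) -> str:
    """Return the provider-native service for a common capability."""
    try:
        return COMMON_SERVICES[capability][cloud]
    except KeyError:
        raise ValueError(
            f"{capability!r} is not part of the common platform on {cloud!r}"
        ) from None


def common_capabilities(clouds):
    """List the capabilities that every one of the given clouds provides."""
    return sorted(
        capability
        for capability, names in COMMON_SERVICES.items()
        if all(cloud in names for cloud in clouds)
    )


print(native_name("kubernetes", "azure"))
print(common_capabilities(["aws", "azure", "gcp"]))
```

Anything a project asks for that is missing from the mapping raises an error, which is a cheap way of flagging a request that falls outside the common landing platform.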
OK, that looks simple. Not really! Other considerations include exactly how you are going to slice and dice your multi-cloud deployment.
What flavor of multi-cloud are you going to adopt?
The first concept of multi-cloud, and unfortunately the most common, is the arbitrary model. You most likely ended up here because you started moving into AWS, then Microsoft tapped you on the shoulder and gave you some free credits due to your enterprise agreement, so you started to move some things there. You set up some internal point-to-point VPNs for inter-cloud connectivity, or maybe you did not and are relying on your on-premises infrastructure to route between your cloud gateways. The result is that you most likely have no real idea of what is where, and no real grasp of what you are paying for the services. This cannot be considered a planned design.
The second model we can consider is segmented. Here, unlike with the arbitrary model, you have started with a plan: products X and Y will be deployed on AWS and product Z on Azure, for example. The placement decisions are based on workload type (legacy or cloud-native), data classification level (restricted, classified, etc.) and the type of application (legacy three-tier application, data analytics, collaboration software). We have found that this approach tends to slip back to the arbitrary model as the different CSPs mature their offerings and the finance department says that cloud A’s bill is too high.
Arguably the first two concepts are not true multi-cloud deployments, just deployments that are using different clouds.
The next three concepts require a bit of joined-up thinking and design considerations before implementation. Our Third concept is Choice.
This model is about minimizing lock-in, or the perception of lock-in, and being able to deploy workloads freely across cloud providers. Usually it will require additional abstraction layers. This ambition again breaks down into multiple flavors. The less complex and more common case involves the initial choice of which cloud platform a workload is placed on, with a working assumption that you don’t keep changing your mind. At first glance, this seems no different from the previous concept of segmentation. However, there is a significant difference, and that concerns the corporate governance and policy constraints that prevent random application movements.
The advantage of this setup is that projects are free to use proprietary cloud services, such as managed databases (depending on their preferred trade-off between avoiding lock-in and minimizing operational overhead). Hence, this setup makes a good initial step for multi-cloud. However, it still has many limitations, not least the fact that you are still beholden to a single cloud provider’s SLA for each application or service deployed.
Stage two in true multi-cloud enlightenment is parallel deployment.
Here an enterprise will deploy the same application or service in different clouds to remove the risk of a single cloud provider outage taking out that service. In principle, this offers a higher level of availability than could be achieved with a single provider, even when using multiple availability zones or a single-cloud multi-region approach. Moving to this second layer of enlightenment requires decoupling functions such as identity, deployment automation and monitoring from any single CSP. This can be handled by embracing open-source components. The decoupling is a quick win for things like common compute and containers/Kubernetes, as how to deploy these across multiple clouds is well known and understood. For templating and image management, initial compute and Docker images are built by Packer and converted to OVF format to be uploaded into each CSP’s native format. Deployment is handled by Terraform, and application configuration is managed by Ansible, Puppet or Chef. Workflow is managed by Jenkins or similar. Authentication and identity management are handled by a cross-cloud identity provider or a secrets manager like HashiCorp Vault. With monitoring, again, OSS is your friend. The downside is that cloud-specific advantages and enhanced features may be lost because, as mentioned earlier, deployment will be constrained by the highest common denominator of functions per service that can be supported.
It must be noted that the additional flexibility comes at a cost, and this cost is added layers of complexity, additional levels of abstraction and new tooling to understand, deploy and manage. Also, this will NOT protect an environment from a misdeployed or broken application or service. If it is deployed broken to Cloud A and also broken to Cloud B, the result is still a loss of service. Therefore, other application design practices need to be considered, such as blue-green deployment policies, to confirm service after an upgrade without affecting availability.
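The availability argument, and its limit, can be put into rough numbers. The sketch below assumes that provider outages are statistically independent, which is a simplification, and uses an illustrative 99.9% availability figure rather than any provider's actual SLA. The `release_success` parameter is a hypothetical stand-in for the chance that a release is not broken in every cloud at once.

```python
def parallel_availability(availabilities):
    """Probability that at least one deployment is up,
    assuming each provider fails independently of the others."""
    p_all_down = 1.0
    for a in availabilities:
        p_all_down *= 1.0 - a
    return 1.0 - p_all_down


def with_broken_release_risk(availabilities, release_success):
    """A broken release hits every cloud at once, so it scales the
    combined figure instead of being masked by the second cloud."""
    return release_success * parallel_availability(availabilities)


single = 0.999  # illustrative figure, not a quoted SLA
dual = parallel_availability([single, single])
dual_with_bad_releases = with_broken_release_risk([single, single], release_success=0.99)

print(f"single cloud:            {single:.6f}")
print(f"two clouds in parallel:  {dual:.6f}")
print(f"with 1% broken releases: {dual_with_bad_releases:.6f}")
```

Under these assumptions two clouds take the figure from three nines to roughly six, but a one percent chance of shipping a broken release to both drags the result below what a single cloud offered in the first place.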
The third level of enlightenment is portability. This is the perceived nirvana.
At this level, you can deploy applications and services in the true Martini manner; any time, any place, anywhere. Here the perceived advantages are easy to grasp, deploying your workloads to whatever environment is best for the service. Deploying to Cloud A and bursting to Cloud B during periods of heavy utilization.
True portability requires a very high level of automation; the 80/20 rule will not do. With the parallel deployment methodology some manual intervention could be tolerated, but true portability requires seamless automation.
There will be a requirement for a multi-cloud abstraction layer or framework, be that a home-rolled one or a commercially available alternative. Whilst this appears to give you freedom from CSP lock-in, it does nothing more than swap one set of handcuffs for another, as you are now effectively locked into your multi-cloud solution.
Another major constraint or risk to true portability is that these abstractions do not take care of your data needs. As a result, enterprises attempting to introduce this level of automation and capability into their strategy need to be very cognizant of the egress charges on data leaving Cloud A whilst being synced into Cloud B or C.
Cross-cloud networking is another major area of concern: globally aware load balancing, and DNS updates to keep potentially changing IP addresses in sync with static, globally addressable names for services that can be accessed from various locations. Add to that keeping data synchronization secure when crossing clouds, and pan-cloud identity management. These are the things that will keep your ops staff awake at night.
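A back-of-the-envelope estimate shows how quickly egress charges add up when data is kept in sync across clouds. The function below is a sketch only: the price per GB is a hypothetical placeholder, not a quoted rate, and real pricing is usually tiered and varies by provider, region and destination.

```python
def monthly_egress_cost(gb_changed_per_day, price_per_gb, target_clouds, days=30):
    """Estimate the cost of replicating daily data changes out of one
    cloud into each target cloud, using a flat price per GB."""
    return gb_changed_per_day * days * price_per_gb * target_clouds


# Hypothetical inputs: 500 GB of changed data per day, synced from
# Cloud A into Cloud B and Cloud C at a placeholder price per GB.
estimate = monthly_egress_cost(
    gb_changed_per_day=500,
    price_per_gb=0.09,  # placeholder value, check your provider's price list
    target_clouds=2,
)
print(f"Estimated monthly egress: {estimate:.2f}")
```

Because every additional target cloud multiplies the bill, the data synchronization design deserves as much attention as the compute abstraction.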
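The DNS part of this problem comes down to reconciling what is published with where the service currently lives. A minimal sketch of that reconciliation step follows, using plain dictionaries in place of a real DNS provider API; the function name, record names and addresses are all hypothetical, with the addresses drawn from documentation-only ranges.

```python
def plan_dns_updates(desired, published):
    """Compare the records that should exist with those currently
    published and return what has to change.

    Both arguments map a static, globally addressable name to an IP.
    Returns (upserts, stale): records to create or repoint, and the
    names that are published but no longer wanted.
    """
    upserts = {
        name: ip for name, ip in desired.items() if published.get(name) != ip
    }
    stale = sorted(name for name in published if name not in desired)
    return upserts, stale


# Hypothetical state after a service has moved from Cloud A to Cloud B.
desired = {
    "app.example.com": "203.0.113.10",
    "api.example.com": "198.51.100.7",
}
published = {
    "app.example.com": "192.0.2.44",
    "api.example.com": "198.51.100.7",
    "old.example.com": "192.0.2.9",
}

upserts, stale = plan_dns_updates(desired, published)
print("repoint:", upserts)
print("remove:", stale)
```

Computing a plan first and applying it as a separate step keeps the change reviewable, which matters when a wrong record can take a service offline in every cloud at once.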
Summary
The promise of multi-cloud is very enticing; it is the shiny new thing. However, the risk when undertaking a multi-cloud deployment strategy is very much like the adage of boiling a live frog. Throw the frog into a pot of boiling water and it will jump out, but gently warm the water to boiling point and it will happily sit there until it is cooked.
Multi-cloud is a deceptively enticing concept. The ability to dynamically move workloads across clouds with impunity appears to be the ultimate in flexibility and agility. The reality is that currently (March 2021) the deployment challenges are many and difficult. Whether you roll your own by automating the deployment of landing zones across clouds or use third-party products to do so, there are many complex issues to consider and mitigate on the journey down this path. Be prepared for a very expensive trip if you do not strictly define your endpoint. From the perspective of a CSP who resells AWS, Azure and/or GCP, cloud tooling to provide generic landing zones in each cloud makes sense, as it will streamline your client onboarding with a single interface. However, for the vast majority of enterprises, multi-cloud at the IaaS level makes no valid commercial sense. The work needed to provide the perceived benefits could be cost-prohibitive. Yes, there are outliers and there always will be.
The reality is that multi-cloud enablement will only lead to another form of lock-in, be that home-made or from a third-party vendor with their Sauron’s ring to rule all clouds.
The end result is that you are back to the argument of “we want to avoid a situation where we have vendor lock-in.” Perhaps a better multi-cloud position is to use the right cloud for the service being utilized, spend your money on creating cloud-native applications, and just properly manage your crown jewels. Your data is your company’s most important asset, but that is for another article.