Teams across all industries have adopted a DevOps way of working to release new versions of their applications. Pushing application source code through a CI/CD pipeline into running applications (components) in production happens without much hassle. Database-related changes are a different story: they can have a significant effect on those same applications and often slow down new releases. If that sounds familiar, this article is for you: database changes and DevOps, and the key considerations to keep in mind.
First, it’s essential to understand why database-related changes often do not follow the DevOps approach. Here are three reasons that apply in many organizations.
One of the key reasons is the absence of database administrators from (architectural) design decisions. Developers have the freedom to execute DB changes such as adding tables, creating table indexes, or writing stored procedures in Development and Test environments. Only when those changes reach the Acceptance and Production environments do they need to involve database administrators, so DBAs are easily overlooked.
Keeping data consistent, reliable, and secure is a specialty that developers often don’t have; DBAs bring a different skill set. On top of that, most developers want to focus on business logic instead of DB changes. Developers are often reluctant to change the structure of a database, since every change can have a significant impact on the database itself or the data stored in it.
This combination can lead to errors in the later stages of the software development life cycle. Errors that are difficult and expensive to fix.
All of these arguments do not contribute to an easy adoption path for DB changes in the DevOps cycle.
Changes to databases pose unique challenges to organizations that need to be addressed before they can be included in the DevOps workflow.
First of all, database changes are cumulative: every change builds on the ones before it and must be applied in a specific order. Mixing up that order will definitely break things or make your data inconsistent. This has a direct impact on problem handling and error handling when it comes to changes.
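To illustrate, here is a minimal sketch (in Python with SQLite, using hypothetical table names) of how ordered, cumulative migrations are typically applied and tracked, so that each script runs exactly once and always in the same order:

```python
import sqlite3

# Hypothetical ordered migrations: each script assumes the state left
# behind by the previous one, so the order must never change.
MIGRATIONS = [
    (1, "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE customers ADD COLUMN email TEXT"),
    (3, "CREATE INDEX idx_customers_email ON customers (email)"),
]

def apply_migrations(conn: sqlite3.Connection) -> list:
    """Apply any migrations newer than the recorded schema version."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0
    applied = []
    for version, sql in MIGRATIONS:
        if version > current:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
            applied.append(version)
    conn.commit()
    return applied

conn = sqlite3.connect(":memory:")
print(apply_migrations(conn))  # [1, 2, 3] -- all three migrations run
print(apply_migrations(conn))  # [] -- second run is a no-op
```

Real migration tools add more (checksums, locking, failure handling), but the core idea is the same: a version table in the database records how far the cumulative chain has been applied.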
When such changes are carried out manually, the result is a lack of visibility and traceability of database changes. That brings us closer to an unwanted scenario: rollbacks.
Databases consist not only of a defined structure (except for NoSQL databases) and logic but also of the actual data. This makes it very hard or even impossible to “reverse” a change, since the order in which data is entered into the database matters.
Rollback plans are pretty common in every organization and can be seen as “escape hatches” when something goes wrong. However, they have a negative impact on the DevOps way of working, in which teams strive for continuous delivery:
- Rolling back a large database costs a lot of time, especially when the amount of data is massive and steadily growing. Rollbacks are “reactive” instead of proactive and represent an exception-based approach to problem-solving.
- No one wants (a lot of) downtime, since downtime means a loss to the business.
- It creates a culture that conflicts with DevOps practices such as pushing small increments and shipping small fixes when something goes wrong.
Once your organization has overcome the above-mentioned challenges, it’s worth looking at typical considerations for implementing DevOps-related activities.
There should be no debate about storing all DB-related configuration and deployment scripts in your source code versioning tool. This also applies to patches and sample queries, as well as scripts that alter the structure of your database.
Two approaches to managing change
Database changes come in two flavors: state-based and migrations-based. The declarative or desired-state approach describes what the database looks like as if you designed it from scratch; for example, there are no “alter table” statements to change the columns or indexes of a table. A migrations-based approach describes the opposite: how the database structure should be changed relative to a certain previous state.
Both approaches have pros and cons such as the following:
- The migrations-based approach results in scripts that can be written early in the SDLC and thus reviewed by team members before they are executed. A consequence of this approach is that the person who writes the SQL script needs expert knowledge of the current context and status of the applications as well as the database itself.
- Describing the desired end state is faster than writing migrations, since you don’t need to take the current context into account. You can construct scripts following best practices, or let a framework generate them for you. However, the delta script is generated at delivery time, so you need to manually review it by comparing the version of the script in source control against the target database.
Keep in mind that mixing these approaches makes your CI/CD pipeline very hard to manage. It’s best to stick to one of them and focus on a single solution.
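To make the contrast concrete, here is a small sketch (Python with SQLite, hypothetical schema) showing a migrations-based delta on one side and the desired end state on the other. Applied correctly, both paths converge on the same schema:

```python
import sqlite3

# Migrations-based: an explicit delta written against the known previous state.
MIGRATION = "ALTER TABLE orders ADD COLUMN shipped_at TEXT"

# State-based: the full desired definition, as if designed from scratch;
# a state-based tool would diff this against the live schema to derive the delta.
DESIRED_STATE = (
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL, shipped_at TEXT)"
)

def columns(conn, table):
    """Return the column names of a table, in declaration order."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

# A database that already contains the previous version of the table.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
live.execute(MIGRATION)  # migrations-based path: apply the delta

# A fresh database built directly from the desired state.
fresh = sqlite3.connect(":memory:")
fresh.execute(DESIRED_STATE)

# Both paths end up with the same schema.
print(columns(live, "orders") == columns(fresh, "orders"))  # True
```

The difference in practice is who writes the delta: a person (migrations-based) or a diffing tool at delivery time (state-based).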
As with software application changes, you need to keep DB changes small: not necessarily to manage them better, since that might be harder, but to keep an eye on unexpected side effects.
Flyway is a database migration tool that lets you build and execute small increments which follow each other in time. It helps you keep your database consistent. A simple rule of thumb to keep increments small is the following:
- Put one script in a single file which is executed in isolation
- Only include a single operation that is carried out
- On a single object
This prevents overly complicated scripts that touch various objects and execute operations sequentially or even in parallel. Debugging and troubleshooting are much easier when things don’t go as planned.
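As a sketch of how the “one operation per script” rule could be enforced as a CI check (using a deliberately naive statement count; real SQL parsing is more involved and would need to handle semicolons inside strings and triggers):

```python
# Hypothetical CI gate: reject any migration file containing
# more than one SQL statement.
def count_statements(script: str) -> int:
    # Naive split on ';' -- good enough for simple DDL scripts,
    # not for statements that embed semicolons in string literals.
    return sum(1 for s in script.split(";") if s.strip())

ok_script = "CREATE INDEX idx_orders_total ON orders (total);"
bad_script = """
CREATE TABLE invoices (id INTEGER PRIMARY KEY);
ALTER TABLE orders ADD COLUMN invoice_id INTEGER;
"""

print(count_statements(ok_script))   # 1 -- passes the gate
print(count_statements(bad_script))  # 2 -- fails the gate
```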
After these considerations, it’s time to focus on CI and CD. Database changes should be integrated and validated through testing. Two key questions arise: what (which unit or component) should be tested, and during which stage of the CI/CD pipeline? Smaller batches help answer those questions. Continuous Delivery helps push out changes to target environments with ease, but problems can arise if you have not considered the following:
- Which applications are affected when the database (structure) changes? Does that have an impact on the functionality of the connected application? What about response times, processing times, latency, etc?
- Will there be any downtime to the database or can you roll out your new release without any downtime at all? What will be the effect of downtime on the other applications? Will they fail or gradually handle the downtime in such a way that end-users won’t notice it?
- How much time is needed to recover from an unexpected failure? This should be known up-front to make an informed decision in case of risky scripts.
- Suppose the database change affects the response messages of the API which is exposed to other services. Is everyone informed in case of a breaking change?
As seen from these considerations, small increments help to keep things under control.
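One common way to address the downtime and compatibility questions above is the expand/contract (parallel change) pattern: add a new nullable column first instead of renaming or dropping one, so old and new application versions can run side by side during the rollout. A minimal sketch with a hypothetical table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('Ada')")

# Expand: add a new nullable column instead of renaming or dropping,
# so the old application version keeps working during the rollout.
conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")

# Old application code path: unaware of the new column, still works.
old = conn.execute("SELECT name FROM users").fetchall()

# New application code path: falls back while data is being backfilled.
new = conn.execute("SELECT COALESCE(full_name, name) FROM users").fetchall()

print(old, new)  # [('Ada',)] [('Ada',)]
```

Only once all consumers have moved to the new column would a later, separate increment drop the old one (the “contract” step).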
CI/CD pipeline deployment scenarios
So far so good; the next step is the deployment scenario for the database as well as for the application itself. Both require three main steps to roll out a new increment to production:
- Committing the actual source code to the corresponding Git repository
- Running Continuous Integration steps such as compiling and testing the application as well as validating the SQL scripts
- Actually delivering the changed application and its accompanying database
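For the CI step of validating SQL scripts, one simple check is to enforce a consistent naming convention on migration files before the pipeline proceeds. A sketch, assuming a Flyway-style `V<version>__<description>.sql` convention (the pattern and filenames here are illustrative):

```python
import re

# Hypothetical CI step: verify migration filenames follow a
# Flyway-style "V<version>__<description>.sql" convention.
PATTERN = re.compile(r"^V(\d+)__[a-z0-9_]+\.sql$")

def validate(filenames):
    """Return the filenames that do not match the convention."""
    return [f for f in filenames if not PATTERN.match(f)]

files = ["V1__create_customers.sql", "V2__add_email_index.sql", "fix_stuff.sql"]
print(validate(files))  # ['fix_stuff.sql']
```

A real pipeline would fail the build when the returned list is non-empty, keeping ad-hoc scripts out of the delivery path.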
Those three steps appear in the following four scenarios, which gradually climb the DevOps ladder. Each later scenario narrows the gap between DevOps theory and how you actually make things work in practice.
Scenario 1: separated teams and independent pipelines
Organizations that have just started with these topics begin with independent pipelines for both the application and the database. The application team and the DBAs do not work in tandem but maintain their very own pipelines. The biggest benefit of this scenario is the clear separation of concerns in technical terms, yet the teams need to communicate a lot to keep everything in sync. Since this scenario is not ideal, it’s only beneficial for a completely independent database that does not directly interact with a connected application.
Scenario 2: deliver the database and application in one go
Both the database and the application have their own source code repositories, and their integration paths are isolated from each other until both reach the delivery phase. They share the same Continuous Delivery process, which acts as the contact point between them. There is no real synchronization between them; new changes are simply delivered in one go.
Scenario 3: build and deliver together
This is an interesting “intermediate step” just before an organization reaches the highest level of integration. The CI process is shared to some extent, while the source code still lives in distinct repositories. An example would be to only build, integrate, and deliver a new version of the application once the DB change has been verified and the CI process finishes successfully. It requires a positive outcome for both parties to guarantee a reliable change.
Scenario 4: application and database in a single pipeline
At the most sophisticated level, you can deliver either a new version of the database, a new version of the application, or both. There is constant synchronization of both aspects, and there is only one pipeline for the CI as well as the CD phase. This scenario fully utilizes the opportunity DevOps gives to developers and (in this case) DBAs as the (system) operators, thus completely following the core principle of connecting Dev and Ops to build and maintain reliable applications.
In this article, you’ve read a number of key considerations and practical aspects of integrating database-related changes into the DevOps cycle. There are some key differences in handling DB changes compared to application-related changes, but there are also similarities that are easy to incorporate into your current way of working. Four CI/CD pipeline scenarios help you gradually increase the level of synchronization between the two.
Some final notes to boost these initiatives: increase the exchange of knowledge between developer teams and database administrators, and build a mutual understanding of how to apply these practices in reality. This improves feedback on how the application and database work together, since they are closely related in nine out of ten cases. The more successful the changes, the more confidence people gain to actually change their systems, which propels new changes and thus the delivery of new business features.