Updating a database schema is pretty easy
if you can take your application offline. You shutdown the application, create a backup of the current database schema, perform allrequired update operations using tools like
Flyway or Liquibase, restart the application and hope that everything works fine. But that changes if your customers don’t accept any downtime. Simple changes, like removing a column or renaming a table, suddenly require a multi-step migration process. The reason for that is that high-available systems make heavy use of redundancy. Hi, I’m Thorben Janssen and in this video, I will show you how to update your database without a downtime. But before we proceed, if you are new here and want to learn how to build incredible efficient persistence layers with JPA and Hibernate and all types of other Java persistence related stuff start now by subscribing and clicking the bell, so you don't miss anything. If you want to build a high-available system, you need to run at least 2 instances of every subsystem. So, in the simplest case, you need at least 2 instances of your application and 2 instances of your database server. The redundancy of all subsystems provides a lot of benefits. The two most important ones are: It increases the number of parallel requests
that your system can handle. and It makes sure that your system is still up and running even if an instance of one of your subsystems isn’t available. But they also create new challenges. I will not dive any deeper into topics like monitoring, tracing, load balancing, and fault tolerance. If you don’t have any experience with high-available
systems, you should read about all of them. The good news is that there are several great
tools and libraries available that help you solve these challenges. In this video, I want to focus on database schema migration for high-available systems. The redundancy of the application plays a critical role during the migration. It enables you to perform a rolling update. The implementation of a rolling update depends
on your technology stack. But the idea is always the same: You have a bunch of instances of a subsystem and you shutdown, update, and restart one instance
after the other. While doing that, you run the old and the new version of your application in parallel. The Kubernetes documentation contains a nice, visual description of the rolling update concept. The rolling update adds a few requirements to your database migration. You no longer need to just adapt the database
in the way it’s required by your application; you also need to do it in a way that the old and the new version of your application can work with the database. That means that all migrations need to be backward-compatible as long as you’re running at least one instance of the old version of your application. But not all operations, for example renaming or removing a column, are backward compatible. These operations require a multi-step process that enables you to perform the migration without breaking your system. Let’s take a closer look at the backward-compatible
operations first. Backward-compatible operations are all operations that change your database in a way that it can be used by the old and the new version of your application. That means that you can execute them during a migration step and don’t need to split them into multiple operations. Adding new tables or views doesn’t affect the old instances of your application. You can perform them without any risk. Just keep in mind that while you’re performing the rolling update, some users might trigger write operations on old application instances. These old instances, obviously, don’t write any data to the new tables. You might need to clean up your data and add
the missing records to the new table after all application instances have been migrated. It can be a little bit harder to add a new column. You don’t need to worry if you add a database column without a not null constraint. In that case, your operation is backward compatible,
and you can simply add the column. That’s not the case for columns with a not
null constraint because it will contain null values for all existing records. You can easily fix that by providing a default value; please check your database documentation on how to do that. If you don’t want to define a default value,
you need to execute 3 statements to add the column with the constraint: First add the column without a default value and update all application instances. second run a database script to fill that field in
all existing records. and third add the not null constraint. The good news is that you can execute all 3 statements within the same migration step. Removing a database column that is neither
accessed by the old nor the new version of your application is also a backward-compatible operation. No application is using that column anymore so there is also no application that could be affected by its removal. The removal of the constraint itself is a backward-compatible operation. The old version of your application can still
write to the database in the same way as it did before. But you need to check if there are any old use case implementations that would break if any database record doesn’t fulfill the
constraint. During the rolling update, instances of thd new version of the application might write some records that do not comply with the no-longer existing constraint. If that breaks any old code, you’re in trouble, and I don’t know any good way to solve it. You can’t remove the constraint because
some reading use cases of the old version will break. You also can’t keep the constraint because some write operations of the new version will fail. Your only option is to remove the constraint
and to roll-out the update quickly. Backward-incompatible operations are the reason why I recorded this video. These are all the operations that change your
database schema in a way that it can no longer be used by the old version of your application. You need to break these operations into a
backward-compatible part which you perform before you update your application and a second part that you execute after you updated all application instances. In most cases, that requires you to add a
new column or table in the first and to remove the old one in a later step. This makes the migration process more complex than it would be if you didn’t perform a rolling, zero-downtime update. To make the migration process easier to execute
and less error-prone, you should use a that performs automatic, version-based database updates. The two most popular ones are Flyway and Liquibase. I wrote a series of tutorials about both of
them and I will add the links to them in the description. And now, let’s take a look at some backward-incompatible operations and how you can split them into parts that don’t break your system. Renaming a column or table or view sounds simple, but it requires 3-4 steps if you want to use a rolling update that doesn’t cause any downtime. The steps needed for all 3 of them are identical. I, therefore, only explain how to rename a
database column. In my experience, this is the most common operation. The migration always follows the same concept, but the implementation differs based on the capabilities of your database. But more about that later. Let’s first take a look at an example. The table review contains the column comment
which I want to rename to message. This requires multiple steps. In the first one, you need to add the database column and initialize it with the data from the old column; then you need to update all application instances before you can remove the old column. Unfortunately, the most complicated part isn’t the database migration itself, and it’s, therefore, not visible on this diagram. The main issues occur during the rolling update, which is between step 1 and the new version. While you’re updating your application instances, you’re running old and new versions of your application in parallel. The old version is still using the old database
column, and the new one is using the new column. So, you need to make sure that both use the same data and that you don’t lose any write operations. There are 2 general ways to achieve that. Option 1: Sync with database triggers The migration process is a little bit easier if your database supports triggers. So let’s start with this one: you add a column with the new name and the same
data type as the old one. You then copy all data from the old column to the new one. You also need to add database triggers to
keep both columns in sync so that neither the old nor the new version of your application works on outdated data. Perform a rolling update of all application instances. and finally you remove the old database column and the database triggers. If you update your database during application startup, step 1 and 2 are executed as 1 step. Option 2: Sync programmatically Some databases don’t support triggers, and you need a different approach. In these cases, you need to perform 4 migration steps, and you might lose some write operations during the update if you don’t switch your application into a read-only mode. Add a column with the new name and the same
data type as the old one. You then copy all data from the old column to the new one. the second step make sure that the new version of your application reads from and writes to the old and the new database column. Let’s call this version new1. Please also keep in mind that there are still old instances of your application that don’t know about the new column and that can write new and update existing records at any time. As your database doesn’t synchronize the write operations, you need to do that in the code of version new1. After you’ve made sure that the new1 version of your application can handle this situation, you can perform a rolling update of all application
instances. step 3. All of your application instances now run version new1 which knows about the new database column. You can now perform a rolling update to application
version new2 database which only uses the new database column. and in the 4th step you remove the old database column. Similar to the previous approach, you can
reduce the number of required steps if you run the database migration during application
startup. In that case, you can execute step 1 and 2 as 1 step. You can change the data type of a column in almost the same way as you rename the column. The only difference is that you also need
to convert all values stored in the old column to the data type of the new column. and I’m sorry to tell you that you can’t remove a column/table/view thats still used by the old version At least not now. You first need to update your application
so that there is no running instance of it that still uses it. After you’ve done that, you can remove the
no longer used column/table/view from your database. Migrating a database schema without downtime is possible, but it often requires a complex, multi-step approach. It requires you to change your database in
a backward-compatible way so that the old and the new version of your application can use it. As you’ve seen in this video, not all migration operations are backward-compatible. But you can split them into multiple steps
so that you can create a database version that can be used by both versions of your application. In most cases, that requires you to add a
new column or table or view which will be used by the new version of your application. After you updated all application instances,
you can then remove the old one. Ok, that's it for today. If you want to learn more about Hibernate, you should join the free Thoughts on Java Library. It gives you free access to a lot of member-only content like a cheat for this video and an ebook about using native queries with JPA and Hibernate. I’ll add the link to it to the video description below. And if you like today’s video, please give it a thumbs up and subscribe below. Bye