Solid DB Migrations Using Flyway
When you’re changing your code, you often need to change your database too, and that can be a tricky business. The tried-and-found-wanting “Bunch of SQL Scripts” approach often works in the short term for small applications, but as time goes on, this approach will become a serious stumbling block as you try to automate deployments and as the number of environments you manage grows.
A problem that needed solving
At Qualica, we manage around 40 environments, all with different versions of the application deployed. Our problem was that we simply could not keep track of what was going on. Our initial solution for managing our database migrations was a home-grown python script. It applied our SQL migration scripts in order and kept track of migration scripts already applied. At first, it showed promise, but it soon gave way to its own set of issues.
Its major flaw was that it did not always detect a failure when applying a migration script, so it would simply keep applying scripts, compounding one failure on top of the other. This would often not be noticed until we got some unexpected behaviour during testing. By then, we would have wasted a lot of time.
The second major problem that arose was that we did not have a way to facilitate more complex migrations. One such situation involved altering the structure of serialised Java objects in a table, which is something that SQL is ill equipped to handle.
At this point, we decided that instead of investing more time and effort into a home-grown solution, we should look at a tool that is already solving this problem in a reliable way. Enter Flyway.
A trusty tool in a time of need
A trusty tool in a time of need
is a migration tool that supports a wide range of relational databases. The core goal of Flyway is to evolve databases reliably and predictably across multiple environments, and to make it simple to do.
Knowing the fundamentals of how Flyway works will help to understand how Flyway achieves this goal. When Flyway is brought into your environment, either by creating the database from scratch or by running the “BASELINE” command on your existing database, Flyway creates its metadata table. In this table it stores, amongst other things, the version number and a checksum of the contents of an applied script.
Flyway: the good parts
There are three aspects about Flyway that I find particularly valuable in building reliability and predictability into our migration process.
The first valuable aspect of Flyway is that running it is idempotent. As mentioned above, Flyway tracks which scripts it applies, so you’ll never end up accidentally applying the same script twice. Closely related to this is the fact that running Flyway is deterministic, because it runs your scripts in order. This means you’ll get the same result from applying your migration scripts every time you run it. This gives you confidence as you take your migrations through your various environments.
The second valuable aspect, and one of the main reasons we adopted Flyway, is that Flyway supports Java-based migration scripts. This is extremely useful when you have a particularly tricky migration that can be simplified by approaching it from an object-oriented point of view. This is how we solved the difficult migration of serialised Java objects. We simply loaded the old objects using the old version of the class, transformed it to the new version and saved it again. If we had stuck with our home-grown tool, we would have had this little sideline migration that we would have had to manage separately. This is not great when your goal is to automate migrations!
The third valuable aspect is how Flyway deals with failure. In the event that a migration script fails, the operations in that migration are rolled back if your database supports it. Flyway will add a failure entry into its metadata table which locks Flyway into a failure state. This is great because it will stop dependent migration scripts from being applied against a database missing some required changes. After ensuring that the cause of the failure and that the changes from the failed script have been rolled back properly, the “REPAIR” command can be used to clear out the failure entry. Now you can retry the migration.
In the event that the contents of a script that’s already been applied are changed, Flyway will not allow migration to continue either. This stops a database from missing out on migration statements added after the fact. After investigating the changes to the script and aligning the database with it, one can run the “REPAIR” command to regenerate the contents checksum of the script. This will allow migrations to continue.
In the wild
When implementing Flyway in your own environments, you would ideally automate your database migration as part of your automated deployments. This means that migration happens in the same way, every time, for every environment, which gives you that sought-after determinism. This is exactly what we did at Qualica. Now, whenever there is a problem with a migration, the deployment is stopped and a notification is sent out informing us of the incident.
The process is working very well for us now, but it did have its teething problems. It was only once we got our automated testing sorted out that things started running smoothly. The key was to move any failure as close to the developers as possible, instead of the process failing during deployment, which would delay and frustrate our testers.
The next step
Once you have all this automated deployment and testing down pat, what’s next? One potential application is embedding Flyway directly into an application. This could be very interesting when one considers desktop and even smartphone applications. These applications could migrate their own local or in-memory databases in the same trustworthy way you would use in your own environments, without having to write any of the supporting code to make it happen.
If you are working with relational databases in your own environments and you don’t have a database migration tool in place, I encourage you to take an hour or two to play with some of the options out there. It’s surprisingly easy to get started and may save you a huge amount of time and frustration.
Leave your ideas, feedback and stories in the comments section below and we’ll have a chat about them.
06 July, 2016