Database schema migration with LiquiBase

One item of concern for me during development is figuring out what to do once I have actual live data in the database and can no longer do schema updates by dropping all the tables and letting hibernate re-create them.

A solution I’ve run into today that seems promising is LiquiBase. It’s still very much in development but they are working on roughly what I want.

The idea behind LiquiBase is fairly simple – store a list of database changes in a file, and let the user run them. It keeps track of which changes have already been applied (in a tracking table in the database itself), so it can roll forward any new changesets you add to the file. This is a great idea!
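
For example, a changelog file might start out looking something like the sketch below – the table and column names here are invented, and the exact XML header varies by LiquiBase version:

    <databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog">

      <!-- Each changeSet is identified by its id and author. LiquiBase records
           the ones it has already applied in a tracking table in the target
           database, so re-running the changelog only applies the new ones. -->
      <changeSet id="1" author="dobes">
        <createTable tableName="customer">
          <column name="id" type="bigint">
            <constraints primaryKey="true" nullable="false"/>
          </column>
          <column name="name" type="varchar(255)"/>
        </createTable>
      </changeSet>

      <changeSet id="2" author="dobes">
        <addColumn tableName="customer">
          <column name="email" type="varchar(255)"/>
        </addColumn>
      </changeSet>

    </databaseChangeLog>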

The only trouble I have with it is that I’m using hibernate to generate my database schemas, and I’m generally pretending I don’t know anything about the schemas it generates (sequences, foreign keys, foreign key names, not-null constraints, etc.). LiquiBase generally assumes you know what changes are being made and that you’ll be entering them into the changelog file as you go.

LiquiBase does provide a diff tool which allows me to generate XML changesets that would convert one schema to another. I should be able to take that XML diff and “fix it up” to replace drop/create with a rename, or provide a custom data upgrader.
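
For instance (names invented), where the diff proposes dropping a fullname column and creating a full_name column, I could hand-edit the generated changeset into a rename so the data survives:

    <changeSet id="3" author="dobes">
      <!-- replaces the dropColumn/addColumn pair the diff tool generated -->
      <renameColumn tableName="person"
                    oldColumnName="fullname"
                    newColumnName="full_name"
                    columnDataType="varchar(255)"/>
    </changeSet>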

That last bit is the part that seems most promising: I can specify a Java class which is instantiated and used to roll a changeset forward or back. I think this is what I’d use to manage the “real” schema changes – adding new relationships, for example. Without something like this, there’s no way for me to transform the data properly. So, I can tweak the changesets to prepare for my data upgrade, have them invoke my class, then finish the job – probably creating new columns and removing constraints first, then running my custom code, then dropping columns and adding constraints afterwards.
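
As a rough sketch of what such a class might look like – written against LiquiBase’s CustomTaskChange interface, whose package names and method signatures vary between versions, and with invented table and column names:

    package com.example.migration; // hypothetical package name

    import java.sql.Statement;

    import liquibase.change.custom.CustomTaskChange;
    import liquibase.database.Database;
    import liquibase.database.jvm.JdbcConnection;
    import liquibase.exception.CustomChangeException;
    import liquibase.exception.SetupException;
    import liquibase.exception.ValidationErrors;
    import liquibase.resource.ResourceAccessor;

    /** Copies person.full_name into the new first_name/last_name columns.
        The changelog adds the new columns before invoking this class, and
        drops full_name (and re-adds constraints) afterwards. */
    public class SplitNameColumns implements CustomTaskChange {

        public void execute(Database database) throws CustomChangeException {
            try {
                JdbcConnection conn = (JdbcConnection) database.getConnection();
                Statement stmt = conn.getUnderlyingConnection().createStatement();
                // postgres-flavoured SQL, since that is the target database
                stmt.executeUpdate(
                    "UPDATE person SET first_name = split_part(full_name, ' ', 1), "
                    + "last_name = split_part(full_name, ' ', 2)");
                stmt.close();
            } catch (Exception e) {
                throw new CustomChangeException(e);
            }
        }

        public String getConfirmationMessage() {
            return "Split person.full_name into first_name and last_name";
        }

        public void setUp() throws SetupException {
        }

        public void setFileOpener(ResourceAccessor resourceAccessor) {
        }

        public ValidationErrors validate(Database database) {
            return new ValidationErrors();
        }
    }

The changelog would then reference the class from a customChange element (just custom in older versions) placed between the addColumn and dropColumn changesets.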

So, a tool with promise. I haven’t managed to use it yet, unfortunately. I’m putting all my tables into a separate postgres schema, and LiquiBase’s schema support is temporarily incomplete, which means I’d have to do a bunch of messing around to get it working for me. Once I get closer to the point of really needing it, hopefully that will be fixed, or I can temporarily copy the tables into the public schema for diffing purposes to figure out what hibernate changed.

Ideally there would be a tool I could use to take an existing changelog file and a persistence unit (persistence.xml and a bunch of annotated classes) and generate the changes to put into the LiquiBase changelog (after fixing up anything not automatically handled by the diffing tool). That way, just before I make a “release” to production, I would run that tool and it would do its magic.

I can think of a way to do that manually now – create a temporary database using the old XML, and another temporary database from the hibernate classes, and diff them. Currently my ejb3unit test cases leave behind the latest hibernate-generated database, so I just need a copy of the last released database to do my diff. Then I drop the result into my XML file, and my next deploy should upgrade the database automatically. I’d also have to turn off hibernate’s schema update and use only this scheme to manage the database used in Java EE, both in development and production, but that’s a good thing, since it means I’m testing the schema changesets during development.
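
Turning off hibernate’s schema update is just a property change in the persistence unit; a minimal sketch, assuming the usual Hibernate property (the unit name is made up):

    <persistence-unit name="myapp">
      <properties>
        <!-- was "update" before; "validate" makes hibernate check the schema
             that LiquiBase produced instead of modifying it itself -->
        <property name="hibernate.hbm2ddl.auto" value="validate"/>
      </properties>
    </persistence-unit>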

Still, this schema issue is a bit of a pain to work around, so I’ll wait a bit and see if I can get a fix for that before I spend any more time on this.


8 Responses to Database schema migration with LiquiBase

  1. Interesting tool. In my projects I ended up creating a DB-independent backup/restore tool that backs up the entire object model to a backup file and then restores it, so I can make a backup, update the software (dropping and re-creating the schema), and finally restore the data. This also brings another “bonus” – the ability to move data across database vendors easily.

  2. dobes says:

    Well, that seems like a nice approach, but it does mean that you have to perform migrations on your data dump/load schema. However, if the file format is written by hand, that’s not too hard … maybe not as hard as learning and using tools like these.

    Definitely something to think about.

  3. Yes, but we’ve used XStream (initially) to serialize the POJOs of our model to our own backup file format (coupled with TrueZIP, this gives us a solid single backup file transparently). If the object model changes in a new version of the software – for instance, gaining a couple of new fields – this is handled transparently by XStream on de-serialization. Of course, more complex model changes have to be supported in the restoration code as well, but this is not very hard to maintain, especially if the model was designed carefully to begin with (it then doesn’t require a lot of complex changes).

  4. dobes says:

    Thanks for bringing XStream to my attention, I hadn’t seen that before. It looks very, very useful to me right now.

    Ah sweet, they even have annotations! Awesome!
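
    For example, a minimal sketch of the annotation usage (the class here is invented):

        import com.thoughtworks.xstream.XStream;
        import com.thoughtworks.xstream.annotations.XStreamAlias;
        import com.thoughtworks.xstream.annotations.XStreamOmitField;

        @XStreamAlias("customer")
        public class Customer {
            String name;
            @XStreamOmitField
            int cachedHashCode; // not worth persisting

            public static void main(String[] args) {
                Customer c = new Customer();
                c.name = "Alice";
                XStream xstream = new XStream();
                xstream.processAnnotations(Customer.class); // pick up the annotations
                String xml = xstream.toXML(c);   // roughly: <customer><name>Alice</name></customer>
                Customer back = (Customer) xstream.fromXML(xml);
                System.out.println(back.name);
            }
        }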

  5. Marlin says:

    We also chose to use hibernate with annotations so, as a small team of Java developers, we could focus on getting a great product out rather than on serializing and deserializing objects. That mostly worked, though Hibernate+Spring+Tapestry had some quirks (still does). As time goes on, we add new features, we refactor, and we have existing customer data to migrate to a new release…

    To use a traditional ETL, we’d have to know more about how Hibernate maps things than we want to (and we already know too much). I didn’t find anything that would help map one object graph to another, so I created my own Java program to read objects from the old database and write them to the new database.

    To do so, the program has to have both the new and the old object definitions around, and hibernate sessions to the old and new databases. To reference the old objects, I renamed the top level package in the old version (com.foo to com.fooV230) and created a library with those classes. The migration (eclipse) project has this library and references the current code base with the new object schema, and the program that reads old objects, copies to new objects, does a little processing, and writes the new ones out to the new database. It’s useful and it’s painful. It’s useful when there is some processing and default-setting to do from one version to another. It’s painful to set up, ensuring that each and every field is copied correctly.

    But once it is set up — preferably before any model changes have been made to the target project — the migration project is kept open by each developer and each developer must make changes to it when he or she changes the object model. If we’ve made hibernate annotation changes for performance or other reasons, the target database receives them transparently. At the end of development, we do some extra testing and clean up, and pass it off to the test team.

    So far, it’s worked, but I’m keeping my eyes open for a better way, or a tool that makes it simpler (requiring less high-level engineering time). I’m interested to hear how you solved your hibernate/annotations data migration…

  6. dobes says:

    Hi Marlin,

    I’ve been using liquibase for my schema migrations with some success now.

    My schema migration process goes like this:

    1. Make changes to the hibernate annotations/schema

    2. Test those changes using ejb3unit against a unit-testing-only database I call “ejb3unit”, which is created by hibernate’s schema export feature in drop-create mode; I configure ejb3unit to use that test database via its properties file.

    3. Generate a liquibase diff from the last deployed database schema and the “ejb3unit” database.

    4. Manually inspect the diff and decide whether anything was done that requires data conversion, and if so, insert custom classes and/or SQL that does the conversion and append that to the liquibase changelog I am using.

    5. Deploy the new app to glassfish on my development machine, which will update the schema based on the new entries in the changelog, and test it manually to be sure the schema is correctly upgraded.

    I feel pretty good about this methodology; I think it is robust because it incorporates unit testing into the process, and it doesn’t require a lot of save/load work to migrate schemas.

    It does require the occasional bit of SQL hacking for a migration, since I can’t use hibernate in the data conversion classes, but most operations are just adding or removing fields.
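
    For instance, a data-conversion entry appended in step 4 might look something like this (the table, column, and id values are invented):

        <changeSet id="42" author="dobes">
          <addColumn tableName="account">
            <column name="status" type="varchar(16)"/>
          </addColumn>
          <!-- the "bit of SQL hacking": backfill the new column -->
          <sql>UPDATE account SET status = 'ACTIVE' WHERE status IS NULL</sql>
          <addNotNullConstraint tableName="account" columnName="status"
                                columnDataType="varchar(16)"/>
        </changeSet>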

  7. Alexwebmaster says:

    Hello webmaster
    I would like to share with you a link to your site
    write me here preonrelt@mail.ru

  8. Mark says:

    I recently implemented a JDBC driver wrapper with schema deployment support. The driver runs database migration scripts automatically at connect time. Currently the driver supports LiquiBase and my own internal schema manager.
    http://markfarnsworth.wordpress.com/2009/06/27/garin/
