As time goes by, applications grow more complex, accumulating more and more features. Eventually the data models and the original architecture become outdated and need to be refactored.

Refactoring a relational database is complicated, particularly when structural changes involve migrating existing data.

Here’s how my team did it and the lessons we learned.


The project at hand required us to merge the table representations of two different types of objects: booking modifications (such as date changes or passenger additions) and the current state of booking data.
Denormalization allowed us to simplify our codebase a great deal without a significant negative impact on performance. However, there were some caveats:

  • Booking data in more than 40 pairs of tables needed to be merged to new, unified tables.
  • The site had to be up and running while the migration was happening. For a top ecommerce platform used worldwide, every minute of downtime costs real money in lost bookings.
  • We had 10 years worth of data, which meant millions of rows needed to be copied over.

As we discovered exactly what we had gotten into, we came up with a plan to migrate the data safely.

The Master Plan

We had to do the refactoring while staying online the whole time:

Creating the new tables with the desired schema
This step was the easiest. We even used the opportunity to standardize naming for tables and columns.
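As a sketch of this step, here is what a merged, renamed schema might look like. The table and column names below are hypothetical (the real schema isn't shown in this post), and SQLite stands in for the production database: a legacy pair of tables with inconsistent naming is replaced by one unified table with standardized snake_case names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Hypothetical legacy pair: current booking state plus its modification
-- history, each following a different naming convention.
CREATE TABLE BookingState (
    id INTEGER PRIMARY KEY,
    passengerName TEXT,
    travelDate TEXT
);
CREATE TABLE booking_modifications (
    mod_id INTEGER PRIMARY KEY,
    booking_id INTEGER,
    field TEXT,
    new_value TEXT,
    modified_at TEXT
);

-- New unified table with standardized naming. A row holds either the
-- current state (change_seq = 0) or one recorded modification.
CREATE TABLE bookings_unified (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    booking_id INTEGER NOT NULL,
    change_seq INTEGER NOT NULL DEFAULT 0,
    passenger_name TEXT,
    travel_date TEXT,
    changed_at TEXT
);
""")
```

Standardizing names while creating the new tables costs nothing extra, since every read and write path is being touched anyway.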
Copying rows to new tables
We chose to run a background task periodically that copied data in small batches, so as to avoid putting too much pressure on the database servers.
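A minimal sketch of such a batched copy task, assuming hypothetical `bookings_old`/`bookings_new` tables and SQLite in place of the real database. The key ideas are copying a bounded batch per run, resuming from the highest id already copied (so the task is safe to re-run or interrupt), and pausing between batches to throttle load:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bookings_old (id INTEGER PRIMARY KEY, passenger_name TEXT);
CREATE TABLE bookings_new (id INTEGER PRIMARY KEY, passenger_name TEXT);
""")
conn.executemany("INSERT INTO bookings_old VALUES (?, ?)",
                 [(i, f"passenger {i}") for i in range(1, 10_001)])

def copy_batch(batch_size=1000):
    """Copy one batch of rows from the old table to the new one.

    Resumes from the highest id already present in the new table, so the
    task can be stopped and restarted at any point. Returns the number of
    rows copied (0 once the copy is complete).
    """
    last_id = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM bookings_new").fetchone()[0]
    rows = conn.execute(
        "SELECT id, passenger_name FROM bookings_old "
        "WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, batch_size)).fetchall()
    conn.executemany("INSERT INTO bookings_new VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)

# The real task ran on a schedule; here we simply loop, pausing between
# batches so the database never sees sustained copy pressure.
while copy_batch():
    time.sleep(0.01)
```

With millions of rows, the batch size and pause interval become the knobs for trading migration duration against database load.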
Changing write and read paths
Updating references to old tables in a codebase can be tiresome and, depending on your release process, a bit tricky. For us, it meant that during the code release, the app would reference both the old and new table names. We decided to use views and triggers to redirect all reads and writes to the appropriate table.
Testing, testing, testing
I cannot emphasize enough the importance of testing. We used a combination of database diffs and unit and automated tests to verify data consistency as thoroughly as possible.
Getting rid of old data
At this point, data in the old tables could be safely discarded. However, since very delicate data operations had taken place, we decided to keep it around for a while, in case we needed to recover any part of the original bookings.
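One way to keep old data around without leaving it in the application's path is to rename the legacy tables into an archive namespace rather than dropping them. A minimal sketch, with a hypothetical table name:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings_old (id INTEGER PRIMARY KEY)")

# Rename rather than drop: the data stays recoverable, but nothing in the
# application can reach it by the old name anymore. A DROP TABLE can follow
# once the new tables have proven themselves in production.
conn.execute("ALTER TABLE bookings_old RENAME TO archived_bookings_old")

names = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
```

A rename is instant and trivially reversible, which fits the "incremental changes over irreversible ones" lesson below.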

Lessons Learned

  • Things can and will go wrong in production. Prepare for faster recovery as well as fail-proofing your implementation.
  • All the logs in the world won’t matter if you don’t monitor or have any alerting system working.
  • Communication is key. Make sure all important parties are aware of your refactor, so that bugs are identified and fixed quicker.
  • Incremental changes are better than huge, irreversible ones.
  • Don’t be afraid of stopping the show. While deadlines are important, you also want to invest in thorough testing before your code touches critical data.


Database refactoring is often complicated. Each individual case is different, but I hope you now have a better idea of how to approach it. Be prepared to make hard decisions and remember that baby steps are safer.

Have you done a similar refactor before? Got ideas on how to do it better? Let me know!