As a web app in beta, phuser.com is still evolving rapidly. This means that I am often faced with the trauma of migrating a dataset from one version to another, sometimes with substantial changes in the database schema. I’ve developed various tricks for this – some are ugly but effective and involve things like sed, whilst others score more highly on the cool-o-meter. Being the nice fellow that I am I will spare you the former and show you one of the latter.
First thing first. This tip relies on PostgreSQL’s Query Rewrite system (‘Rules’). If you’re using another RDBMS you’ll need to check that it has a similar mechanism. If you’ve avoided this particular piece of Postgres goodness then you are not alone. I steered clear for as long as I could. The manual page makes it sound much trickier than it is. Actually that’s not quite true – I have got into some of my stickiest situations by doing weird stuff with query rewriting. This tip will not be one of those, I promise.
Second things second. I devised this technique myself in a state of some despair during a mammoth DB structural change. I haven’t seen it described before but for all I know it’s as old as some very old hills.
Right, let’s get cooking. This tip helps deal with the situation where a table structure has changed and you need to mangle data as it’s restored from a dump of the old db into the new. Let’s say your old DB has two related tables, and some columns have moved from one to the other in the new version:

This can be tackled using some simple query-rewriting. Here’s how:
- Create an SQL scripts called patch.sql and unpatch.sql. These should be executed before and after the main restore script, respectively. Ideally wrap the whole shooting match in a BEGIN/COMMIT block incase anything screws up.
- In ‘patch.sql’ do the following:
- For each table, write ‘ALTER TABLE’ statements that recreate any columns that are present in the old version but not the new. Don’t worry, these are temporary dummy columns. In our case we would add columns ‘FIELD_Q’ and ‘FIELD_S’ to table 2.
- Now for your rewrite rules. We’ll need a rule which modifies queries which try to INSERT on table2. Something like:
CREATE RULE remap_table_2 AS ON INSERT TO table_2 DO INSTEAD ( UPDATE table_1 SET FIELD_Q = NEW.FIELD_Q, FIELD_S = NEW.FIELD_S WHERE id = NEW.id; INSERT INTO table_2 VALUES (id, FIELD_P, FIELD_R) );
- In ‘unpatch.sql’, simply write statements that DROP the columns you temporarily added (nothing will have actually gone into them). You should also DROP the Rules too.
- There is no step 4.
Using this technique I was able to mangle data in around 15 tables of varying complexity with no drama. It doesn’t suit every migration scenario but in cases like this it’s by far the easiest and safest solution I know.
Happy migrating!





January 27, 2007 at 5:57 pm |
Good tip, thankfully someone else gets to deal with this sort of thing where I work