Data Migration and Import Examples

      3 Comments on Data Migration and Import Examples

This article is intended as an accompaniment to the podcast of the same name, it is a simple walkthrough of some examples of the techniques mentioned in the podcast. It is much easier to visualise queries and data structures if written down, so here we look at an example of each of the points mentioned. I’ve taken a classic example of an employees table which needs to be populated with the clients existing data, which arrives in a different format. Its a common scenario and the ideas illustrated here are also applicable to many other situations and types of data.

(Data generated by

Here’s our incoming data structure:

The target system looks like this:

Taking each of the incoming fields in turn, we need to:

  • id becomes employee_id and the datatype changes slightly
  • Name gets split up into first_name and last_name
  • Dept needs to get pulled out into a lookup table and the the department_id pushed back into this table
  • Join_date needs its string format parsing and turning into a lookup
  • Email and comments columns don’t need changing

Converting Data Type – Employee ID

This is pretty simple, just add a new column onto the Employees table and then populate it.

Splitting Strings – Name Field

Happily the generated data that I used has only first name and then surname, reliably separated by a space. If your incoming data is in a clean format like this then you are lucky – it can be much more complicated especially if a full name contains more than two words, deciding which name the middle word belongs to can be a nightmare. To get the name separated is awkward as mysql doesn’t have split() but I started off with this select statement:

The abbreviated output saves me waiting for a lot of records to be processed when really I just need a small sample to check my formatting is correct. This looks spot on so the select statement can be adapted to make an update statement after we’ve created our new columns, as shown below:

String Dates to Real Dates

We’ve already got a column called join_date in our Employees table so the new one needs a new name:

Again a select statement first, and then converting to an update statement:

MySQL will default the time element of the timestamp to midnight on the relevant day.

Enum (or almost-enum) data becomes Lookup Table

First of all, let’s get a grip on what values are actually in this table to start with.

Looking at this table we’ve got a few things gone awry – “LegalPayroll” should be “Legal and Payroll”, and we’ve got “Accounts” and “Accounting” plus “Quality Assurance” and “QA”, but these are actually the same department. With such a small dataset it would be as well to manually create the lookup table we need, and I’m going to use numeric indexes for mine.

Next we create the new column on the old Employees table and then populate it with the “easy” entries.

Looking at the records that didn’t get updated, we still have a few things to fix:

The Final Step

Finally, we populate the main employee table with all our newly-prepared and reformatted data. Here’s the query:

About Cal Evans

Many moons ago, at the tender age of 14, Cal touched his first computer. (We're using the term "computer" loosely here, it was a TRS-80 Model 1) Since then his life has never been the same. He graduated from TRS-80s to Commodores and eventually to IBM PCs.   For the past 10 years, Cal has worked with PHP and MySQL on Linux OSX, and when necessary, Windows. He has built on a variety of projects ranging in size from simple web pages to multi-million dollar web applications. When not banging his head on his monitor, attempting a blood sacrifice to get a particular piece of code working, he enjoys building and managing development teams using his widely imitated but never patented management style of "management by wandering around". Cal is happily married to wife 1.33, the lovely and talented Kathy. Together they have 2 kids who were both bright enough not to pursue a career in IT. Cal blogs at and is the founder and host of Nomad PHP