Automatic Data Migration Testing: Empowering Testers with Hudson


There have been many good examples of using Hudson for cross-platform builds and automatic execution of tests, but Hudson also provides a great environment for empowering non-developers to execute particular tests whenever they want. We have found this to be particularly the case when automating data migration tests.

This article discusses what automatic data migration testing is, and how Hudson can make it easier. Whilst it refers to Hudson, the same techniques could also be used with Jenkins.

About automated data migration testing

Data migration testing is usually complicated and time-consuming, because every migrated record must be checked for accuracy between the source database and the destination database. Testers must not only identify how the migrated records are mapped from the source to the destination, but also check the values of those records.

The complexity of data migration testing increases when the testing must cover a huge amount of data and the mapping from the source to the destination is complicated.

In order to reduce the complexity of data migration testing and achieve better testing performance, it becomes necessary to apply automatic verification. In some cases manual testing can be completely superseded by automatic testing.

A simple example

The following diagram illustrates a simple method for automating data migration testing that we have taken in the past:

We see that testers only need to provide test inputs, each of which specifies:

  • the table(s) that the data is being sourced from
  • the table(s) that the data is being sent to
  • how the migrated data is mapped from the source to the destination

Inputs can be things like custom JMeter test scripts or Excel spreadsheets.

Once the test inputs have been prepared, an automatic data migration testing program can load them and perform checking on the data from the source database to the target database.
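As a rough illustration of that checking step, here is a minimal sketch in Python. It is not the program we used: the test input is a plain dictionary rather than a JMeter script or spreadsheet, the databases are in-memory SQLite, and every table and column name is hypothetical.

```python
import sqlite3

# Hypothetical test input: all table and column names are illustrative only.
TEST_INPUT = {
    "source_table": "customers_old",
    "target_table": "customers_new",
    "key": ("id", "customer_id"),                       # (source key, target key)
    "columns": {"full_name": "name", "mail": "email"},  # source column -> target column
}


def verify_migration(source, target, spec):
    """Return a list of mismatch descriptions; an empty list means the check passed."""
    src_key, dst_key = spec["key"]
    src_cols = list(spec["columns"])
    dst_cols = [spec["columns"][col] for col in src_cols]

    src_sql = (f"SELECT {src_key}, {', '.join(src_cols)} "
               f"FROM {spec['source_table']} ORDER BY {src_key}")
    dst_sql = f"SELECT {dst_key}, {', '.join(dst_cols)} FROM {spec['target_table']}"

    # Index the migrated rows by key so each source row can be looked up directly.
    migrated = {row[0]: row[1:] for row in target.execute(dst_sql)}

    problems = []
    for row in source.execute(src_sql):
        key, expected = row[0], row[1:]
        if key not in migrated:
            problems.append(f"missing record {key}")
            continue
        for src_col, want, got in zip(src_cols, expected, migrated[key]):
            if want != got:
                problems.append(f"record {key}: {src_col}={want!r} migrated as {got!r}")
    return problems


def demo():
    """Build two tiny in-memory databases and run the check against them."""
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customers_old (id INTEGER, full_name TEXT, mail TEXT)")
    target.execute("CREATE TABLE customers_new (customer_id INTEGER, name TEXT, email TEXT)")
    source.executemany(
        "INSERT INTO customers_old VALUES (?, ?, ?)",
        [(1, "Ann", "ann@example.com"), (2, "Bob", "bob@example.com"), (3, "Cy", "cy@example.com")],
    )
    target.executemany(
        "INSERT INTO customers_new VALUES (?, ?, ?)",
        [(1, "Ann", "ann@example.com"), (2, "Bob", "wrong@example.com")],
    )
    return verify_migration(source, target, TEST_INPUT)


if __name__ == "__main__":
    for problem in demo():
        print(problem)
```

The sketch only covers one-to-one column mappings. Mappings that transform values would need a rule per column, and very large tables would need to be compared in batches rather than loaded into a dictionary.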

After the testing is completed, the program generates the test reports, in either XML or HTML format.
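To make the reporting step concrete, the following sketch writes the outcome of a check as XML using Python's standard library. The element and attribute names are assumptions made for illustration; they do not reflect the format of any particular tool.

```python
import xml.etree.ElementTree as ET


def build_report(test_input, environment, problems):
    """Serialise a test outcome as XML. Element and attribute names are hypothetical."""
    root = ET.Element(
        "migration-test-report",
        {
            "input": test_input,
            "environment": environment,
            "status": "failed" if problems else "passed",
        },
    )
    # One child element per mismatch found by the verification step.
    for problem in problems:
        ET.SubElement(root, "mismatch").text = problem
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_report("test_input1.xlsx", "Source Database 3", ["missing record 3"]))
```

Writing the returned string to a file in the build workspace is enough for a build server to pick it up later as an artifact.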

Limitations of a stand-alone approach

Whilst it provides a good baseline to work from, the aforementioned approach is not particularly user-friendly:

  1. Test programs are usually executed from the command-line, meaning each test execution needs to be triggered on a separate terminal. Therefore, multiple terminals will be required to execute a number of different tests if test output for each test execution needs to be monitored and kept.
  2. Testers need to tell the program which test is to be executed, and the environment that the test is to be executed against. The execution can become quite complex when there are a lot of potential tests to choose from, or a number of different environments.
  3. During a data migration, the data mapping rules can change frequently, which requires testers to edit their existing test inputs accordingly, or to add new test inputs.

Fortunately, there is a tool that helps with some of these problems: the Hudson continuous integration server.

How Hudson helps

By using Hudson to run tests, users get an out-of-the-box interface to trigger tests, monitor test progress, and track test results. Specifically:

  1. Testers do not have to execute tests from the command-line
  2. Testers can specify parameters and easily pass them to test programs; for example: the test input to be executed and the environment that the test is to be executed against
  3. Multiple test executions can be triggered simultaneously
  4. Test reports and console output are well maintained
  5. Test inputs can be managed and fetched from a central repository
  6. Specific descriptions can be generated for each test execution, which provides better maintenance and tracking
  7. Each tester may log onto Hudson with a specific user name and password, which helps identify who ran each test

Setting it up

So how does one go about setting up such a project? Firstly, a Hudson project needs to be created. The project can be created by simply clicking on “New Job” on the Hudson homepage, and providing the project name and project type:

Click the “OK” button to save the project, and the project will be shown on the Hudson homepage:

We can then click through to the project, and click the ‘Configure’ link. In order to allow testers to specify the test to be run and the database environment the test is to be run against, the project needs to have two input parameters defined for it:

Now let’s specify what the build actually executes. Normally, this would be a complex Ant script. For simplicity, we’ll just specify a dummy shell script:

In this case, the script will:

  1. Check if the specified input test file exists. If the input test file does not exist, then it will exit the current build job gracefully; otherwise it will proceed with the build job.
  2. Check if the test reports have been generated successfully for the current build job.
  3. At the end of each build job, a custom build description will be set which will clearly tell which test was executed on which test environment – for example “Execution of test_input1.xlsx on Source Database 2”.
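The three steps above can be sketched as follows. The original build step was a shell script; this is the same logic in Python, under several assumptions: the two build parameters are named `TEST_INPUT` and `TEST_ENVIRONMENT` (Hudson passes build parameters to the build as environment variables), test inputs live in an `inputs` directory of the workspace, the report is written to `reports/report.xml`, and "exiting gracefully" means stopping with exit code 0. The description is only printed to the console here, on the assumption that a plugin or post-build step is configured to turn that line into the build description.

```python
import os


def default_run_tests(input_path, environment, report_path):
    """Stand-in for the real test program: writes a placeholder report."""
    os.makedirs(os.path.dirname(report_path), exist_ok=True)
    with open(report_path, "w") as report:
        report.write("<report/>\n")


def run_build_step(env, workspace, run_tests=default_run_tests):
    """Run one test execution and return the exit code for the build step."""
    # Parameter names are assumptions; use whatever the job configuration defines.
    test_input = env.get("TEST_INPUT", "")
    environment = env.get("TEST_ENVIRONMENT", "")
    input_path = os.path.join(workspace, "inputs", test_input)

    # Step 1: stop cleanly if the requested test input does not exist.
    if not test_input or not os.path.isfile(input_path):
        print(f"Test input not found: {input_path}; nothing to run.")
        return 0  # assumption: change to non-zero to mark the build as failed

    report_path = os.path.join(workspace, "reports", "report.xml")
    run_tests(input_path, environment, report_path)

    # Step 2: fail the build if the test program produced no report.
    if not os.path.isfile(report_path):
        print("Test report was not generated.")
        return 1

    # Step 3: emit the line that is meant to become the build description.
    print(f"DESCRIPTION: Execution of {test_input} on {environment}")
    return 0
```

Inside a build step, the function would be called with `os.environ` and the workspace directory (Hudson sets the `WORKSPACE` environment variable), and its return value passed to `sys.exit`.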

Finally, as test reports need to be kept for tracking purposes, we define them as build artifacts:

Taking it for a spin

To run this build:

  1. Click ‘Build Now’ for the ‘Automatic Data Migration Testing’ project
  2. Select a specific test input to run:
  3. Select the test environment to execute the test input against:
  4. Click ‘Build’ to trigger the build to start executing the test input.
  5. Track test execution progress by accessing the build console output:

Once the build job is completed, testers can:

  • Check the test reports by accessing the build artifacts:
  • Use the custom build description to keep track of what this build did:
    In this case, the test input file is “test_input1.xlsx” and the test environment is “Source Database 3”; therefore the build description will be set as “Execution of test_input1.xlsx on Source Database 3”.
  • Review the build console output


We’ve seen how Hudson is a great environment for empowering testers to run their own data migration tests. It does this by:

  • Managing test inputs and environments
  • Executing single or multiple tests
  • Tracking test executions
  • Storing test reports

It’s a tribute to the original Hudson developers that it can be used in new and interesting ways.
