With so many ingredients involved in the data migration process, it’s no wonder that so many projects run over time, go over budget and sometimes stall altogether. Gartner reported that as many as 83% of data migration projects exceed their deadline and/or budget, or fail completely [1]. Not the kind of statistic you want to read when you are considering a migration project, but don’t worry, it’s not all doom and gloom.
At PhixFlow, we have helped many customers who approached us whilst struggling with data migration projects, so we thought we would share some of our knowledge to help you get your own data migration projects off to a great start.
Every data migration is different, with varying numbers of systems and volumes of data to be consolidated and moved. No matter the reason behind your data migration, or its complexity, there are a few basics that you should consider to make the process as smooth as possible.
Whilst all data migrations contain three basic stages, Extract, Transform and Load (ETL), it is important that you have a solid migration plan and process to follow to ensure project success. We have identified six key steps in the data migration process as set out below.
1. Identify sources of data
Before any migration starts, it is important to understand the legacy data. More specifically, you need to understand what data is needed, where the data is, what format it is in and how it will be accessed.
For example, in more complex migrations there may be multiple sources of data: some held in databases, some in Excel spreadsheets and some in other systems. Knowing what data is where, and how to access it, is an important first step in the migration process.
If you are using a platform, such as PhixFlow, you can avoid being too detailed at this stage as the iterative process means that you will be able to identify any gaps in data and adjust your data queries accordingly as the project gathers momentum.
2. Identify the filtering rules
Quite often you will find that there is a lot of legacy data that isn’t required, so it’s important to identify this data early on as it can speed up processing times when you are running simulations.
For example, when migrating data from a financial system it may be decided that old purchase orders are no longer required. That sounds sensible, but some companies create ‘open’ purchase orders that remain active for many years. A filter based purely on creation date would exclude these open purchase orders, even though they are still required.
There may also be compliance issues that need considering. In some instances, companies are required to keep records that date back for a certain amount of time.
The earlier you can identify the data you need, and the data you do not need, the better. By eliminating the data that is not required you will be able to process test data much faster and speed up the process of identifying errors.
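The purchase order example above can be sketched as a simple filtering rule. This is a minimal illustration rather than a definitive implementation: the field names (`id`, `created`, `status`), the sample records and the cut-off date are all assumptions made for the example.

```python
from datetime import date

# Hypothetical legacy purchase orders; field names are illustrative only.
purchase_orders = [
    {"id": "PO-001", "created": date(2012, 3, 1), "status": "closed"},
    {"id": "PO-002", "created": date(2013, 6, 15), "status": "open"},
    {"id": "PO-003", "created": date(2023, 1, 10), "status": "closed"},
]

CUTOFF = date(2018, 1, 1)  # assumed cut-off; compliance rules may require an earlier date


def is_required(po, cutoff=CUTOFF):
    """Keep recent orders, plus any order that is still open regardless of its age."""
    return po["created"] >= cutoff or po["status"] == "open"


to_migrate = [po for po in purchase_orders if is_required(po)]
filtered_out = [po for po in purchase_orders if not is_required(po)]

print("Migrate:", [po["id"] for po in to_migrate])
print("Filtered out:", [po["id"] for po in filtered_out])
```

Keeping the list of filtered-out records, rather than simply discarding them, makes it easier to account for every source record later in the project.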
3. Identify the target objects and transformations
Obviously, no data migration is complete without the ‘Load’ phase, i.e. where the data is transferred into your new system. This can only happen if you know where your consolidated and enhanced data will end up.
Create a list of target objects and then load them into your migration tool. This will allow you to use some sample data to run a test migration and see what errors occur. Using this error list will help you to identify areas that are missing data or show you which data cannot be mapped to the target objects fully.
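As a rough sketch of the idea, the snippet below checks a few sample records against a target object definition and collects an error list. The `TARGET_CUSTOMER` definition, its fields and the sample records are hypothetical; a migration tool would hold the equivalent information in its own format.

```python
# Hypothetical target object: field name -> (expected type, required?)
TARGET_CUSTOMER = {
    "customer_id": (str, True),
    "name": (str, True),
    "country": (str, True),
    "credit_limit": (float, False),
}


def check_record(record, target):
    """Return the reasons why a record cannot be loaded into the target as-is."""
    errors = []
    for field, (expected_type, required) in target.items():
        value = record.get(field)
        if value is None:
            if required:
                errors.append(f"missing required field '{field}'")
        elif not isinstance(value, expected_type):
            errors.append(f"'{field}' should be {expected_type.__name__}")
    # Source fields with nowhere to go in the target are also worth reporting.
    for field in record:
        if field not in target:
            errors.append(f"no target mapping for '{field}'")
    return errors


sample = [
    {"customer_id": "C1", "name": "Acme Ltd", "country": "UK", "credit_limit": 5000.0},
    {"customer_id": "C2", "name": "Bolt plc", "fax": "01234 567890"},
]

error_list = {r["customer_id"]: check_record(r, TARGET_CUSTOMER) for r in sample}
for customer_id, errors in error_list.items():
    print(customer_id, errors)
```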
4. Identify data that needs cleansing
OK, so you’ve got this far. Now that you have identified what data you need, where you are going to get it from and where it is going to be transferred to, you will need to look for data that needs cleansing or enhancing.
Whilst performing data migrations for our customers we have seen many examples where the original data is not suitable for the target system. One such example relates to the migration of purchase orders to a new financial system. In the legacy system the line items were manually typed in without the need for item codes, but the new system required each item to have an item code. To solve a problem like this you will need to decide what to do with these purchase orders.
In another example we saw that the legacy system allowed the manual input for the country on a customer record, which meant that there were multiple spellings of the same country. The target system used dropdown selectors for the country, which meant that the data failed during the migration.
To help resolve issues like the above, it’s very useful to have a tool that enables “fuzzy logic” rules to be added iteratively to cleanse your data before the data transformation steps in your migration.
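To illustrate the country example, here is a minimal sketch of a fuzzy matching rule using `difflib` from the Python standard library. It is not how any particular migration tool implements fuzzy logic; the list of target countries, the alias table and the 0.8 similarity cut-off are assumptions chosen for the example.

```python
import difflib

# Assumed list of values accepted by the target system's dropdown.
TARGET_COUNTRIES = ["United Kingdom", "United States", "Germany", "France"]

# Explicit aliases that similarity matching alone would not catch.
ALIASES = {"uk": "United Kingdom", "usa": "United States"}


def cleanse_country(raw, cutoff=0.8):
    """Map a free-text country to a target value, or None if no confident match."""
    text = raw.strip().lower()
    if text in ALIASES:
        return ALIASES[text]
    lookup = {country.lower(): country for country in TARGET_COUNTRIES}
    matches = difflib.get_close_matches(text, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


for value in ["United Kingdon", " germny", "UK", "Atlantis"]:
    print(repr(value), "->", cleanse_country(value))
```

Values that return `None` are the ones to review by hand; each review typically suggests another alias or rule to add on the next iteration.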
As mentioned previously, you don’t need to create an exhaustive list of data cleansing requirements at the very start of the project, as the iterative testing process will inevitably uncover more data that needs attention. It’s still a good idea to document the requirements that you think of in the first instance, as you can build these into your initial workflow.
5. Identify business rules and data validation requirements
Now, your shiny new system may require some new business rules and validation before you can successfully migrate your data, but what does this mean? What are business rules and validation rules?
Business rules are the criteria that your business will set for certain scenarios. For example, it may be imperative that all customer records must have both a mailing address and billing address.
Validation is what needs to happen with the data to ensure that it is correct. A good example of this is when transferring bank account details for suppliers. Your new finance system may specify that the bank account number and sort code need validating. This will cause problems if the information held in legacy systems is outdated or has been typed in incorrectly, meaning that the new system will reject the data on import.
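The difference between the two can be sketched in a few lines. The example below assumes UK-style bank details (a six-digit sort code and an eight-digit account number) and checks the format only; it does not confirm that an account actually exists. The field names are hypothetical.

```python
import re

# Format-only checks, assuming UK-style bank details.
SORT_CODE = re.compile(r"\d{2}-?\d{2}-?\d{2}")
ACCOUNT_NUMBER = re.compile(r"\d{8}")


def validate_supplier(supplier):
    """Validation rule: bank details must be in the expected format."""
    errors = []
    if not SORT_CODE.fullmatch(supplier.get("sort_code") or ""):
        errors.append("sort code is not six digits")
    if not ACCOUNT_NUMBER.fullmatch(supplier.get("account_number") or ""):
        errors.append("account number is not eight digits")
    return errors


def check_customer_rules(customer):
    """Business rule: every customer needs a mailing and a billing address."""
    errors = []
    for field in ("mailing_address", "billing_address"):
        if not (customer.get(field) or "").strip():
            errors.append(f"{field} is missing")
    return errors


print(validate_supplier({"sort_code": "12-34-56", "account_number": "1234567"}))
print(check_customer_rules({"mailing_address": "1 High Street", "billing_address": ""}))
```

Running checks like these before the load step means the problems surface in your own error list, rather than as rejections from the new system on import.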
6. Rapid iterative testing
The quickest way to identify errors in your data migration project is to test, test and test again! When you have vast amounts of data to migrate, testing the migration with a full set of data will take more time to process, so it is a good idea to start with smaller ‘chunks’ of data.
We would recommend that you start with about 10% of the data that needs migrating. Why such a small amount, you may ask? What we have found is that running a sample size of 10% will find approximately 80% of the data issues and rules that will need addressing. This smaller amount of data is much faster to process which is especially useful when you make some changes to the rules and wish to run the sample for a second time.
Once you have ironed out the issues with this smaller sample, you can move to a larger sample size, say 20% of the data. By gradually increasing the amount of data that you process at each stage, you minimise the delays associated with processing large amounts of data, whilst also keeping the number of corrections at a manageable level.
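The sampling approach described above can be sketched as follows. The `run_test_migration` function is a placeholder standing in for a real test run, and the generated `legacy` records are made up for the example. The fixed seed is a design choice: it means the same sample is selected each time, so a re-run after changing a rule is a like-for-like comparison.

```python
import random


def take_sample(records, fraction, seed=42):
    """Return a reproducible random sample covering `fraction` of the records."""
    size = max(1, round(len(records) * fraction))
    return random.Random(seed).sample(records, size)


def run_test_migration(records):
    """Placeholder for a real test run; returns the records that failed."""
    return [r for r in records if r.get("country") is None]


# Made-up legacy data in which every 25th record has no country.
legacy = [{"id": i, "country": None if i % 25 == 0 else "UK"} for i in range(1000)]

for fraction in (0.10, 0.20, 1.00):
    sample = take_sample(legacy, fraction)
    failures = run_test_migration(sample)
    print(f"{fraction:.0%} sample: {len(sample)} records, {len(failures)} failures")
```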
The final stage before pressing the big red button to migrate all your data is to run the migration through a final reconciliation test. This step allows you to double check that the expected amount of data arrives in the destination system. For example, you could check that the correct number of customers and suppliers has been transferred.
Reconciliations can verify basic properties e.g. that the number of accounts migrated equals the number of source accounts minus any accounts that have been deliberately filtered out at each stage of the migration. We prefer more detailed, item-by-item reconciliations with drilldowns to mismatches and the lists of items not transferred (with reasons given). This approach makes it much easier to gain business confidence in the process and enables a quicker signoff at go-live.
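A minimal sketch of both styles of reconciliation is shown below: a basic count check, plus an item-by-item comparison that lists mismatches and the items not transferred along with their reasons. The function name, the report keys and the sample account ids are all hypothetical.

```python
def reconcile(source_ids, filtered_out, target_ids):
    """Compare source and target item by item.

    `filtered_out` maps each deliberately excluded id to the reason it was excluded.
    """
    source, target = set(source_ids), set(target_ids)
    expected = source - set(filtered_out)
    return {
        # Basic check: target count equals source count minus filtered items.
        "count_matches": len(target) == len(expected),
        # Detailed checks: exactly which items are missing or unexpected.
        "missing": sorted(expected - target),
        "unexpected": sorted(target - expected),
        "not_transferred": dict(filtered_out),
    }


report = reconcile(
    source_ids=["A1", "A2", "A3", "A4"],
    filtered_out={"A4": "closed before the cut-off date"},
    target_ids=["A1", "A2"],
)
for key, value in report.items():
    print(key, value)
```

Note that a count check on its own can pass even when the wrong items were transferred, which is one reason the item-by-item view tends to give the business more confidence.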
Very rarely is a data migration a ‘just’ job, where data is simply taken from system A and placed in system B. More often, there will be multiple issues with data quality, validation rules and logic that has to be built in to make the transfer as smooth as possible.
The best way to handle a data migration is to automate as much as possible. Using a data migration tool such as PhixFlow will allow you to approach the migration in an iterative manner, resolving data issues and validation rules along the way, which is more efficient than trying to process all of your data in one attempt.
For more information on how PhixFlow can assist you in preparing and conducting a data migration project, please contact us.