When your customer decides that his legacy CRM system is junk, and he has to move all business data to Microsoft Dynamics CRM with your help, he doesn’t expect “junk in - junk out” effect.
What are the critical success factors and the things which need special control during migration of business data to Dynamics CRM?
See further – most of them are simple but often overlooked. And missing it may lead to stopping the migration, taking “ad hack” solutions, and making the migrated data even more junk than the original ones.
Good detailed plan
As for any activity with many details and dependencies, make detailed checklist of actions to perform and things to control on preprocessing, migration and postprocessing. Use top-to-bottom approach to plan actions: start with the most general and then bring more details to each level. Identify dependencies and put the actions in correct order.
Preparation on staging environment
Use staging Dynamics CRM environment to prepare the migration. When you (and your customer) are sure that all data are correctly migrated to the staging environment, do the migration to live. Backup & restore of the CRM organization database is not always an option: the restored CRM environment may not start at all. Backup & restore is not applicable at all if staging and live environments (and consequently Active Directory user accounts) are located in different domains.
Keep the customizations of your staging environment identical to live as much as possible. Do not make parallel customizations in live and staging UI, even if they look identical. After such changes, further export and import of customizations between live and staging may become impossible.
No orphan records
In legacy CRM applications, records may have blank or invalid ID’s of related records. This can crash the software you use (for example, upgrade of ACT! database from an older format before migrating to MS CRM).
If possible, identify and fix or delete orphan records at early stage. In old ACT! and GoldMine databases you can do it on back end in dBase files.
Permissions
Migration tools and SDK’s may require that all users, active and historical, have permissions to create, append, append to, share,… If so, before the migration create all active and historical users in Active Directory and CRM and assign System Administrator role to them. After the migration secure CRM by removal of System Administrator role. Then you can assign all records owned by historical users to actual users, and only then disable historical user accounts.
Control size of fields
Control that all field values fit into size limits of CRM fields. Update size limits in CRM if needed. But keep in mind that updating sizes of standard fields (like address1_line1) may not solve the problem. Instead of size limit error, the import job can produce an unspecified server error.
Long values particularly happen in address fields. Consider splitting the address over several fields (address1_name, address1_line1, address1_line2 and address1_line3,…).
If you use standard CRM import, you can still save the refused records “on the fly”, split the long values, and repeat the import for refused records.

No special characters in field values
(like line end characters with ASCII codes 10 and 13)
Get rid of them before the import. Otherwise, sooner or later, they will make trouble. For example, if the data are imported from CSV files via standard CRM import, line breaks can break a record line into several, even if text qualifiers are used.
Notepad++ is a good tool to identify them in text files (e. g. CSV).
Control business required values
Control that all business required fields are 100% populated.
Control unexpected intelligence of your tools
If you export data from Excel , or use Excel to process the data, don’t be surprised with telephone numbers like 8.61E+12 or -166.0405405, while the original numbers are 8612345678910 and +32-9/222-99-99. The first one is a rounded “approximate” phone number in scientific format (8.61 x 1012), and the second is a “calculated” phone number (have you ever dialed an approximate negative fractional phone?). Excel doesn’t have cell format for telephone numbers and treats such input as a big numeral in the first case, or as a valid formula in the second case.
Solution: convert such values to text by Excel function TEXT.
Person’s names
Some systems don’t explicitly split contacts’ given names and family names. Foresee the rules which defines given names and family names in Dynamics CRM as adequate as possible.
Take into account that:
-
both given names and family names may be multiple; (e. g. Juan Luis Gonzalez Fernandez)
-
family names may have prepositions, sometimes multiple (e. g. Van Den Bogerd or de la Croix), which have to stay with the names;
-
names may be accompanied by salutations, titles, military ranks in various languages, sometimes multiple, (e. g. Prof. Dr. Ing. Von Hochenbalken) which have to be separated and saved to a different field;
-
delimiting of the above parts may be inconsistent (space, dot with space, just dot);
-
even if you would have a comprehensive dictionary of given names, in some cultures it’s different to distinct family names from given names; (for example Roger Vincent, Laurence Olivier, Marcu Matei)
-
the field lastname always has to get a non-empty value, since it is business required.
A possible action plan is:
-
prepare the dictionary of “normalized” prepositions and the dictionary of “normalized” titles.
-
prepare the script with the functions which identify and split the full name to its parts based on the most typical name format (e. g. [Titles ]Firstname Lastname). This is possible even in Excel with usage of VBA.
Correctly treat “All in one” legacy contact records
Many popular contact management applications (Microsoft Outlook among them) use single record for account and contact. Other systems (like ACT! by Sage) allow limited number of multiple contact persons in one database record. The data model of Dynamics CRM is different, it uses single record per account and single record per contact. If you migrate the data from such sources to the CRM, you have to split it to two Dynamics CRM records and later to assign Parent Customer and Primary Contact relationships between them.
For the original records create unique ID’s and migrate these ID’s to both account and contact records. If the data come from multiple sources, indicate the data source in the ID. Upon importing accounts and contacts, match them using unique ID’s.
One account may have more than one contact. If you want to eliminate duplicate accounts, at the stage of preprocessing define primary records for obsolete accounts (if the data of accounts are identical, or you have no preferences for the primary account, you can do it automatically in Excel using formulas), and then map contacts to primary accounts. Such updates of CRM records and many other handy thongs may be done by the CRM Bulk Update and Export tool available on Codeplex.
Complete this mapping before the users start manual hunting for duplicates.
Find the best balance for duplicate detection and merge
The right moment to merge duplicates (preprocessing, migration, or postprocessing) depends on how clean the data are. The customer may have an idea that source A is cleaner than source B. If so, import source A first and non-duplicate data from source B next.
In principle, the source B can be imported first, followed by source A and update of the data with available values from the source A. Be careful with updates “per field” (like “take the value form better source if not blank, otherwise take the value from the worse source”). For example, a record from the worse source may have data in two address lines (address1_line1 and address1_line2), while the corresponding record form the better source has only address1_line1. As a result, this rule may give a mixture of lines taken from two different addresses.
If after all the customer realizes that he needs company name from one source, e-mail address from another, and VAT code from the third (like a young lady form a classical comedy who searched a groom with nose of Mr. A., ears of Mr. B, and incomes of Mr. C.). If so, either let the customer to clean the data before the migration, or migrate all data, double and triple, to the CRM. CRM allows you to identify duplicates and to merge two records into one with specific selection of sections, fields from each merged records, or selection of the whole record.


Out-of-the-box duplicate detection in CRM is more than basic, but you can enhance it by using calculated match names. Normalize account or contact names by:
-
converting them to lower case and trimming;
-
converting letters with diacritics to standard Latin letters (e.g. Maïté to maite);
-
removal of texts inside parentheses, like alternative spellings or comments;
-
removal of punctuation marks and special characters (at this stage better to replace punctuation marks with spaces, since they delimit single words);
-
removal of common abbreviations, like English Ltd, German GmbH, or French SPRL;
-
removal of spaces.
The order of the above actions is important.
This matching method was compared with six more sophisticated metrics used to find duplicates (Jaro-Winkler, Jaro, Levenstein, Monge-Elkan, Needleman-Wunch, Smith-Waterman-Gotoh) on the migrated business data. In 95% percent of cases it gave the same or better matches as any other method, and the results, even wrong, were much more obvious for the users. The failures of this method normally had words in different order, like “University of Antwerpen” vs. “Antwerpen University”.
A match name plugin used for duplicate detection can be found on Codeplex.
International challenges
Control that the data prepared to import comply with your language and regional settings:
For specific types of values in activity records third party migration tools accept only dates in English (US) format. This can make problems if in client's system the regional settings of Windows or CRM are different. If it happens, change system and CRM data formats to English US, and convert all dates in input data to this format.
Control how your migration tool treats these characters. If you use bulk update tools, verify that their input data have Unicode format.
In principle each record has to be in the language you want to communicate to the customer. But this makes use names of multilingual countries (like Switzerland or Belgium) inconsistent. As a result, validation scripts for country specific formats (like ones for telephone numbers) become more difficult. Instead, you can normalize country names to standard English names. Keep in mind that even if you use ISO standard, it’s not always a perfect reference. Say, the name “Taiwan, Province of China” advised by ISO 3166-1 is an insult for Taiwanese people.
Dynamics CRM Import.
If speech synthesis is comparable with getting toothpaste out of a tube, speech recognition is like trying to get the toothpaste back in. (Dr. J. Holmes, expert in speech processing)
The same is absolutely valid for Dynamics CRM export and import.
Nevertheless, out-of-the-box import can be used to upload contacts from Outlook contact lists exported to Excel. Users can import the data to the CRM by building a data map between the data table in CSV format and the CRM entity attributes, and then by running an import job. Of course, you have to control that
-
all records being imported have all business required values;
-
there are no duplicate records within the import list and between the import list and the CRM (active and inactive);
-
the CSV files have to be really COMMA-separated, since other characters (tab or semicolon) are not accepted in sample files used in manual mapping.
And even after it the import will not help with lookup and picklist attributes, because the export has their values, while the import requires their ID's (GUID's for lookup values, and numeric ID's for picklist values).
There is still a way to import the entities with lookup and picklist attributes. (thanks to CRM guru Jim Steiger) Be careful when following it, since any deviation will stop the process.
Good log
If you run scripts which change users’ data (for example, normalizes telephone numbers), foresee that the script informs your customers about:
o what type of record
o what record
o old value
o new value
-
all suspicious data in fields which are left unchanged,
-
all errors happened during processing, with specific error messages,
-
nothing else (like changing assignment of the old value back to the field).
Since log may get long, foresee easy conversion of the log to spreadsheet or database, which will help to filter by specific type of issue, group records by owner, etc.
Good luck!