CRM migrations are one of the most common causes of a permanently messy database. You spend weeks planning the migration, and on day one the new CRM already has duplicate accounts because "Acme Corp" and "Acme Corporation" both came over as separate records — along with every other formatting inconsistency that built up over years in the old system.
The old CRM had its own deduplication logic. The new one has different rules. Neither catches name variants. And by the time you notice, the duplicates are already associated with deals, activities, and email history that make merging them painful.
The fix is doing the cleaning work before migration, not after.
Why Migrations Create More Duplicates Than Routine Imports
A routine import might bring in 500 new contacts. A CRM migration brings in everything — years of accumulated records from multiple sources, multiple teams, and multiple data entry habits. That means:
- Contacts entered manually by different reps with different conventions
- Companies imported from data vendors with their own naming formats
- Records created automatically by integrations that never normalized names
- Contacts added via form fills, trade shows, and manual CSV imports over years
- The same company entered 50 times as "Microsoft", "Microsoft Corp", "Microsoft Corporation", "MSFT", and "Microsoft Inc."
None of these were a problem in the old CRM because nobody was looking for them. Moving to a new CRM is the first time they're all in one place being examined — and the new CRM's deduplication rules will miss most of them.
Step 1: Export Everything and Audit First
Before cleaning, understand what you're working with.
Export your full contact and company database from the old CRM. Open it and look at the company name column specifically. Sort alphabetically and scan for variations of the same company. In most databases that are a few years old, you'll find the same companies appearing dozens of times with slightly different names.
This isn't about fixing it manually — it's about understanding the scale of the problem before you decide how to tackle it. A database with 5,000 records and 200 duplicate company groups needs a different approach than 50,000 records with 2,000 duplicate groups.
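A rough count is usually enough to size the problem. The sketch below assumes the export is a CSV with a column named `company` (a hypothetical name; use whatever your old CRM calls it). It groups rows by a crude key, the lowercased name with everything except letters and digits removed, and reports the groups that contain more than one spelling. Because it only catches casing and punctuation differences, treat the result as a lower bound on the real number of duplicate groups.

```python
import csv
import io
import re
from collections import defaultdict

def audit_company_groups(rows, column="company"):
    """Group rows by a crude key; keep groups with more than one distinct spelling."""
    groups = defaultdict(set)
    for row in rows:
        name = (row.get(column) or "").strip()
        if not name:
            continue
        # Crude key: lowercase, letters and digits only.
        key = re.sub(r"[^a-z0-9]", "", name.lower())
        groups[key].add(name)
    return {k: sorted(v) for k, v in groups.items() if len(v) > 1}

# Tiny in-memory stand-in for a real export file.
sample = io.StringIO("company\nAcme Corp\nACME CORP.\nacme corp\nGlobex\n")
suspects = audit_company_groups(csv.DictReader(sample))
print(len(suspects), "suspect group(s)")
for key, spellings in suspects.items():
    print(key, "->", spellings)
```

For a real export, replace `sample` with an open file handle, for example `open("export.csv", newline="", encoding="utf-8")`.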
Step 2: Normalize Company Names
The highest-leverage single step in any migration cleanup is normalizing company names before deduplicating. Two records that look different often become identical once the noise is removed.
What to normalize:
- Business entity suffixes: strip or standardize Inc., Corp., Corporation, LLC, Ltd., GmbH, PLC, Limited, Incorporated. "Acme Corp" and "Acme Corporation" both become "Acme" and are caught as exact duplicates.
- Capitalization: lowercase everything for comparison purposes
- Punctuation: remove periods and commas, and either remove ampersands or normalize "&" to "and"
- Common abbreviations: "Intl" → "International", "Co" → "Company" (or just strip these)
You can do this with spreadsheet formulas for simple cases. For larger exports with many variations, a dedicated tool handles this automatically as a preprocessing step.
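If you prefer a script to spreadsheet formulas, the rules above fit in a few lines. This is a minimal sketch, not the logic of any particular tool: the suffix and abbreviation lists are illustrative assumptions and will need extending for your own data. It produces a comparison key only; keep the original name in its own column for the import.

```python
import re

# Illustrative lists (assumptions), not exhaustive.
SUFFIXES = {
    "inc", "incorporated", "corp", "corporation", "llc",
    "ltd", "limited", "gmbh", "plc", "co", "company",
}
ABBREVIATIONS = {"intl": "international"}

def normalize_company(name):
    """Return a lowercase comparison key for a company name."""
    text = name.lower().replace("&", " and ")
    text = re.sub(r"[^\w\s]", "", text)  # drop periods, commas, etc.
    tokens = [ABBREVIATIONS.get(t, t) for t in text.split()]
    # Strip trailing entity suffixes, but never strip the name down to nothing.
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)

for raw in ["Acme Corp", "Acme Corporation", "Acme, Inc.", "Globex Intl Ltd."]:
    print(raw, "->", normalize_company(raw))
```

Only trailing suffixes are stripped, so a word like "Company" in the middle of a name is left alone.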
Step 3: Deduplicate Within the Export
After normalizing, deduplicate. This is where most migration projects fail — they either skip this step entirely or use a tool that only catches exact matches.
What exact-match deduplication misses: Even after normalization, you'll still have records that are clearly the same company or contact but don't match character-for-character. Typical examples are "Jennifer Walsh" and "Jen Walsh", "Global Partners" entered once with a typo as "Globl Partners", or a contact whose last name is spelled differently across two records.
These require fuzzy matching — scoring how similar two records are rather than checking if they're identical.
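To make "scoring" concrete, here is a small sketch using `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure. Dedicated tools may use different algorithms, and the 0.75 threshold is an assumption you would tune against your own data.

```python
from difflib import SequenceMatcher
from itertools import combinations

def similarity(a, b):
    """Score two strings from 0.0 (nothing in common) to 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def similar_pairs(names, threshold=0.75):
    """Return (name_a, name_b, score) for every pair at or above the threshold."""
    pairs = []
    for a, b in combinations(names, 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((a, b, round(score, 2)))
    return pairs

names = ["Global Partners", "Globl Partners", "Jennifer Walsh", "Jen Walsh"]
for a, b, score in similar_pairs(names):
    print(f"{a!r} ~ {b!r}: {score}")
```

Comparing every pair grows quadratically with the number of records, so this approach suits small exports or records that have already been grouped by some coarse key.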
Clean by Similarity API handles both normalization and fuzzy matching in one step. Upload your export, select company name and contact name as matching columns, and it groups near-duplicate records with similarity scores. You review the clusters, choose the canonical record, and download a clean file. No install, no account needed to get started.
What to catch at this step:
- Same company, name variant (covered by fuzzy matching on company name)
- Same contact at the same company (match on name + company together)
- Same contact, two different email addresses — especially common after job changes
- Contacts with no email that duplicate existing records
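Matching on name and company together can be sketched as a simple greedy clustering: a record joins a cluster only if both fields are close to that cluster's first record. The field names (`name`, `company`, `email`) and the 0.85 threshold are assumptions for illustration, and `difflib` again stands in for whatever similarity measure your tool uses. Note that email is not part of the match, so a contact with no email can still land next to its duplicate.

```python
from difflib import SequenceMatcher

def score(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def cluster_contacts(records, threshold=0.85):
    """Greedy clustering on name AND company; order of input affects the result."""
    clusters = []
    for rec in records:
        for cluster in clusters:
            head = cluster[0]  # compare against the cluster's first record
            if (score(rec["name"], head["name"]) >= threshold
                    and score(rec["company"], head["company"]) >= threshold):
                cluster.append(rec)
                break
        else:
            clusters.append([rec])  # no cluster matched: start a new one
    return clusters

records = [
    {"name": "Jennifer Walsh", "company": "Global Partners", "email": "jwalsh@globalpartners.example"},
    {"name": "Jennifer Walsh", "company": "Globl Partners", "email": ""},
    {"name": "Sam Ortiz", "company": "Global Partners", "email": "sortiz@globalpartners.example"},
]
clusters = cluster_contacts(records)
for cluster in clusters:
    print([r["name"] + " @ " + r["company"] for r in cluster])
```

Each cluster with more than one record is a candidate for review; a person still chooses which record is canonical.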
Step 4: Decide What to Migrate
Not everything in the old CRM deserves to come over. Migration is a good forcing function to also purge records that aren't worth keeping.
Consider excluding:
- Contacts who bounced every email and never engaged
- Companies with no associated contacts or deals
- Records that are clearly test data or placeholders
- Contacts who unsubscribed years ago with no activity since
A smaller, cleaner database in the new CRM is more valuable than a large, messy one. Your new CRM's pricing may also be based on contact count, so migrating fewer records can lower the bill.
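The contact-level exclusions can be written down as an explicit rule so everyone agrees on what gets dropped. This sketch assumes hypothetical fields (`is_test`, `hard_bounced`, `unsubscribed`, and `last_activity` as a date or `None`) and approximates "years ago" as two years; your export will have different names and your team may choose a different cutoff. It does not cover the company-level rule, which needs contact and deal counts per company.

```python
from datetime import date

def should_migrate(contact, today, stale_days=730):
    """Return False for contacts that match one of the exclusion rules."""
    last = contact.get("last_activity")  # a date, or None if never engaged
    if contact.get("is_test"):
        return False  # test data or placeholder
    if contact.get("hard_bounced") and last is None:
        return False  # bounced and never engaged
    if contact.get("unsubscribed") and (last is None or (today - last).days > stale_days):
        return False  # unsubscribed with no recent activity
    return True

contacts = [
    {"email": "a@example.com", "last_activity": date(2024, 11, 3)},
    {"email": "b@example.com", "hard_bounced": True, "last_activity": None},
    {"email": "test@example.com", "is_test": True, "last_activity": None},
    {"email": "c@example.com", "unsubscribed": True, "last_activity": date(2020, 5, 1)},
]
kept = [c for c in contacts if should_migrate(c, today=date(2025, 1, 1))]
print([c["email"] for c in kept])
```

Writing the excluded records to a separate file rather than deleting them makes the decision easy to reverse.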
Step 5: Validate Against the New CRM's Rules
Before importing into the new CRM, understand how it deduplicates — and make sure your clean file is structured to take advantage of it.
HubSpot:
- Contacts deduplicate on email address. Every contact needs an email or HubSpot creates a new record regardless of name match.
- Companies deduplicate on domain name. Populate the domain field for every company or you'll get duplicates even for records you just cleaned.
Salesforce:
- Accounts deduplicate on account name (exact match) or Record ID. For a fresh migration with no existing Salesforce records, name normalization is your main protection.
- Run a test import of 20–30 records first to verify behavior before the full migration.
Any CRM:
Do a small test import before the full one. Always.
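Before the test import, it helps to check the clean file against whichever field the new CRM keys on. The sketch below checks the file only and calls no CRM API; the column names `email` and `domain` are assumptions about your export. It reports rows where the key field is blank and values that appear more than once after trimming and lowercasing.

```python
from collections import Counter

def validate_for_import(rows, key):
    """Report blank values of `key` and values that occur more than once."""
    values = [(row.get(key) or "").strip().lower() for row in rows]
    missing = [i for i, v in enumerate(values) if not v]
    counts = Counter(v for v in values if v)
    duplicated = sorted(v for v, n in counts.items() if n > 1)
    return {"missing_rows": missing, "duplicated_values": duplicated}

contacts = [
    {"name": "Jennifer Walsh", "email": "jwalsh@globalpartners.example"},
    {"name": "Jen Walsh", "email": "JWalsh@globalpartners.example "},
    {"name": "Sam Ortiz", "email": ""},
]
companies = [
    {"name": "Global Partners", "domain": "globalpartners.example"},
    {"name": "Acme", "domain": ""},
]
contact_report = validate_for_import(contacts, key="email")
company_report = validate_for_import(companies, key="domain")
print(contact_report)
print(company_report)
```

Row indexes are zero-based and count data rows only, so add the header offset when looking them up in a spreadsheet.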
Key Takeaways
- CRM migrations surface years of accumulated duplicates all at once — cleaning before migration is the only practical approach
- Normalization (stripping suffixes, standardizing casing) should happen before deduplication — it turns many fuzzy matches into exact matches and simplifies the whole process
- Fuzzy matching is essential for contact and company data — exact-match tools miss most real-world duplicates in a multi-year database export
- Migration is also the right moment to purge low-quality records — a smaller, cleaner starting database is more valuable than a complete but messy one
- Validate your clean file against the new CRM's specific deduplication logic before importing — HubSpot and Salesforce both have hard rules that override everything else
Free for files up to 1,000 rows. No signup required.