How long does cleaning a CRM export take before migration?

For a well-structured export under 10,000 records, normalization and fuzzy deduplication take 1–2 hours with the right tools. For larger databases (50,000+ records), budget half a day. This is dramatically less time than fixing duplicates inside the new CRM after the fact, where merging records one by one with split activity history can take days or weeks.

Should I clean the data in my old CRM or after exporting?

After exporting is easier. Working inside the old CRM means dealing with its interface, its merge logic, and its limitations. Exporting to a CSV and cleaning the file gives you full control, lets you use dedicated tools, and doesn't risk corrupting the old system while you still need it.

What if my old CRM already has a deduplication tool?

Use it before exporting — it reduces the problem. But don't rely on it. CRM-native deduplication tools catch obvious duplicates but miss name variants, records without emails or domains, and contacts entered across years with inconsistent formatting. A dedicated clean of the export file catches what the CRM missed.

What's the best way to handle contacts with no email addresses during migration?

Match them against other records by name and company using fuzzy matching before importing. If they match an existing record in your export, merge them. If they're genuinely unique but have no email, decide whether they're worth migrating at all. Without an email, HubSpot and most other CRMs can't deduplicate them on future imports, so the same people are likely to come back as duplicates.

Do I need to clean both contacts and companies separately?

Yes, but they're related. Clean companies first — normalize company names, deduplicate, establish your canonical company list. Then clean contacts with those normalized company names in place. This way contact deduplication can match on both name and company simultaneously, which is more reliable than either field alone.
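The ordering above can be sketched in a few lines of Python. This is a minimal illustration, not a full pipeline: the `name` and `company` column names, the sample rows, and the `canonical` map (standing in for the output of your company-cleaning pass) are all assumptions.

```python
from collections import defaultdict

def apply_canonical_companies(contacts, canonical_by_variant):
    """Return copies of the contact rows with company names swapped for their canonical form."""
    return [
        {**c, "company": canonical_by_variant.get(c["company"], c["company"])}
        for c in contacts
    ]

def group_by_name_and_company(contacts):
    """Group rows by (name, company); groups with more than one row are duplicate candidates."""
    groups = defaultdict(list)
    for c in contacts:
        key = (c["name"].strip().lower(), c["company"].strip().lower())
        groups[key].append(c)
    return {key: rows for key, rows in groups.items() if len(rows) > 1}

# Assumed output of the company pass: each variant mapped to its canonical name.
canonical = {"Acme Corp": "Acme", "Acme Corporation": "Acme"}

# Hypothetical sample rows.
contacts = [
    {"name": "Jennifer Walsh", "company": "Acme Corp"},
    {"name": "Jennifer Walsh", "company": "Acme Corporation"},
    {"name": "Raj Patel", "company": "Globex"},
]

cleaned = apply_canonical_companies(contacts, canonical)
duplicates = group_by_name_and_company(cleaned)
```

Before the company names are canonicalized, the two Jennifer Walsh rows have different keys and would not be grouped. After it, they share one key and surface as a duplicate candidate.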

How to Dedupe Your Contact List Before a CRM Migration

CRM migrations are one of the most common causes of a permanently messy database. You spend weeks planning the migration, and on day one the new CRM already has duplicate accounts because "Acme Corp" and "Acme Corporation" both came over as separate records — along with every other formatting inconsistency that built up over years in the old system.

The old CRM had its own deduplication logic. The new one has different rules. Neither catches name variants. And by the time you notice, the duplicates are already associated with deals, activities, and email history that make merging them painful.

The fix is doing the cleaning work before migration, not after.

Why Migrations Create More Duplicates Than Routine Imports

A routine import might bring in 500 new contacts. A CRM migration brings in everything — years of accumulated records from multiple sources, multiple teams, and multiple data entry habits. That means:

Contacts entered manually by different reps with different conventions
Companies imported from data vendors with their own naming formats
Records created automatically by integrations that never normalized names
Contacts added via form fills, trade shows, and manual CSV imports over years
The same company entered 50 times as "Microsoft", "Microsoft Corp", "Microsoft Corporation", "MSFT", and "Microsoft Inc."

None of these were a problem in the old CRM because nobody was looking for them. Moving to a new CRM is the first time they're all in one place being examined — and the new CRM's deduplication rules will miss most of them.

Step 1: Export Everything and Audit First

Before cleaning, understand what you're working with.

Export your full contact and company database from the old CRM. Open it and look at the company name column specifically. Sort alphabetically and scan for variations of the same company. In most databases that are a few years old, you'll find the same companies appearing dozens of times with slightly different names.

This isn't about fixing it manually — it's about understanding the scale of the problem before you decide how to tackle it. A database with 5,000 records and 200 duplicate company groups needs a different approach than 50,000 records with 2,000 duplicate groups.
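If the export is too large to scan comfortably in a spreadsheet, the same audit can be approximated with a short script. The sketch below counts records per distinct company name and sorts the names case-insensitively, so variants of one company land next to each other. The sample names are made up for illustration.

```python
from collections import Counter

def audit_company_names(names):
    """Return (name, record_count) pairs sorted alphabetically, ignoring case."""
    counts = Counter(name.strip() for name in names if name and name.strip())
    return sorted(counts.items(), key=lambda item: item[0].casefold())

# Hypothetical company-name column from an export.
sample = ["Acme Corp", "acme corporation", "Acme Corp", "Globex", " Acme Corp. "]

for name, count in audit_company_names(sample):
    print(f"{count:>4}  {name}")
```

The number of distinct names relative to the number of real companies you expect gives a rough sense of how many duplicate groups you are dealing with.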

Step 2: Normalize Company Names

The highest-leverage single step in any migration clean is normalizing company names before deduplicating. Two records that look different often become identical once the noise is removed.

What to normalize:

Business entity suffixes — strip or standardize Inc., Corp., LLC, Ltd., GmbH, PLC, Limited, Incorporated. "Acme Corp" and "Acme Corporation" both become "Acme" and are caught as exact duplicates.
Capitalization — lowercase everything for comparison purposes
Punctuation — remove periods, commas, ampersands (or normalize "&" to "and")
Common abbreviations — "Intl" → "International", "Co" → "Company" (or just strip these)

You can do this with spreadsheet formulas for simple cases. For larger exports with many variations, a dedicated tool handles this automatically as a preprocessing step.
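For readers who prefer a script to spreadsheet formulas, here is a minimal sketch of the normalization rules listed above. The suffix and abbreviation lists are assumptions drawn from the examples in this section and will need extending for real data. The result is meant as a comparison key, not as the display name you import.

```python
import re

# Assumed lists, based on the examples above; extend them for your own export.
ENTITY_SUFFIXES = {
    "inc", "corp", "corporation", "llc", "ltd", "gmbh", "plc",
    "limited", "incorporated", "co", "company",
}
ABBREVIATIONS = {"intl": "international"}

def normalize_company(name):
    """Build a lowercase comparison key with punctuation and entity suffixes removed."""
    text = name.lower().replace("&", " and ")
    text = re.sub(r"[^\w\s]", " ", text)  # periods, commas, etc. become spaces
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    # Strip suffixes only from the end, and never strip the last remaining word.
    while len(words) > 1 and words[-1] in ENTITY_SUFFIXES:
        words.pop()
    return " ".join(words)

print(normalize_company("Acme Corp"))         # acme
print(normalize_company("Acme Corporation"))  # acme
print(normalize_company("Smith & Sons, LLC")) # smith and sons
```

Keeping the original name in a separate column and adding the key as a new column lets you deduplicate on the key while still choosing which spelling to keep. Abbreviations such as "MSFT" are not handled by rules like these and need either a manual mapping or review.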

Step 3: Deduplicate Within the Export

After normalizing, deduplicate. This is where most migration projects fail — they either skip this step entirely or use a tool that only catches exact matches.

What exact-match deduplication misses: Even after normalization, you'll still have records that are clearly the same company or contact but don't match character-for-character. "Jennifer Walsh" and "Jen Walsh". "Global Partners" entered once with a typo as "Globl Partners". A contact whose last name is spelled differently across two records.

These require fuzzy matching — scoring how similar two records are rather than checking if they're identical.
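To make "scoring how similar two records are" concrete, here is a small sketch using `difflib.SequenceMatcher` from the Python standard library. It is one of several possible similarity measures, not necessarily the one any particular tool uses, and the 0.85 threshold is an assumption you would tune against your own data.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a score from 0.0 (nothing in common) to 1.0 (identical), ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_near_duplicates(values, threshold=0.85):
    """Compare every pair of values and return those scoring at or above the threshold."""
    pairs = []
    for i, left in enumerate(values):
        for right in values[i + 1:]:
            score = similarity(left, right)
            if score >= threshold:
                pairs.append((left, right, round(score, 2)))
    return pairs

names = ["Global Partners", "Globl Partners", "Initech"]
print(find_near_duplicates(names))
```

Two caveats apply. Typos such as "Globl Partners" score very high, but nickname pairs such as "Jennifer Walsh" and "Jen Walsh" score noticeably lower on a character-based measure, so they need a lower threshold plus manual review, or a nickname lookup. Comparing every pair also grows quadratically with the number of records, so large exports are usually split into smaller blocks first (for example by first letter of the normalized name).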

A fuzzy matching tool for Excel and CSV handles both normalization and fuzzy matching in one step. Upload your export, select company name and contact name as matching columns, and it groups near-duplicate records with similarity scores. You review the clusters, choose the canonical record, and download a clean file. No install, no account needed to get started.

What to catch at this step:

Same company, name variant (covered by fuzzy matching on company name)
Same contact at the same company (match on name + company together)
Same contact, two different email addresses — especially common after job changes
Contacts with no email that duplicate existing records
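Matching on name and company together can be sketched as requiring both fields to clear a similarity threshold. This is an illustrative approach under assumed column names (`name`, `company`, `email`), assumed thresholds, and made-up sample rows. Because it never looks at the email field, it also flags contacts with no email or with a second address.

```python
from difflib import SequenceMatcher

def ratio(a, b):
    """Case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()

def same_person(left, right, name_min=0.85, company_min=0.85):
    """Treat two rows as a likely duplicate only if name AND company both match."""
    return (
        ratio(left["name"], right["name"]) >= name_min
        and ratio(left["company"], right["company"]) >= company_min
    )

# Hypothetical rows: the second has no email and a typo in the company name.
contacts = [
    {"name": "Jennifer Walsh", "company": "Global Partners", "email": "jwalsh@example.com"},
    {"name": "Jennifer Walsh", "company": "Globl Partners", "email": ""},
    {"name": "Jennifer Walsh", "company": "Initech", "email": "jen@example.org"},
]

print(same_person(contacts[0], contacts[1]))  # True: same name, near-identical company
print(same_person(contacts[0], contacts[2]))  # False: same name, different company
```

Requiring both fields keeps two different people who happen to share a name at different companies from being merged. The trade-off is that someone who changed jobs will not match on company, so that case still needs a separate check, for example on name plus phone number.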

Step 4: Decide What to Migrate

Not everything in the old CRM deserves to come over. Migration is a good forcing function to also purge records that aren't worth keeping.

Consider excluding:

Contacts who bounced every email and never engaged
Companies with no associated contacts or deals
Records that are clearly test data or placeholders
Contacts who unsubscribed years ago with no activity since

A smaller, cleaner database in the new CRM is more valuable than a large, messy one. Your new CRM's pricing may also be based on contact count, in which case fewer contacts means a lower bill.
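The contact-level exclusion rules can be written down as an explicit filter, which makes them easy to review with the team before anything is dropped. The sketch below is only an illustration: the column names (`last_activity`, `all_emails_bounced`, `unsubscribed`, `is_placeholder`), the "yes" flag values, and the two-year staleness cutoff are all assumptions, and your export will label these differently.

```python
from datetime import date

def should_migrate(contact, today, stale_days=730):
    """Return False for contacts that match one of the exclusion rules."""
    last = contact.get("last_activity", "")  # ISO date string, "" if never active
    never_engaged = last == ""
    stale = never_engaged or (today - date.fromisoformat(last)).days > stale_days

    if contact.get("is_placeholder") == "yes":
        return False  # test data or placeholder rows
    if contact.get("all_emails_bounced") == "yes" and never_engaged:
        return False  # bounced every email and never engaged
    if contact.get("unsubscribed") == "yes" and stale:
        return False  # unsubscribed long ago with no activity since
    return True

# Hypothetical rows.
today = date(2024, 1, 1)
rows = [
    {"name": "Active Unsubscriber", "unsubscribed": "yes", "last_activity": "2023-11-01"},
    {"name": "Old Unsubscriber", "unsubscribed": "yes", "last_activity": "2019-05-01"},
    {"name": "Dead Address", "all_emails_bounced": "yes", "last_activity": ""},
]
to_migrate = [row for row in rows if should_migrate(row, today)]
```

Writing the excluded rows to a separate archive file rather than discarding them keeps the decision reversible.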

Step 5: Validate Against the New CRM's Rules

Before importing into the new CRM, understand how it deduplicates — and make sure your clean file is structured to take advantage of it.

HubSpot:

Contacts deduplicate on email address. Every contact needs an email or HubSpot creates a new record regardless of name match.
Companies deduplicate on domain name. Populate the domain field for every company or you'll get duplicates even for records you just cleaned.
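A quick pre-import check along these lines can confirm that every row carries the field the CRM will deduplicate on. This is a minimal sketch: the `email` and `domain` column names and the sample rows are assumptions, and the file's real headers should be substituted.

```python
def rows_missing_key(rows, key_field):
    """Return 1-based row numbers where the dedup key field is blank or absent."""
    return [
        number
        for number, row in enumerate(rows, start=1)
        if not (row.get(key_field) or "").strip()
    ]

def repeated_keys(rows, key_field):
    """Return key values that appear on more than one row, compared case-insensitively."""
    seen, repeated = set(), set()
    for row in rows:
        key = (row.get(key_field) or "").strip().lower()
        if not key:
            continue
        if key in seen:
            repeated.add(key)
        seen.add(key)
    return sorted(repeated)

# Hypothetical cleaned contact rows.
contacts = [
    {"name": "Jennifer Walsh", "email": "jwalsh@example.com"},
    {"name": "Jen Walsh", "email": "JWalsh@example.com"},
    {"name": "Raj Patel", "email": ""},
]

print(rows_missing_key(contacts, "email"))  # [3]
print(repeated_keys(contacts, "email"))     # ['jwalsh@example.com']
```

Running the same two checks with `"domain"` against the company file covers the second rule. Any row numbers or keys returned are worth resolving before the import rather than after.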

Salesforce:

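Accounts deduplicate on account name (exact match) or Record ID, though the exact behavior depends on the import tool you use and on any duplicate or matching rules configured in your org. For a fresh migration with no existing Salesforce records, name normalization is your main protection.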
Run a test import of 20–30 records first to verify behavior before the full migration.

Any CRM:

Do a small test import before the full one. Always.

Key Takeaways

CRM migrations surface years of accumulated duplicates all at once — cleaning before migration is the only practical approach
Normalization (stripping suffixes, standardizing casing) should happen before deduplication — it turns many fuzzy matches into exact matches and simplifies the whole process
Fuzzy matching is essential for contact and company data — exact-match tools miss most real-world duplicates in a multi-year database export
Migration is also the right moment to purge low-quality records — a smaller, cleaner starting database is more valuable than a complete but messy one
Validate your clean file against the new CRM's specific deduplication logic before importing — HubSpot and Salesforce both have hard rules that override everything else