How to Deduplicate Account and Contact Records Before Importing to Salesforce

April 2026 · 10 min read · By Similarity API Team

If you've ever imported a list into Salesforce and immediately found duplicate accounts — "Acme Corp" sitting next to "Acme Corporation", or "Global Partners LLC" next to "Global Partners" — you already know the problem. Salesforce didn't create those duplicates to frustrate you. It created them because its native deduplication has real limits, and most real-world data falls outside those limits.

The fix is almost always the same: clean the file before it touches Salesforce, not after.

What Salesforce Actually Deduplicates (and What It Misses)

Salesforce has a built-in Duplicate Management feature — but it's worth being precise about what it does.

For contacts, Salesforce matches on email address. Same logic as HubSpot: if the email matches exactly, it flags a potential duplicate. If it doesn't, it doesn't.

For accounts, Salesforce matches on account name — but only exact matches, case-insensitive. "Acme Corp" and "acme corp" are caught. "Acme Corp" and "Acme Corporation" are not.

Scenario | Salesforce behavior
Same contact, same email | ✅ Flags as duplicate
Same contact, two different emails | ❌ Creates two records
"Acme Corp" vs "Acme Corporation" | ❌ Creates two records
"Global Partners LLC" vs "Global Partners" | ❌ Creates two records
Contact with no email | ❌ No deduplication at all

The pattern: anything requiring judgment about similarity rather than exact matching gets through.
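To make the pattern concrete, here is a minimal Python sketch of a case-insensitive exact comparison. It only models the behavior described above and is not Salesforce's actual matching code; the `exact_match` helper is a hypothetical name used for illustration.

```python
def exact_match(a: str, b: str) -> bool:
    """Case-insensitive exact comparison, ignoring surrounding whitespace."""
    return a.strip().casefold() == b.strip().casefold()


pairs = [
    ("Acme Corp", "acme corp"),
    ("Acme Corp", "Acme Corporation"),
    ("Global Partners LLC", "Global Partners"),
]

# Only the first pair differs purely by case, so only it is caught.
results = [exact_match(a, b) for a, b in pairs]

for (a, b), matched in zip(pairs, results):
    print(f"{a!r} vs {b!r}: {'match' if matched else 'no match'}")
```

Only the pair that differs by case alone is caught; the suffix and abbreviation variants pass straight through.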

There's also a practical constraint: Salesforce's Duplicate Rules and Matching Rules are only available on Enterprise and above. If you're on Professional, you have limited native deduplication tooling.

Where the Duplicates Actually Come From

Most import duplicates don't come from careless data entry. They come from combining data that was collected in different contexts:

  • Data vendor lists. Third-party enrichment providers use their own company name conventions. Your existing Salesforce accounts use yours. They rarely match exactly.
  • Trade show exports. Badge scanner exports produce company names as attendees typed them at registration — inconsistent abbreviations, missing suffixes, typos.
  • CRM migrations. Moving data from HubSpot, Pipedrive, or a legacy system almost always introduces account name variations. The old CRM had its own formatting conventions.
  • Multiple team members entering the same account. One rep writes "Johnson & Johnson", another writes "Johnson and Johnson". Both are in your import file.

None of these get caught by exact-match deduplication. They all create new records on import.

Key Takeaways (So Far)

  • Salesforce deduplicates contacts on email (exact match) and accounts on name (exact match, case-insensitive)
  • Name variations — abbreviations, missing suffixes, formatting differences — all create duplicate accounts
  • Contacts without emails are not deduplicated at all
  • Duplicate Rules are Enterprise-only; Professional users have very limited native tooling

The Fix: Clean Before You Import

Fixing duplicates inside Salesforce after the fact is painful. The Duplicate Manager shows you pairs to review one at a time. Third-party tools like Dedupely or Cloudingo help at scale but add cost and complexity. The cleaner path is preventing the problem.

Three steps before any Salesforce import:

1. Normalize your account names

Strip business entity suffixes before comparing, in both abbreviated and spelled-out forms: Inc., Incorporated, Corp., Corporation, LLC, Ltd., GmbH, PLC. Lowercase everything. Remove punctuation. This alone eliminates a large proportion of near-duplicates because the underlying name is often identical once the suffix is gone: "Acme Corp." and "Acme Incorporated" both normalize to "acme".
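A normalization step along these lines might look like the following Python sketch. The suffix list and the `normalize_account_name` helper are assumptions for illustration; a real list would need extending for the entity types in your own data.

```python
import re

# Illustrative suffix list (an assumption, not exhaustive).
SUFFIXES = {
    "inc", "incorporated", "corp", "corporation", "co", "company",
    "llc", "ltd", "limited", "gmbh", "plc",
}


def normalize_account_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, strip trailing entity suffixes."""
    cleaned = re.sub(r"[^\w\s]", " ", name.casefold())
    tokens = cleaned.split()
    # Strip suffixes only from the end, and never strip the last remaining token.
    while len(tokens) > 1 and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


print(normalize_account_name("Acme Corp."))
print(normalize_account_name("Acme Incorporated"))
print(normalize_account_name("Global Partners LLC"))
```

Suffixes are removed only from the end of the name so that a company whose name merely begins with a word like "Company" keeps it. Normalization does not reconcile variants such as "&" versus "and"; those still need similarity scoring.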

2. Deduplicate within your import file

Before touching Salesforce, find records in your file that refer to the same account or contact. For exact duplicates, a standard Remove Duplicates works. For name variants — "Jennifer Walsh at Acme Corp" vs "Jen Walsh at Acme Corporation" — you need fuzzy matching that scores similarity rather than checking for identical strings.
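As a rough illustration of similarity scoring, the sketch below uses `difflib.SequenceMatcher` from the Python standard library to compare every pair of names in a small list. The 0.8 threshold and the `find_likely_duplicates` helper are assumptions chosen for the example, not recommended settings, and the pairwise loop is only practical for small files.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def find_likely_duplicates(names, threshold=0.8):
    """Return (i, j, score) for every pair of rows scoring at or above threshold."""
    hits = []
    for (i, a), (j, b) in combinations(enumerate(names), 2):
        score = similarity(a, b)
        if score >= threshold:
            hits.append((i, j, round(score, 2)))
    return hits


rows = ["Johnson & Johnson", "Johnson and Johnson", "Global Partners", "Acme Corp"]
hits = find_likely_duplicates(rows)

for i, j, score in hits:
    print(f"rows {i} and {j} look like the same account (score {score})")
```

Pairs above the threshold are candidates for review, not automatic merges; the right cutoff depends on how noisy the file is.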

3. Match against your existing Salesforce data

Export your existing accounts and contacts from Salesforce, then compare your import file against that export. Records that score as likely matches already exist in Salesforce — you want to update those records rather than create new ones.
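One way to sketch this comparison is with `difflib.get_close_matches` from the Python standard library. The example assumes the account names have already been read from the two files into plain lists; the `split_import` helper and the 0.85 cutoff are hypothetical choices for illustration.

```python
from difflib import get_close_matches


def split_import(import_names, existing_names, cutoff=0.85):
    """Split import rows into likely updates and genuinely new records."""
    # Compare case-insensitively but keep the original spelling for reporting.
    lookup = {name.casefold(): name for name in existing_names}
    candidates = list(lookup)
    updates, new_records = [], []
    for name in import_names:
        match = get_close_matches(name.casefold(), candidates, n=1, cutoff=cutoff)
        if match:
            updates.append((name, lookup[match[0]]))
        else:
            new_records.append(name)
    return updates, new_records


existing = ["Johnson & Johnson", "Global Partners LLC"]
incoming = ["Johnson and Johnson", "Global Partners", "Initech"]

updates, new_records = split_import(incoming, existing)
print("Update existing:", updates)
print("Create new:", new_records)
```

Rows in `updates` carry the existing account name they matched, which makes it easier to map them to the right record before importing as updates rather than inserts.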

What a Pre-Import Clean Typically Finds

On a 500-row account list from a data vendor or event export:

  • 20–40 internal duplicates in the file itself — same company, different formatting
  • 30–60 accounts that already exist in Salesforce under a slightly different name
  • 10–20 contacts without emails who would create new records even if the account already exists

Net result: what looks like 500 new records is typically closer to 400–450 genuinely new ones. The rest are updates to existing records or internal duplicates, both of which are better resolved before import.

Key Takeaways

  • Clean before you import — resolving duplicates in a spreadsheet takes minutes; resolving them in Salesforce takes hours
  • Fuzzy matching on account name catches the variants exact matching misses — abbreviations, suffixes, word order differences
  • Matching name + contact together is more reliable than matching on either field alone — "Jen Walsh at Acme Corp" and "Jennifer Walsh at Acme Corporation" score as a likely match on the combination even if neither field is identical individually
  • Three steps: normalize, deduplicate within file, match against existing Salesforce export
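The point about combining fields can be sketched as a weighted average of two similarity scores. This is a simplified illustration using `difflib.SequenceMatcher`; the equal weighting, the record layout, and the `combined_score` helper are assumptions, not a description of any particular tool's scoring.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def combined_score(rec_a, rec_b, account_weight=0.5):
    """Weighted average of account-name and contact-name similarity."""
    account = ratio(rec_a["account"], rec_b["account"])
    contact = ratio(rec_a["contact"], rec_b["contact"])
    return account_weight * account + (1 - account_weight) * contact


a = {"contact": "Jen Walsh", "account": "Acme Corp"}
b = {"contact": "Jennifer Walsh", "account": "Acme Corporation"}
c = {"contact": "Jen Walsh", "account": "Initech"}

same_person = combined_score(a, b)       # neither field is identical
different_company = combined_score(a, c)  # contact identical, account unrelated

print(f"a vs b: {same_person:.2f}")
print(f"a vs c: {different_company:.2f}")
```

In this sketch the Jen/Jennifer pair scores higher than the pair with an identical contact name at an unrelated company, which is the behavior a single-field match would get wrong.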

Clean Your Import File Before It Hits Salesforce

Upload your account or contact list, find near-duplicate records even when they're spelled differently, and download a clean file ready to import — no code, no formulas, no manual pair review after the fact.

Free for files up to 1,000 rows. No signup required.

Frequently Asked Questions