How does HubSpot deduplicate contacts on import?

HubSpot matches on email address only. If a contact in your import file has a different email than the same person's existing record, HubSpot creates a new record. If a contact has no email, HubSpot always creates a new record. Clean your list and ensure every contact has an email before importing.

How does Salesforce deduplicate accounts on import?

Salesforce's behavior depends on your import method and Duplicate Rules configuration. By default, account deduplication on import uses account name (exact match, case-insensitive) or domain if populated. For reliable deduplication, include the Salesforce Record ID for any account you're updating rather than relying on name matching.

What's the difference between deduplicating within a file and matching against the CRM?

Within-file deduplication removes duplicate records from your import list itself — two rows that represent the same contact. CRM matching checks whether records in your import already exist in the CRM. Both steps are needed. Skipping the first means you import duplicates. Skipping the second means you create duplicates of existing records.

What if my contacts don't have email addresses?

HubSpot has no deduplication path for contacts without email. Every row without an email creates a new record, even if that person already exists. For contacts without emails, match on name and company against your existing CRM export before importing, and either find the email or exclude those rows.

How long does a proper pre-import clean take?

For a well-structured file under 5,000 rows, steps 1–3 take 10–15 minutes in a spreadsheet. Step 4 (fuzzy deduplication) takes under 5 minutes with a dedicated tool. Step 5 (CRM matching) depends on the size of your existing database but is typically 15–30 minutes. Total: under an hour for most import jobs. Fixing duplicates after the fact typically takes much longer.

Dedupe Checklist: Cleaning Contacts Before HubSpot or Salesforce Import

Most CRM import problems are preventable. HubSpot and Salesforce both deduplicate on exact match only — email address for contacts, domain name or account name for companies. Anything that doesn't match exactly creates a new record. A 20-minute clean before import saves hours of manual merging afterward.

Here's the checklist, in the order it should be done.

1. Remove Blank and Malformed Rows

Before anything else, remove rows that will fail on import or create empty records.

Delete rows with no name and no email
Delete rows that are clearly test data ("test", "asdf", "123")
Remove header rows that appear mid-file (common in merged exports)
Check for rows where all fields are shifted one column (import mapping errors)
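
The first three checks above can be scripted. The following is a minimal sketch, assuming a file with hypothetical `name` and `email` columns and the test markers listed above; it does not try to detect shifted columns, which usually still need a visual check.

```python
import csv
import io

# Test-data markers taken from the checklist; extend for your own list.
TEST_VALUES = {"test", "asdf", "123"}

def drop_bad_rows(rows, header):
    """Keep rows that have a name or email, are not test data, and are not repeated headers."""
    header_values = [col.strip().lower() for col in header]
    kept = []
    for row in rows:
        name = (row.get("name") or "").strip()
        email = (row.get("email") or "").strip()
        if not name and not email:
            continue  # blank row: nothing to import
        if name.lower() in TEST_VALUES or email.lower() in TEST_VALUES:
            continue  # obvious test data
        values = [(row.get(col) or "").strip().lower() for col in header]
        if values == header_values:
            continue  # header row repeated mid-file
        kept.append(row)
    return kept

# Hypothetical sample: one blank row, one test row, one repeated header.
sample = (
    "name,email\n"
    "Ana Ruiz,ana@example.com\n"
    ",\n"
    "test,test@example.com\n"
    "name,email\n"
    "Bo Li,\n"
)
reader = csv.DictReader(io.StringIO(sample))
clean_rows = drop_bad_rows(list(reader), reader.fieldnames)
print([r["name"] for r in clean_rows])  # ['Ana Ruiz', 'Bo Li']
```

Note that "Bo Li" survives even without an email, because the rule only removes rows missing both name and email.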

2. Standardize Formatting

Inconsistent formatting causes duplicates even when the underlying data is identical.

Trim leading and trailing spaces from all fields (they are invisible but cause matching failures)
Normalize capitalization — title case for names, lowercase for emails
Standardize phone number format if you're importing phone numbers
Remove special characters from name fields that your CRM doesn't accept
Check that email addresses are in a valid format: no missing @, no spaces
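
As an illustration of the formatting rules above, here is a rough sketch, assuming hypothetical `name`, `email`, and `phone` columns. The email pattern is a loose format check, not full validation, and plain title-casing will mangle names such as "McDonald", so review the output before relying on it.

```python
import re

# Loose shape check: something@something.something, with no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def standardize(row):
    """Trim every field, title-case the name, lowercase the email, keep only phone digits."""
    out = {key: (value or "").strip() for key, value in row.items()}
    # Collapse repeated inner spaces before title-casing the name.
    out["name"] = " ".join(out.get("name", "").split()).title()
    out["email"] = out.get("email", "").lower()
    # One possible phone convention (an assumption): digits only.
    out["phone"] = re.sub(r"\D", "", out.get("phone", ""))
    out["email_ok"] = bool(EMAIL_RE.match(out["email"]))
    return out

row = standardize({
    "name": "  jennifer   WALSH ",
    "email": " Jen.Walsh@Example.COM ",
    "phone": "(555) 010-2000",
})
print(row["name"], row["email"], row["phone"], row["email_ok"])
# Jennifer Walsh jen.walsh@example.com 5550102000 True
```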

3. Normalize Company Names

Company name is the field most likely to create duplicates on import.

Decide on one format for business entity suffixes and apply it consistently, or strip them entirely before comparing. "Inc.", "Incorporated", and "Inc" all mean the same thing.
Normalize abbreviations — "&" vs "and", "Intl" vs "International"
Flag any company names that are clearly the same entity written differently ("IBM" vs "International Business Machines")
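
One way to apply the first two rules is to build a comparison key for each company name and compare keys instead of raw names. This is a sketch only: the suffix and abbreviation lists are illustrative assumptions, and it will not connect acronyms such as "IBM" to their long form, which still need manual flagging.

```python
import re

# Illustrative lists (assumptions); extend them for your own data.
SUFFIXES = {"inc", "incorporated", "corp", "corporation", "llc", "ltd", "limited", "co", "company"}
ABBREVIATIONS = {"intl": "international"}

def company_key(name):
    """Lowercase, expand abbreviations, drop punctuation and trailing entity suffixes."""
    text = name.lower().replace("&", " and ")
    text = re.sub(r"[^\w\s]", " ", text)  # punctuation becomes whitespace
    words = [ABBREVIATIONS.get(word, word) for word in text.split()]
    # Strip suffixes from the end, but never empty the name completely.
    while len(words) > 1 and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)

print(company_key("Acme Inc."))           # acme
print(company_key("Acme, Incorporated"))  # acme
print(company_key("Smith & Sons Intl"))   # smith and sons international
```

Keeping the original name in the import file and using the key only for comparison avoids overwriting a company's legal name with a stripped version.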

4. Deduplicate Within Your Import File

Before touching your CRM, find duplicate records within the file itself. This is the step most people skip — and the one that creates the most post-import cleanup work.

For exact duplicates (same email, same name): Google Sheets or Excel Remove Duplicates handles this fine.

For near-duplicates (name variants, missing emails, abbreviation differences): you need fuzzy matching. Standard spreadsheet tools will miss "Jennifer Walsh" and "Jen Walsh", or "Acme Corp" and "Acme Corporation".

A fuzzy matching tool for Excel and CSV does this without any setup — upload your file, match on company name and contact name together, download a clean version. It catches the variants that exact-match tools miss and lets you review duplicate clusters before committing.

Key things to catch at this step:

Same contact, two different email addresses
Same company, name written differently across rows
Same person at the same company, submitted twice from different forms
Trade show leads that duplicate contacts already in your import file
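
To show what fuzzy matching means in practice, here is a small sketch using Python's standard `difflib`. It is not how any particular tool works: the 0.7 threshold, the column names, and the rule "same email, or similar name and similar company" are all assumptions to tune. The pairwise loop is quadratic, so it suits files of a few thousand rows rather than very large lists.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity ratio between 0 and 1, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def find_near_duplicates(rows, threshold=0.7):
    """Return index pairs that share an email, or whose name AND company are both similar."""
    pairs = []
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            a, b = rows[i], rows[j]
            same_email = bool(a["email"]) and a["email"] == b["email"]
            similar = (
                similarity(a["name"], b["name"]) >= threshold
                and similarity(a["company"], b["company"]) >= threshold
            )
            if same_email or similar:
                pairs.append((i, j))
    return pairs

# Hypothetical rows: the first two are the same person written differently.
rows = [
    {"name": "Jennifer Walsh", "company": "Acme Corp", "email": "jwalsh@acme.com"},
    {"name": "Jen Walsh", "company": "Acme Corporation", "email": ""},
    {"name": "Omar Haddad", "company": "Globex", "email": "omar@globex.com"},
]
pairs = find_near_duplicates(rows)
print(pairs)  # [(0, 1)]
```

The function only reports candidate pairs. Reviewing them before merging matters, because a low threshold will also pair different people with similar names.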

5. Match Against Your Existing CRM Data

Your import file might be clean internally but still contain contacts that already exist in your CRM under a different email or slightly different name.

Export your existing CRM contacts/accounts
Compare your import file against that export — flag records that likely already exist
For matches: decide whether to update the existing record or skip the import row
For HubSpot: ensure every contact has an email address — without one, HubSpot creates a new record even if the person already exists
For Salesforce: ensure every account has a domain name or Record ID, because not every Salesforce import method deduplicates on account name alone
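
The comparison against the CRM export can be sketched as a lookup: first by email, then by name plus company for rows where the email is missing or different. The column names here are hypothetical, and the name and company comparison is exact after lowercasing; a fuzzy comparison would catch more.

```python
def match_against_crm(import_rows, crm_rows):
    """Split import rows into (likely_existing, new) by email, then by name + company."""
    by_email = {
        r["email"].strip().lower(): r for r in crm_rows if (r.get("email") or "").strip()
    }
    by_name_company = {
        (r["name"].strip().lower(), r["company"].strip().lower()): r for r in crm_rows
    }
    likely_existing, new = [], []
    for row in import_rows:
        email = (row.get("email") or "").strip().lower()
        key = (row["name"].strip().lower(), row["company"].strip().lower())
        match = by_email.get(email) if email else None
        match = match or by_name_company.get(key)
        if match:
            likely_existing.append((row, match))  # decide: update or skip
        else:
            new.append(row)
    return likely_existing, new

# Hypothetical data: Ana exists in the CRM under a different email.
crm = [{"name": "Ana Ruiz", "company": "Acme", "email": "ana@acme.com"}]
incoming = [
    {"name": "Ana Ruiz", "company": "Acme", "email": "ana.ruiz@gmail.com"},
    {"name": "Bo Li", "company": "Globex", "email": "bo@globex.com"},
]
likely_existing, new = match_against_crm(incoming, crm)
print(len(likely_existing), [r["name"] for r in new])  # 1 ['Bo Li']
```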

6. Validate Key Fields for Your CRM's Deduplication Logic

Each CRM has specific fields it uses to detect duplicates on import. If these fields are missing or wrong, you'll get duplicates regardless of how clean the rest of your data is.

HubSpot:

Every contact row has an email address (HubSpot deduplicates contacts on email only)
Every company row has a domain name (HubSpot deduplicates companies on domain only)
Domains are normalized — no "www.", no trailing slashes, consistent format

Salesforce:

Accounts have a website/domain field populated where possible
If updating existing records, include Salesforce Record ID as the unique identifier
Run a small test import (10–20 rows) first to verify deduplication behavior before the full file
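
Domain normalization is easy to get subtly wrong by hand. The sketch below is one way to reduce a website value to a bare domain using the standard library; treating "www." as the only prefix to strip is an assumption that matches the rule above.

```python
from urllib.parse import urlsplit

def normalize_domain(value):
    """Reduce a website value to a bare lowercase domain: no scheme, no 'www.', no path."""
    text = value.strip().lower()
    if not text:
        return ""
    if "://" not in text and not text.startswith("//"):
        text = "//" + text  # lets urlsplit treat the value as a host, not a path
    host = urlsplit(text).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host

print(normalize_domain("https://www.Acme.com/"))  # acme.com
print(normalize_domain("www.acme.com/about"))     # acme.com
print(normalize_domain("acme.com"))               # acme.com
```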

7. Final File Check Before Upload

Column headers match exactly what your CRM expects (or you've mapped them)
No merged cells (Excel exports sometimes carry these)
File is saved as CSV or XLSX — not ODS or Numbers format
File size is within your CRM's import limit
You have a backup of the original file before import
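
The header check can also be automated. This is a minimal sketch, assuming you keep your own list of the column names your import mapping expects; the names used here are placeholders, not the CRM's official property names.

```python
import csv
import io

def check_headers(csv_text, expected):
    """Return (missing, unexpected) column names compared with the expected list."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    found = [column.strip() for column in header]
    missing = [column for column in expected if column not in found]
    unexpected = [column for column in found if column not in expected]
    return missing, unexpected

# Placeholder expected columns (an assumption for this example).
expected_columns = ["First Name", "Last Name", "Email"]
sample = "First Name,Last Name,E-mail\nAna,Ruiz,ana@example.com\n"
missing, unexpected = check_headers(sample, expected_columns)
print(missing, unexpected)  # ['Email'] ['E-mail']
```

When reading a real file exported from Excel, opening it with `encoding="utf-8-sig"` avoids a byte order mark being glued onto the first header.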

Key Takeaways

Do it in order — formatting before deduplication, deduplication before CRM matching. Each step depends on the previous one being done.
Fuzzy matching is not optional for real-world contact data — exact-match tools miss most of the actual duplicates in lists built from multiple sources.
HubSpot and Salesforce both have hard deduplication rules — email for HubSpot contacts, domain for HubSpot companies, Record ID or domain for Salesforce. If those fields are missing, you'll get duplicates regardless of everything else.
A 20-minute clean before import beats hours of manual merging after — the Duplicate Manager in HubSpot and Salesforce shows pairs one at a time, requires manual review, and caps at 2,000–10,000 pairs depending on your plan.