How to Match Two Lists with Fuzzy Logic: Merging a Trade Show Export with Your CRM (No Code)
You come back from a trade show with 400 contacts. You export them from the badge scanner into a CSV. You open your CRM and think: how many of these people are already in here?
You try VLOOKUP on email. It matches 60 of them. You import the other 340 as new contacts.
Six weeks later, your sales team is calling people who closed last year. Your marketing automation is sending onboarding sequences to existing customers. Your account executive is furious.
What happened: the badge scanner captured first name, last name, and company. Half of those contacts were already in your CRM — but under a different email, or a slightly different name, or a company name formatted differently from how your sales rep entered it. VLOOKUP found the 60 exact email matches and missed the other 280 contacts who were already there.
This is the two-list matching problem — and it's distinct from deduplication. Understanding the difference is the first step to solving it.
Deduplication vs. Reconciliation: What's the Difference?
These two terms get used interchangeably, but they describe different problems requiring different approaches.
Deduplication operates on a single list. You have one file with 1,000 contacts and you want to collapse duplicates within it — "Jennifer Walsh" and "Jen Walsh" both in the same export, representing the same person. The output is a clean version of that one list.
Reconciliation (or two-list matching) operates across two lists. You have a source list (your trade show export) and a reference list (your CRM), and you want to know: which records in the source already exist in the reference? The output is a mapping — which source records match which.
| Deduplication | Reconciliation | |
|---|---|---|
| How many lists? | One | Two |
| Question being answered | Which records in this list are the same? | Which records in list A already exist in list B? |
| Output | Clustered / cleaned single list | Matched pairs + unmatched records |
| Typical scenario | Clean a contact list before import | Trade show export vs. CRM, enriched leads vs. existing accounts, vendor list vs. ERP |
Most tools do deduplication. Far fewer do reconciliation cleanly — which is why this specific problem gets solved with VLOOKUP and manual review far more often than it should.
Why VLOOKUP (and Exact Matching Generally) Fails Here
VLOOKUP, INDEX/MATCH, and similar spreadsheet formulas are exact match tools. They answer: "Does this exact string appear in this column?" If the answer is no — even by a single character — they return nothing.
For trade show data matched against CRM data, the failure modes are predictable:
- Email mismatch. Badge scanners often don't capture email — or they capture a personal email while your CRM has a work email. VLOOKUP on email misses both cases entirely.
- Name formatting differences. "Bob Chen" in the badge export vs. "Robert Chen" in the CRM. "Sarah O'Brien" vs. "Sarah OBrien" vs. "S. O'Brien". Exact match: zero results.
- Company name variants. Your sales rep entered "Acme" in the CRM. The badge scanner captured "Acme Corporation". VLOOKUP: no match.
- Missing fields. Trade show exports frequently have incomplete records — first name only, or company but no name. Exact match on incomplete data always fails.
The result is a systematic false negative problem: VLOOKUP tells you a contact isn't in your CRM when they actually are, just stored slightly differently. You import them as new. Duplicates accumulate.
Key Takeaways (So Far)
- Deduplication = finding duplicates within one list. Reconciliation = matching records across two lists. Different problems, different tools.
- VLOOKUP fails on two-list matching because it requires exact character matches — one formatting difference means no match.
- The most common failure modes: different emails, name variants, company name differences, missing fields.
- The correct tool is fuzzy matching across two files — scoring similarity between records rather than checking equality.
What Two-List Fuzzy Matching Actually Does
Instead of checking whether a string in list A is identical to a string in list B, fuzzy two-list matching scores the similarity between every candidate pair and surfaces pairs above your threshold as likely matches.
In practice, for each record in your source list (trade show export), the tool finds the most similar record in your reference list (CRM export) and returns a similarity score. You set a threshold — typically 0.80 for names and company names — and pairs above it get flagged as matches.
Here's what that looks like on a small example:
Source list (trade show export):
| # | Name | Company |
|---|---|---|
| 1 | Bob Chen | Global Partners |
| 2 | Jennifer Walsh | Acme Corp |
| 3 | Marcus Williams | Initech LLC |
| 4 | Sophie Turner | Globex Industries |
Reference list (CRM export):
| # | Name | Company |
|---|---|---|
| A | Robert Chen | Global Partners LLC |
| B | Jen Walsh | Acme Corporation |
| C | Marcus Williams | Initech |
| D | Laura Pérez | DataSoft Inc |
Fuzzy reconciliation results (threshold: 0.80):
| Source record | Best CRM match | Similarity | Status |
|---|---|---|---|
| Bob Chen / Global Partners | Robert Chen / Global Partners LLC | 0.84 | ✅ Likely match — review |
| Jennifer Walsh / Acme Corp | Jen Walsh / Acme Corporation | 0.91 | ✅ Likely match — review |
| Marcus Williams / Initech LLC | Marcus Williams / Initech | 0.96 | ✅ Strong match |
| Sophie Turner / Globex Industries | — | 0.41 | 🆕 Genuinely new |
VLOOKUP on name would have caught Marcus Williams (identical name) and missed the other three. Fuzzy reconciliation catches all three existing contacts and correctly identifies Sophie Turner as the only genuinely new lead.
The Trade Show Scenario: A Full Walkthrough
Here's the end-to-end workflow for matching a trade show export against your CRM before importing.
Step 1: Export your CRM contacts
Pull a full export of your existing contacts — at minimum, name, company, and email. This becomes your reference list. Don't filter it — you want the full database, not just recent records.
If your CRM has tens of thousands of contacts, that's fine. A properly built reconciliation tool handles the scale. (The math gets exponential with naive approaches — 400 source records × 50,000 reference records = 20 million comparisons — which is exactly why purpose-built tools exist.)
Step 2: Prepare your source file
Clean up your trade show export before matching:
- Standardize name fields — if first and last are in separate columns, combine them into a single "Full Name" column for comparison
- Remove rows with no name and no company — you can't match what isn't there
- Check for obvious formatting issues: extra spaces, all-caps, special characters from scanner encoding errors
Don't over-clean at this stage. The whole point of fuzzy matching is that it handles variation — you don't need perfect data, just workable data.
Step 3: Run the reconciliation
Upload both files to a reconciliation tool. Specify which column in the source matches which column in the reference — typically name-to-name and company-to-company.
Set your threshold. For contact names, 0.85 is a reasonable starting point. For company names, 0.80. If you're matching on both fields simultaneously, some tools let you weight them — name similarity weighted 60%, company 40%, for example.
The output will be three categories:
- Matched pairs: source records with a high-confidence match in the CRM
- Review candidates: source records with a plausible but uncertain match (just below your threshold)
- New contacts: source records with no match above your minimum threshold
Step 4: Review the candidates, then act
Don't auto-merge everything above your threshold blindly. Spot-check 15–20 matched pairs to calibrate your confidence. If you're seeing obvious errors (unrelated people scoring 0.83), raise your threshold. If you're seeing obvious matches being missed, lower it.
Then act on each category:
- Matched pairs: Don't import these as new contacts. Instead, update the existing CRM record if the new data adds anything (a phone number, their current job title, an event tag).
- Review candidates: Open a sample manually and decide. For a 400-row trade show list, this is usually 20–40 rows — 10 minutes of human judgment, not an afternoon.
- New contacts: These are your genuine net-new leads. Import them. These are the only rows that should become new CRM records.
What About Excel's Fuzzy Merge (Power Query)?
Excel's Power Query has a built-in fuzzy merge feature. You can join two tables on a column with approximate matching — it's the closest Excel gets to two-list reconciliation natively.
The limitations are real and worth knowing:
- Single column matching only. Power Query's fuzzy merge matches on one column at a time. You can't combine name similarity and company similarity into a single score. This is a significant limitation for contact data where no single field is reliable.
- No similarity score in the output. You get matched or unmatched — not the score that drove the decision. You can't see why two records were matched or tune based on how confident the match was.
- Scale limitations. For large reference lists, fuzzy merge in Power Query gets slow. A 400-row source against a 50,000-row CRM export can take 10–20 minutes, and the results are not always reliable at that scale.
- No batch export for review. The matched output goes directly into a merged table — there's no intermediate review step where you can inspect the candidate pairs before committing.
For quick checks on small, clean data, Power Query fuzzy merge is usable. For anything resembling a real trade show reconciliation workflow, its limitations will cost you more time than they save.
What to Look for in a Two-List Reconciliation Tool
Since this is a distinct problem from single-list deduplication, the features that matter are different:
- Two-file input, clearly separated. The tool needs to understand which file is the source and which is the reference — and treat them differently. A source record matching nothing is a new lead to import. A reference record matching nothing just means nobody at that company attended your event. Those are different outcomes.
- Multi-column matching. Matching on name alone produces false positives (there are many "David Lee"s in any large database). Matching on company alone is too loose. The best results come from matching on name and company together, weighted appropriately.
- Similarity score in the output. You need to be able to see the score for every matched pair — not just whether a match was found. The score is what you use to calibrate your threshold and spot-check your results.
- Three-way output: matches, candidates, new. The output should clearly separate high-confidence matches, borderline candidates, and unmatched records. Bundling these together forces more manual review, not less.
- Scale handling. If your CRM has 100,000 contacts and your trade show export has 600 rows, the tool needs to handle 60 million candidate comparisons efficiently. If it runs in a browser tab or a Google Sheet, it probably doesn't.
- Downloadable results. The output should be a CSV you can work with — not a web UI you have to re-export manually.
Other Scenarios Where Two-List Matching Solves the Same Problem
The trade show use case is the most common, but the same technique applies anywhere you're comparing a new list against an existing reference:
Enriched lead list vs. CRM
You bought a list of 2,000 target accounts from a data provider. Before importing, you want to know which ones are already customers, opportunities, or disqualified prospects. Same workflow, different source list.
Vendor list vs. ERP
Your finance team has a list of 500 vendors in a new spreadsheet. Your ERP has 3,000 supplier records. Which vendors are new, which are duplicates under slightly different names? This is exactly the reconciliation problem — applied to suppliers instead of contacts.
Product catalog matching
An incoming vendor sends you 800 SKUs. Your master catalog has 15,000 products. Which incoming SKUs already exist under a slightly different product name or code? Same technique, different domain.
Post-merger database consolidation
Two companies merge, each with their own CRM. Which contacts appear in both systems? Which accounts are the same company entered differently? Two-list fuzzy reconciliation is the core technique for any database consolidation.
The underlying problem — I have two lists that use messy text to describe the same things, and I want to know what overlaps — is one of the most common data problems in business. It just looks different depending on the domain.
Key Takeaways
- Reconciliation is not deduplication: it's matching across two lists, not cleaning one.
- VLOOKUP fails on this problem because it requires exact matches — one formatting difference returns nothing.
- A fuzzy reconciliation workflow produces three outputs: matched pairs (existing), review candidates (uncertain), and genuinely new records.
- Always review a sample of matched pairs before acting — calibrate your threshold to your data.
- The same technique applies to vendor lists, product catalogs, enriched leads, and post-merger consolidations — wherever two messy lists need to be compared.
- For large reference lists (50,000+ CRM records), use a tool that handles scale — browser-based and spreadsheet tools get slow or fail above this threshold.
Frequently Asked Questions
Match Your Next List Before It Creates Duplicates
Upload your source file and reference file, set your threshold, and get back a clear view of what's new, what already exists, and what needs review — before anything touches your CRM.
This is Part 3 of a series on fuzzy matching for non-technical teams.
Part 1: How to Find and Merge Duplicate Company Names in a Spreadsheet or CSV
Part 2: How to Deduplicate Your Contact List Before Importing to HubSpot