Frequently asked questions

Why does Excel's Remove Duplicates miss most of my duplicates?

Because it does an exact match. Any difference — an extra space, a period, 'Inc.' vs 'Incorporated', a single typo — and Excel considers the rows different. For real contact and company data, most duplicates differ by something. Fuzzy matching exists to bridge that gap.

Is the Microsoft Fuzzy Lookup add-in still usable in 2026?

Technically yes, on Windows. It hasn't been meaningfully updated since 2011, isn't supported on Mac or Excel for the web, and struggles on files above a few thousand rows. For a one-off small file it's fine; for anything regular, it's not the right tool.

Doesn't Power Query in Excel already do fuzzy matching?

It does, but it's a fuzzy merge tool, not a fuzzy dedupe tool. Using it on a single file means joining the table to itself, which is awkward to set up, slow on tens of thousands of rows, and gives you very little insight into why pairs did or didn't match. It also can't combine multiple columns into one similarity decision the way a real dedupe tool can.

Why does fuzzy matching freeze Excel?

Because in the worst case, fuzzy matching compares every row to every other row — work that grows with the square of your dataset size. Excel runs on your laptop in a single thread; cloud tools spread the work across multiple servers and use a tuned algorithm built specifically for this job.

Can I match on more than one column at the same time?

In Clean by Similarity API, yes — pick name and company together and the tool combines them into one similarity score for the row. That's how 'Jen Walsh at Acme Corp' gets correctly identified as a duplicate of 'Jennifer Walsh at Acme Corporation' even though neither field on its own is identical.

Is it safe to upload my data to an online fuzzy matching tool?

Clean processes your file in memory and deletes it after your session. Your data is never used for training, never sold, never shared. As good practice on any online tool, you can also send only the column you need to match on rather than the entire file.

Does it work with .xlsx files, or do I have to convert to CSV?

Drop the .xlsx directly — Clean parses it, handles multi-sheet workbooks, and detects headers automatically. No 'Save as CSV' step required.

Fuzzy Matching in Excel: Why It's Broken + the Fastest Fix

TL;DR

In 2026, Excel still has no built-in deduplication feature that finds near-duplicate names like "Acme Corp" and "Acme Corporation". Remove Duplicates only catches identical rows, the old Fuzzy Lookup add-in is a 2011 Windows-only relic, and Power Query's fuzzy merge bogs down on tens of thousands of rows.

The fastest reliable fix in 2026 is to drop your file into an online tool that runs the matching in the cloud. Clean by Similarity API does exactly this — free for files under 500 rows, no signup, no install, and you can tweak the results before downloading.

First, what is fuzzy matching?

Fuzzy matching (also called fuzzy lookup, near-duplicate detection, or approximate string matching) is the technique of finding records that look like the same thing but aren't typed identically. It's what lets a tool see that all of these refer to one company:

Acme Corp
Acme Corporation
ACME CORP.
acme corp

A fuzzy matcher gives every pair of records a similarity score between 0 (totally different) and 1 (identical). You set a threshold — usually somewhere between 0.80 and 0.95 — and anything above it is treated as the same entity. Used well, fuzzy matching collapses a messy spreadsheet of customers, leads, or companies into one row per real entity.
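To make the score-and-threshold idea concrete, here is a minimal Python sketch. It uses `difflib.SequenceMatcher` from the standard library as a stand-in scorer; the function names `similarity` and `same_entity` and the default threshold are assumptions for illustration, and this is not the algorithm any particular tool uses.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Score two strings from 0.0 (nothing in common) to 1.0 (identical).

    Letter case is ignored by lowercasing both sides first.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def same_entity(a: str, b: str, threshold: float = 0.85) -> bool:
    """Treat a pair as one entity when its score is at or above the threshold."""
    return similarity(a, b) >= threshold


for variant in ["Acme Corporation", "ACME CORP.", "acme corp"]:
    print(f"{variant!r}: {similarity('Acme Corp', variant):.2f}")
# 'Acme Corporation': 0.72
# 'ACME CORP.': 0.95
# 'acme corp': 1.00
```

With this particular scorer, "Acme Corporation" lands at 0.72, below the usual threshold range. That is why the choice of scorer and any cleaning done before comparison matter as much as the threshold itself.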

Here's what the problem actually looks like

Below is a tiny sample contact file. It has 12 rows, but if you look closely, they describe just 5 real people. The input file comes first, followed by what a fuzzy matcher returns.

contacts.xlsx — 12 rows, 5 real people

Input file

The same 5 people, written 12 different ways. Remove Duplicates in Excel finds zero matches — every row is technically unique.

#	name	email	company
1	Jennifer Walsh	jen.walsh@acme.com	Acme Corp
2	Jen Walsh	j.walsh@acme.com	Acme Corporation
3	JENNIFER WALSH	jennifer@acme.com	acme corp.
4	Mike O'Brien	mike@globex.io	Globex Inc.
5	Michael O'Brien	michael@globex.io	Globex, Inc
6	M. O'Brien	mob@globex.io	Globex
7	Sara Lee	sara@initech.com	Initech LLC
8	Sara Lee	sara@initech.com	Initech, LLC
9	David Kim	dkim@stark.com	Stark Industries
10	Dave Kim	d.kim@stark.com	Stark Industries Ltd.
11	Priya Patel	priya@wayne.co	Wayne Enterprises
12	Priya P.	priya.p@wayne.co	Wayne Ent.

What you get back: the clean file

One row per real person. Ready to import — 12 messy rows collapsed to 5.

name	email	company	rows merged
Jennifer Walsh	jen.walsh@acme.com	Acme Corp	3
Mike O'Brien	mike@globex.io	Globex Inc.	3
Sara Lee	sara@initech.com	Initech LLC	2
David Kim	dkim@stark.com	Stark Industries	2
Priya Patel	priya@wayne.co	Wayne Enterprises	2

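Excel sees 12 unique rows, because each row differs from every other row by at least one character in the name, email, or company. Even the two "Sara Lee" rows, which share a name and an email, differ by a comma in the company. Remove Duplicates reports that no duplicate values were found. That's the gap we're solving.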

Why can I not delete duplicates in Excel?

You can, but only the easy ones. Excel's built-in Remove Duplicates button does an exact match on cell values: it ignores letter case, but nothing else. If two rows differ by a period, a space, the word "Inc.", or a typo, Excel considers them different rows and keeps both. For real CRM exports, trade-show lists, or Apollo/ZoomInfo dumps, that means most duplicates survive.
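A rough Python model of exact-match deduplication shows why nothing is removed. It takes five rows from the sample file above and assumes a case-insensitive exact comparison, which is how Remove Duplicates is generally described; `exact_dedupe` is a hypothetical helper, not Excel's implementation.

```python
rows = [
    ("Jennifer Walsh", "jen.walsh@acme.com", "Acme Corp"),
    ("Jen Walsh", "j.walsh@acme.com", "Acme Corporation"),
    ("JENNIFER WALSH", "jennifer@acme.com", "acme corp."),
    ("Sara Lee", "sara@initech.com", "Initech LLC"),
    ("Sara Lee", "sara@initech.com", "Initech, LLC"),
]


def exact_dedupe(rows):
    """Keep the first occurrence of each row, comparing whole rows exactly
    (ignoring letter case only)."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(value.casefold() for value in row)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


unique_rows = exact_dedupe(rows)
print(len(rows), "rows in,", len(unique_rows), "rows out")
# 5 rows in, 5 rows out
```

The two "Sara Lee" rows survive because "Initech LLC" and "Initech, LLC" differ by one comma, which is enough to make the whole-row keys different.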

The real question is why Excel doesn't have a proper fuzzy match feature in 2026. There are three reasons, and they compound on each other.

1. Fuzzy matching is computationally expensive

An exact dedupe just sorts the column and walks down it — trivial work, even on a million rows. Fuzzy matching is fundamentally different: to know whether row 47 is similar to row 892, you have to compare them. In the worst case, every row has to be compared to every other row. That's N × N comparisons.

For 1,000 rows that's a million comparisons — fine. For 20,000 rows it's 400 million. For 100,000 rows it's 10 billion. Excel runs on your laptop, in a single thread, with a memory ceiling. It was simply never designed to do this kind of work, and any honest implementation inside Excel would freeze the application for minutes or hours on the kind of files people actually want to clean.
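The arithmetic is easy to check with a short Python sketch (the function names are made up for illustration). N × N counts every pair twice and also compares each row to itself; the number of distinct pairs is N × (N − 1) / 2, roughly half as many, but the growth is still quadratic.

```python
def all_pairs_comparisons(n_rows: int) -> int:
    """Every row compared with every row, as in the N x N estimate."""
    return n_rows * n_rows


def distinct_pairs(n_rows: int) -> int:
    """Each unordered pair of different rows counted once."""
    return n_rows * (n_rows - 1) // 2


for n in (1_000, 20_000, 100_000):
    print(f"{n:>7,} rows: {all_pairs_comparisons(n):>14,} N x N, "
          f"{distinct_pairs(n):>13,} distinct pairs")
```

Doubling the number of rows roughly quadruples the work under either count, which is why a file that is only a few times larger can take far longer to process.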

Modern online tools dodge this by running the matching on cloud servers with a proprietary algorithm that's tuned for this exact job — faster than most fuzzy-matching APIs and orders of magnitude faster than anything you can do inside a spreadsheet.

2. The add-ins Microsoft shipped never grew up

Microsoft did try. In 2011, they released a free Fuzzy Lookup add-in for Excel. It still exists. It still works on small files. And it has been functionally untouched for over a decade. It's Windows-only, doesn't run on Excel for Mac, doesn't run on Excel for the web, and isn't supported on M365 in any meaningful way. There's no roadmap, no updates, no official support channel. If you've tried to install it on a modern machine, you know.

The newer answer from Microsoft is Power Query's fuzzy merge, available in Excel and Power BI. It's the closest thing to a real built-in fuzzy matcher, and for small-to-medium files it works. But:

- It's a merge tool, designed to join two tables, not to deduplicate one. Using it for dedupe means joining a table to itself, which doubles the work and is genuinely awkward to set up.
- The similarity algorithm is a single knob (a threshold from 0 to 1) with almost no transparency: there's no way to see why two records matched or didn't, and no audit trail to show stakeholders.
- Performance falls off a cliff somewhere in the tens of thousands of rows. People report Power Query queries that take 20 minutes, hang, or never finish on files Clean can process in under a minute.
- It can't do multi-column matching intelligently. You can match on multiple columns, but each column has its own independent threshold rather than a combined similarity score across the row.

The third option people reach for is Python in Excel. It's powerful, but you're now writing pandas and RapidFuzz code inside a spreadsheet cell, paying for an M365 add-on, and waiting on cloud Python to spin up for every recalculation. That's not a fix for the people who are asking how to dedupe in Excel — it's a different product.

3. What actually breaks for you, in practice

When someone tells us "I can't dedupe this in Excel", they almost always mean one of these five things:

- Excel freezes or crashes. Power Query's fuzzy merge or a Python-in-Excel script hangs on a file of 30k–80k rows. The application becomes unresponsive and you eventually force-quit.
- The results are wrong in both directions. Either nothing matches (threshold too strict), or "Acme Corp" gets merged with "Acme Healthcare" (threshold too loose). And there's no obvious way to iterate.
- You can't match on multiple columns together. "Jen Walsh at Acme Corp" and "Jennifer Walsh at Acme Corporation" should clearly be one person, but neither the name nor the company is identical, and Excel can't combine them into one decision.
- You can't clean the data the way you need before matching. Stripping "Inc.", "LLC", "GmbH", "Ltd.", handling word-order differences ("Coca-Cola Company" vs "The Coca-Cola Company"), normalizing punctuation: none of this is a checkbox in Excel.
- You can't review or undo. Once Power Query has merged two records, that decision is buried in the query. No "show me the pairs you matched, with scores, before committing".

Why an online tool is the fastest reliable answer in 2026

In 2026, the bottleneck stops being "is there an algorithm that can do this" and becomes "where does the work run". An online fuzzy-matching tool fixes all three of Excel's problems at once:

- The matching runs on cloud servers, not your laptop, so a 50,000-row file finishes in seconds without freezing anything.
- Nothing to install: no add-in, no Python, no Java app, no IT ticket. Drop your file into a browser tab and you're done.
- You see the results before committing. Inspect the pairs, raise or lower the sensitivity, re-run, and only download once you're happy.

Most teams we talk to spend more time on a single failed Excel dedupe attempt than they would spend on the whole task in an online tool. If you want to try it on your own file right now, you can open Clean and drop in your CSV or Excel file — free for files under 500 rows, no signup.

How Clean does it (and how it's different)

Clean by Similarity API is built around the file-drop workflow we just described. Here's what makes it work on the kinds of files Excel chokes on.

It uses AI to figure out your specific use case

When you upload your file, Clean inspects a sample of the rows and the column you've picked, and tells you what it thinks the data is — a list of company names, a contact list with first/last names, a SKU catalog, an address book. Different data needs different matching behaviour, and Clean uses that read to pre-fill smart defaults: which sensitivity threshold is sane, whether to ignore casing, whether to strip company suffixes like Inc. and Ltd., whether word order matters. You can always override every choice — but you start from "this is probably already right" rather than from a blank threshold dial.

A proprietary algorithm faster than most APIs

The matching engine is purpose-built for this job — not a wrapper around an off-the-shelf library. It runs faster than most fuzzy-matching APIs on the market and, against local libraries like RapidFuzz or TheFuzz, the difference is even larger at scale. You don't need to know any of this to use it — you just notice that your file comes back quickly.

Flexible cleaning steps, included automatically

Before any matching happens, Clean runs an optional cleaning step on your data — small transformations that make sure the matches are exactly what you want. These are simple on/off toggles, not formulas you have to write:

- Lowercase everything so "ACME" and "acme" line up
- Strip punctuation so "O'Brien" and "OBrien" match
- Remove business suffixes: Inc., LLC, Corp., Ltd., GmbH, S.A., Pty
- Handle word-order differences so "John Smith" and "Smith, John" match
- Collapse extra whitespace and normalize accented characters

These run in memory just for the comparison — your original data is never modified.
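For readers who want to see what such cleaning steps amount to, here is a minimal Python sketch of a comparison key built from the toggles listed above. The function name `clean_key`, its parameters, and the exact rules are assumptions for illustration; Clean's own implementation is not described here. The suffix list is the one given in the list above.

```python
import re
import unicodedata

# Business suffixes from the list above, lowercased with periods removed.
SUFFIXES = {"inc", "llc", "corp", "ltd", "gmbh", "sa", "pty"}


def clean_key(text: str, strip_suffixes: bool = True,
              ignore_word_order: bool = True) -> str:
    """Return a normalized copy of `text` to compare on; the input is untouched."""
    # Split accented letters into base letter + accent mark, then drop the marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower()
    # Delete apostrophes and periods so "O'Brien" -> "obrien" and "S.A." -> "sa".
    text = re.sub(r"['’.]", "", text)
    # Turn any remaining punctuation into spaces.
    text = re.sub(r"[^\w\s]", " ", text)
    tokens = text.split()  # split() also collapses repeated whitespace
    if strip_suffixes:
        tokens = [t for t in tokens if t not in SUFFIXES]
    if ignore_word_order:
        tokens = sorted(tokens)
    return " ".join(tokens)


print(clean_key("ACME CORP."), "|", clean_key("acme corp"))
print(clean_key("Smith, John"), "|", clean_key("John Smith"))
print(clean_key("O'Brien"), "|", clean_key("OBrien"))
```

Because the function returns a new string, the original value stays as it was, which mirrors the idea of cleaning only for the comparison. Note that "Acme Corporation" would still come out as "acme corporation" here, since "Corporation" is not in the suffix list; exact keys alone do not replace similarity scoring.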

Multi-column matching as one decision

Pick more than one column to match on and Clean combines them into a single similarity score for the whole row. That's what catches "Jen Walsh / Acme Corp" as a duplicate of "Jennifer Walsh / Acme Corporation" when neither field on its own would clear the threshold.
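One simple way to turn several columns into a single row score is a weighted average of per-column scores. The Python sketch below does that with `difflib.SequenceMatcher` as a stand-in scorer. The function names and weights are hypothetical, and this is only an illustration of the combining step, not how Clean computes its score.

```python
from difflib import SequenceMatcher


def field_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity of two field values, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def row_similarity(row_a: dict, row_b: dict, weights: dict) -> float:
    """Weighted average of per-column similarities, giving one score per row pair."""
    total_weight = sum(weights.values())
    weighted = sum(
        weight * field_similarity(row_a[column], row_b[column])
        for column, weight in weights.items()
    )
    return weighted / total_weight


jen = {"name": "Jen Walsh", "company": "Acme Corp"}
jennifer = {"name": "Jennifer Walsh", "company": "Acme Corporation"}
other_jen = {"name": "Jen Walsh", "company": "Globex Inc."}
weights = {"name": 1.0, "company": 1.0}  # equal weights, chosen arbitrarily

score_same_person = row_similarity(jen, jennifer, weights)
score_other_company = row_similarity(jen, other_jen, weights)
print(round(score_same_person, 2), round(score_other_company, 2))
```

With a plain average the row score always sits between the individual column scores, so the gain here is in ranking: a pair where both columns are close outranks a pair where the name is identical but the company is unrelated. A production matcher would likely combine columns in a more sophisticated way.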

Tweak the results before you download

This is the part people miss in most tools. After Clean runs, you see the matched pairs and their similarity scores in the browser. If too many things matched, slide the threshold up and the table updates live. If borderline pairs got missed, slide it down. Toggle a cleaning step on or off and re-run. Only when the results look right do you click Download.
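Moving a threshold can be cheap if the expensive part (scoring the pairs) happens once and the threshold is only a filter over the stored scores. A minimal Python sketch of that design follows, using `difflib.SequenceMatcher` as a stand-in scorer. The function names are hypothetical, and whether Clean works this way internally is an assumption.

```python
from difflib import SequenceMatcher
from itertools import combinations


def score_all_pairs(values):
    """Score every distinct pair once; return (i, j, score), best first."""
    scored = []
    for (i, a), (j, b) in combinations(enumerate(values), 2):
        score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
        scored.append((i, j, score))
    return sorted(scored, key=lambda pair: pair[2], reverse=True)


def pairs_at(scored, threshold):
    """Filter the stored scores; no re-scoring is needed when the threshold moves."""
    return [pair for pair in scored if pair[2] >= threshold]


companies = ["Acme Corp", "ACME CORP.", "Acme Corporation", "Globex"]
scored = score_all_pairs(companies)  # the slow step, done once

for threshold in (0.9, 0.7):
    matches = [(companies[i], companies[j]) for i, j, _ in pairs_at(scored, threshold)]
    print(threshold, matches)
```

At 0.9 only the "Acme Corp" / "ACME CORP." pair is kept; lowering the threshold to 0.7 also brings in "Acme Corp" / "Acme Corporation". Changing a cleaning step, by contrast, changes the strings being compared, so the scoring has to be repeated.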

Just drop the file — no prep

No copy/paste, no "save as CSV", no header reshuffling, no removing empty rows beforehand. Drag your .xlsx, .csv, or Google Sheets export onto the page and Clean handles parsing, sheet selection, header detection, and encoding for you.

Free for small files

Files under 500 rows are completely free — no signup, no credit card, no feature paywall. Fuzzy matching is included on the free tier. Larger files have a small flat fee.

How to do it: step-by-step

1. Go to similarity-api.com/tools/deduplicate-csv-online. No signup screen. The upload box is the first thing you see.
2. Drop your Excel or CSV file. Multi-sheet .xlsx works: Clean will ask which sheet to use. UTF-8, Latin-1, and Windows encodings are all handled.
3. Pick the column (or columns) to match on. For company dedupe, that's the company-name column. For contact dedupe, pick name and company together, and Clean will combine them into one similarity score per row.
4. Glance at the suggested settings. Clean has already read a sample of your data and pre-filled the sensitivity threshold and cleaning toggles. For most company and contact files, the defaults are already right.
5. Run, then iterate. Look at the matched pairs and their similarity scores. Too aggressive? Slide the threshold up. Missing obvious matches? Slide it down, or turn on "strip business suffixes". Re-run as many times as you want; it's free during this stage.
6. Download your results. You don't pick a format: Clean generates all three from the same matching run, bundled together:
   - Clean file: one row per real entity. Drop straight into your CRM or back into Excel.
   - Flagged original: your full file, untouched, with cluster_id and is_duplicate columns added. Use when you need to merge manually or audit.
   - Review sheet: just the duplicate groups with similarity scores. Best for handing to a colleague to approve before committing.

Dedupe your messy Excel file now

Drop a .xlsx or .csv and find near-duplicates in seconds — no signup, no install, 500 rows free.

Try it for free →

Why three output formats matter

Most deduplication tools — including every Excel add-in — give you one output: a file with "the duplicates removed". That's fine for throwaway data, but it's the wrong default for anything you actually care about. Different situations call for different outputs:

- Importing a clean list into a fresh CRM → the clean file. You want one row per entity, no decisions to make.
- Cleaning an existing CRM export, then re-importing → the flagged original. You need every row preserved so the mapping back is exact, with a cluster ID so you can merge in your CRM's own tooling.
- Cleaning a list someone else owns → the review sheet. You send them only the matched pairs with scores, they confirm or reject, and nothing is committed without sign-off.

All three are generated from the same matching run, so there's no "did I match the same way both times" risk. Excel and most add-ins force you to re-run the entire operation if you want a different shape of output.
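To picture how one matching run can feed all three outputs, here is a minimal Python sketch. It groups matched pairs into clusters with a union-find structure and derives each output from the same cluster assignment. The column names `cluster_id` and `is_duplicate` come from the description above; the function names, the review-sheet fields, the example scores, and the rule that the first row of each cluster is the one kept are all assumptions for illustration.

```python
def cluster_ids(n_rows, matched_pairs):
    """Union-find: rows linked by any chain of matched pairs share one id."""
    parent = list(range(n_rows))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps lookups short
            i = parent[i]
        return i

    for a, b, _score in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Point the larger root at the smaller one, so a cluster's id is
            # always the index of its first row.
            parent[max(root_a, root_b)] = min(root_a, root_b)
    return [find(i) for i in range(n_rows)]


def build_outputs(rows, matched_pairs):
    """Derive clean file, flagged original, and review sheet from one run."""
    ids = cluster_ids(len(rows), matched_pairs)
    flagged = [
        {**row, "cluster_id": cid, "is_duplicate": i != cid}
        for i, (row, cid) in enumerate(zip(rows, ids))
    ]
    clean = [row for i, row in enumerate(rows) if ids[i] == i]
    review = [
        {"row_a": rows[a], "row_b": rows[b], "score": score, "cluster_id": ids[a]}
        for a, b, score in matched_pairs
    ]
    return clean, flagged, review


rows = [
    {"name": "Sara Lee", "company": "Initech LLC"},
    {"name": "Sara Lee", "company": "Initech, LLC"},
    {"name": "David Kim", "company": "Stark Industries"},
    {"name": "Dave Kim", "company": "Stark Industries Ltd."},
    {"name": "Priya Patel", "company": "Wayne Enterprises"},
]
# Pretend a matcher already produced these (index_a, index_b, score) pairs;
# the scores are illustrative, not measured.
matched_pairs = [(0, 1, 0.97), (2, 3, 0.88)]

clean, flagged, review = build_outputs(rows, matched_pairs)
print(len(clean), [row["cluster_id"] for row in flagged])
# 3 [0, 0, 2, 2, 4]
```

Because all three outputs read from the same list of cluster ids, they cannot disagree with each other about which rows belong together.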

Excel vs Clean — at a glance

	Excel (Remove Duplicates / Power Query)	Clean by Similarity API
Catches "Acme Corp" vs "Acme Corporation"	❌	✅
Handles tens of thousands of rows quickly	⚠ Often freezes	✅ Seconds
Multi-column similarity (one score per row)	❌	✅
Strip "Inc./LLC/Ltd." as a toggle	❌ Manual formulas	✅
See pairs + scores before committing	❌	✅
Tweak threshold and re-run instantly	❌	✅
Three output formats from one run	❌	✅
Cost (small file)	Included with Excel	Free under 500 rows
