Clean — by Similarity API
Dedupe Excel & CSV files
with fuzzy matching
A modern fuzzy lookup for Excel and CSV files. Catches near-duplicates like "Microsoft Corp" vs "Microsoft Corporation" that exact matching misses.
Drop your file here or browse
CSV, XLSX, or XLS · up to 10 MB
How It Works
How to dedupe Excel & CSV files in 4 steps
Upload
Drop your CSV or Excel file. No signup, no install, no data stored.
Auto-configure
Clean analyses your columns and recommends which ones to match on, how strict to be, and how to handle name variations. You can adjust before running.
Review
See duplicate clusters with similarity scores before committing. You decide what to keep.
Download Results
Get your clean file instantly — unique records, flagged clusters, and all rows scored.
Who Uses Clean
Built for messy Excel and CSV exports
From messy CRM exports to subscriber lists with split identities — Clean handles duplicates exact-match tools quietly miss.
E-commerce customer lists
Catches the same buyer registered under two different email addresses — something Excel's Remove Duplicates will never find.
Why Clean
Why Excel, Sheets, and add-ons miss real duplicates
| Excel / Sheets | Sheets Add-ons | Clean | |
|---|---|---|---|
| Catches "Microsoft Corp" vs "Microsoft Corporation" | |||
| AI-recommended matching settings!! | |||
| Matches across multiple columns | Limited | ||
| Works on large files (50k+ rows) | Times out | ||
| No install or account needed | |||
| Shows duplicate clusters before deleting | Limited | ||
| Gives you 3 different results formats | |||
| Flexible data cleaning prior to fuzzy-matching | |||
| Strips "Inc.", "LLC", "Corp." before comparing | Limited | ✓ toggle on/off |
A modern replacement for Excel's Fuzzy Lookup add-in
Microsoft's Fuzzy Lookup add-in is a 2017-era Windows-only download that slows to a crawl past a few thousand rows. And Google Sheets' Remove Duplicates only catches exact-match duplicates. Clean is the browser-based alternative — same fuzzy matching, no install, no Power Query, works on Mac and Windows, and handles up to 100,000 rows per file in both .xlsx and .csv.
See how Clean compares to Fuzzy Lookup →Simple Pricing
Free for small files. Pay only for large Excel & CSV jobs.
Process up to 500 rows for free. Larger files are priced per run.
$0
Up to 500 rows
- Fuzzy deduplication
- Multi-column matching
- Instant download
Large File
$1.99+
501 – 100,000 rows
- Up to 3,000 rows — $1.99
- Up to 10,000 rows — $4.99
- Up to 25,000 rows — $9.99
- Up to 50,000 rows — $19.99
- Up to 100,000 rows — $29.99
Monthly Unlimited
$99.99/mo
Unlimited uploads
- Up to 10 MB per file
- Unlimited file upload / deduplication
- Priority customer support
- Cancel anytime
Learn more
Guides for cleaning your data
Step-by-step articles on deduplicating spreadsheets, CRM imports, and vendor exports.
FAQ
Frequently asked questions
How does Clean actually find duplicates that look different?
Clean compares how similar two records are, not whether they're identical. It gives every pair a score between 0 and 1, and anything above your threshold gets flagged as a likely duplicate. Common real-world messiness — differences in casing, punctuation, abbreviations, and business suffixes like Inc., LLC, and Corp. — is handled automatically before the comparison runs.
When you add a second column — matching on both company name and contact name together — Clean combines the similarity across both fields into a single decision. That's how "Jen Walsh at Acme Corp" and "Jennifer Walsh at Acme Corporation" get grouped as the same person, even though neither field is an exact match on its own. Two weak signals become one strong one.