Clean — by Similarity API

Compare two CSVs or Excel files in minutes

A fuzzy VLOOKUP that actually works — even when names, casing, and formatting differ.

File A — Your list

CSV, XLSX, or XLS · up to 10 MB

File B — Reference list

CSV, XLSX, or XLS · up to 10 MB

Cross-match two files — find overlaps & uniques

How It Works

How to compare two CSV & Excel files in 4 steps

Step 1

Upload

Drop your two CSV or Excel files. No signup, no install, no data stored.

Step 2

Auto-configure

Clean analyses your columns and recommends which ones to match on, how strict to be, and how to handle name variations. You can adjust before running.

Step 3

Review

See matched pairs with similarity scores before committing. You decide what to keep.

Step 4

Download Results

Get your results instantly — matched pairs, unmatched rows, and all scores.

Who Uses Clean

Built for messy Excel and CSV exports

From messy CRM exports to subscriber lists with split identities — Clean handles duplicates exact-match tools quietly miss.

E-commerce customer lists

Catches the same buyer registered under two different email addresses — something Excel's Remove Duplicates will never find.

Why Clean

Why VLOOKUP and XLOOKUP miss real matches

VLOOKUP / XLOOKUP | Power Query Fuzzy Merge | Clean
Catches "Microsoft Corp" vs "Microsoft Corporation" across two files
AI-recommended matching settings
Matched output: every File A row paired with its best match in File B + similarity score | Limited
Unique-to-File-A output: rows in File A with no match in File B (net-new, safe to import)
Annotated output: original File A with match status + similarity score added
Match across multiple columns (different column names per file OK) | Limited
Strips "Inc.", "LLC", "Corp." before comparing | Limited | ✓ toggle on/off
Works on large files (50k+ rows combined) | Times out
Browser-based: no formulas, no Power Query, no add-in

What VLOOKUP returns vs what Clean returns

File A row

Jen Walsh, Acme Corp

File B has

Jennifer Walsh, Acme Corporation

VLOOKUP / XLOOKUP → #N/A (strings differ)
Clean reconcile → Match · 0.91 · "Jennifer Walsh, Acme Corporation"
See how the reconcile tool compares to VLOOKUP →

Simple Pricing

Free for small files. Pay only for large Excel & CSV jobs.

Process up to 500 rows for free. Larger files are priced per run.

$0

Up to 500 rows

  • Fuzzy deduplication
  • Multi-column matching
  • Instant download

Most Popular

Large File

$1.99+

501 – 100,000 rows

  • Up to 3,000 rows — $1.99
  • Up to 10,000 rows — $4.99
  • Up to 25,000 rows — $9.99
  • Up to 50,000 rows — $19.99
  • Up to 100,000 rows — $29.99

Monthly Unlimited

$99.99/mo

Unlimited uploads

  • Up to 10 MB per file
  • Unlimited file upload / deduplication
  • Priority customer support
  • Cancel anytime

NEED MORE?

Interested in deduping larger files?

Our API handles millions of rows with sub-second matching, bulk uploads, and programmatic access. Or reach out and we'll walk you through a custom solution — free of charge.

Guides for matching two lists

Step-by-step articles on reconciling CRM imports, trade-show lists, and vendor exports.

FAQ

Frequently asked questions

How does the reconcile tool actually find matches when names look different?

VLOOKUP and XLOOKUP only catch matches when two values are character-for-character identical. "Microsoft Corp" and "Microsoft Corporation" are different strings, so VLOOKUP returns #N/A.

The reconcile tool compares how similar two records are, not whether they're identical. It scores every File A row against File B with a number between 0 and 1, and anything above your threshold is flagged as a match. Casing, punctuation, abbreviations, and business suffixes (Inc., LLC, Corp.) are normalised automatically before the comparison runs.
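As a rough illustration of "normalise, then score", here is a minimal sketch using Python's standard difflib. The suffix list, the function names, and the choice of difflib as the scorer are all assumptions made for this example; the reconcile tool's actual algorithm is not described on this page and will produce different scores.

```python
import re
from difflib import SequenceMatcher

# Hypothetical suffix list, for illustration only.
SUFFIXES = {"inc", "llc", "corp", "corporation", "ltd", "co"}

def normalise(value: str) -> str:
    """Lowercase, replace punctuation with spaces, strip trailing business suffixes."""
    words = re.sub(r"[^\w\s]", " ", value.lower()).split()
    # Keep at least one word so a value like "Inc." does not become empty.
    while len(words) > 1 and words[-1] in SUFFIXES:
        words.pop()
    return " ".join(words)

def similarity(a: str, b: str) -> float:
    """Score two strings between 0 and 1 after normalisation."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()

def is_match(a: str, b: str, threshold: float = 0.80) -> bool:
    # Assumption: a score equal to the threshold counts as a match.
    return similarity(a, b) >= threshold

# An exact comparison fails, the normalised comparison succeeds.
print("Microsoft Corp" == "Microsoft Corporation")        # False
print(is_match("Microsoft Corp", "Microsoft Corporation"))  # True
```

Both company names reduce to "microsoft" once the suffix is stripped, which is why the second comparison succeeds where the exact one cannot.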

Selecting a second column on each side — e.g. name + company — combines both signals into one match decision. That's how "Jen Walsh at Acme Corp" in File A correctly matches "Jennifer Walsh at Acme Corporation" in File B, even though neither column is an exact match on its own.

What's the difference between deduplicating and reconciling two lists?

Deduplication finds duplicate records within a single file — two rows in the same spreadsheet that represent the same contact or company. Reconciliation compares two separate files — checking which rows in your new list (File A) already exist in your reference list (File B), and which are genuinely new. Use Clean when you have one messy file to clean up before importing to your CRM. Use the reconcile tool when you have a new list — a trade show export, an Apollo download, a vendor list — and want to check it against an existing database before importing.
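The distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical and difflib stands in for whatever scoring Clean actually uses. The structural point is that deduplication compares rows within one list, while reconciliation compares each File A row against File B only.

```python
from difflib import SequenceMatcher
from itertools import combinations

def score(a: str, b: str) -> float:
    """Case-insensitive similarity between 0 and 1 (difflib as a stand-in)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def dedupe_pairs(rows, threshold=0.80):
    """Deduplication: compare every row with every other row in the SAME list."""
    return [(a, b) for a, b in combinations(rows, 2) if score(a, b) >= threshold]

def reconcile_pairs(file_a, file_b, threshold=0.80):
    """Reconciliation: pair each File A row with its best match in File B, if any."""
    pairs = []
    for a in file_a:
        best = max(file_b, key=lambda b: score(a, b), default=None)
        if best is not None and score(a, best) >= threshold:
            pairs.append((a, best))
    return pairs

print(dedupe_pairs(["Acme Corp", "acme corp", "Globex"]))
print(reconcile_pairs(["ACME CORP", "Initech"], ["acme corp", "Globex"]))
```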

What's the difference between the reconcile tool and VLOOKUP?

VLOOKUP only matches on exact values — "Jen Walsh" and "Jennifer Walsh" return no match. The reconcile tool scores similarity between strings, so name variants, abbreviations, and company formatting differences are all caught. The reconcile tool also matches on multiple columns simultaneously, so "Jen Walsh at Acme Corp" correctly matches "Jennifer Walsh at Acme Corporation" even though neither field is identical on its own.

What's the difference between the reconcile tool and XLOOKUP?

XLOOKUP is more flexible than VLOOKUP — it can search left or right and return cleaner errors — but the underlying match is still exact. "Acme Corp" against "Acme Corporation" is still no match. The reconcile tool replaces the exact-match step with a similarity score (0–1), so all the common variants — abbreviations, casing, punctuation, entity suffixes, and minor typos — are caught without writing wildcard formulas or helper columns.

What's the difference from Power Query's fuzzy merge?

Power Query's fuzzy merge is only available in Excel desktop on Windows, slows dramatically past a few thousand rows, offers a single similarity slider with limited control, and doesn't natively split your output into matched vs net-new. The reconcile tool runs in any browser, scales to 100,000 rows per file, lets you tune threshold and entity-suffix stripping independently, and ships three output formats out of the box: matched, unique-to-File-A (net-new), and a fully annotated copy of File A.

How do I do a fuzzy VLOOKUP between two Excel files?

Open the reconcile tool, drop both .xlsx files into the uploader, pick the column to match on in each file (the column names don't have to match — you select them independently per file), and run. The reconcile tool returns each row from File A with its best fuzzy match in File B and a similarity score. Unlike VLOOKUP, it catches "Jen Walsh" matching "Jennifer Walsh" and "Microsoft Corp" matching "Microsoft Corporation" — no formulas, no Power Query, no add-in.
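For readers who want to see the idea in code, the following is a minimal sketch of a "fuzzy VLOOKUP" over two lists of rows whose key columns have different names. It assumes rows are plain dictionaries and uses difflib for scoring; the function name fuzzy_vlookup and the sample data are hypothetical, and the scores will not equal the ones the reconcile tool reports.

```python
from difflib import SequenceMatcher

def score(a: str, b: str) -> float:
    """Similarity between 0 and 1, ignoring case and surrounding whitespace."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def fuzzy_vlookup(rows_a, rows_b, key_a, key_b):
    """For each File A row, return (row_a, best_row_b, score).

    key_a and key_b are chosen independently, so the column names
    in the two files do not need to agree.
    """
    results = []
    for row_a in rows_a:
        scored = [(score(row_a[key_a], row_b[key_b]), row_b) for row_b in rows_b]
        best_score, best_row = max(scored, key=lambda pair: pair[0], default=(0.0, None))
        results.append((row_a, best_row, round(best_score, 2)))
    return results

file_a = [{"Full Name": "Jen Walsh"}]
file_b = [{"Contact": "Jennifer Walsh"}, {"Contact": "Bob Stone"}]

for row_a, best, s in fuzzy_vlookup(file_a, file_b, "Full Name", "Contact"):
    print(row_a["Full Name"], "->", best["Contact"], s)
```

An exact lookup on "Jen Walsh" would find nothing in File B; the sketch still returns "Jennifer Walsh" as the closest candidate, together with a score you can compare against a threshold.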

Can I match contacts across two files when only one column overlaps?

Yes. The reconcile tool lets you pick matching columns independently in each file, so you can match File A's "Full Name" column against File B's "Contact" column even though the column names differ. You can also select multiple columns per file (e.g. name + company on both sides) — the tool combines the similarity across all selected columns into a single match decision, which dramatically reduces false positives versus matching on one column alone.

How do I find which contacts on a trade-show list are already in our CRM?

Export your CRM contacts to CSV, drop both your trade-show list and the CRM export into the reconcile tool, and pick the column(s) to match on (typically contact name plus company, or email). You'll get a net-new file (contacts with no match in the CRM, safe to import) and a matched file (contacts that already exist, to review or suppress before import). Fuzzy matching catches the same person spelled differently across the two systems, which is exactly the case that email-only dedupe checks miss.

Which file should be File A and which should be File B?

File A is the new list you want to check — a trade-show export, an Apollo or ZoomInfo download, a vendor or partner list, anything you're about to import. File B is your existing reference — your CRM export, customer database, or current contact list. The output is structured around File A: which rows in A already exist in B (matched, suppress before import) and which rows in A are net-new (safe to import). Get this backwards and the output won't make sense, so pick A = new list, B = source of truth.

What similarity threshold should I use?

The default is 0.80 for reconciliation — slightly higher than dedupe defaults — because a false positive here means suppressing a genuinely new contact, which is more damaging than missing a duplicate. Go higher (0.88+) if you want to be conservative. Go lower (0.75) if your data is clean and you want to catch more variants.
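To make the trade-off concrete, here is a small sketch that counts how many candidate pairs would be flagged at each of the thresholds mentioned above. The scores are invented for illustration and do not come from the tool.

```python
def matches_at(scores, threshold):
    """Return the scores that would be flagged as matches at this threshold."""
    return [s for s in scores if s >= threshold]

# Hypothetical best-match scores for five File A rows.
scores = [0.95, 0.86, 0.81, 0.77, 0.60]

for threshold in (0.75, 0.80, 0.88):
    print(threshold, len(matches_at(scores, threshold)))
# 0.75 flags 4 rows, 0.80 flags 3 rows, 0.88 flags 1 row
```

Raising the threshold suppresses fewer File A rows, so fewer genuinely new contacts are dropped by mistake, at the cost of letting more borderline duplicates through.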

What do the three output formats mean for reconciliation?

Matched file: rows from File A that matched something in File B, with the best match and similarity score added — for review or suppression before import. Unique-to-File-A file: rows from File A that had no match in File B — these are net-new, safe to import. Annotated file: every row from File A with three added columns — match status, similarity score, and the best match found — useful if you want to make your own decisions on borderline cases.
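The three outputs can be thought of as three views over the same scored rows. The sketch below assumes each File A row has already been paired with its best File B candidate and a score; the added column names (match_status, similarity_score, best_match) are hypothetical and may differ from the headers in the real download.

```python
def build_outputs(scored_rows, threshold=0.80):
    """Split scored File A rows into matched, unique-to-File-A, and annotated lists.

    scored_rows: iterable of (row_a, best_match_b, score), where row_a is a dict
    and best_match_b is None when File B offered no candidate.
    """
    matched, unique_to_a, annotated = [], [], []
    for row_a, best_b, score in scored_rows:
        is_match = best_b is not None and score >= threshold
        # Annotated: every File A row, plus three added columns.
        annotated.append({
            **row_a,
            "match_status": "matched" if is_match else "unmatched",
            "similarity_score": score,
            "best_match": best_b,
        })
        if is_match:
            matched.append({**row_a, "best_match": best_b, "similarity_score": score})
        else:
            unique_to_a.append(dict(row_a))
    return matched, unique_to_a, annotated

scored = [
    ({"name": "Jen Walsh"}, "Jennifer Walsh", 0.91),
    ({"name": "Ana Diaz"}, "Anna Dias", 0.62),
    ({"name": "Sam Roe"}, None, 0.0),
]
matched, unique_to_a, annotated = build_outputs(scored)
print(len(matched), len(unique_to_a), len(annotated))  # 1 2 3
```

Note that the annotated view keeps the best candidate even for unmatched rows, which is what makes it useful for reviewing borderline cases by hand.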

How is pricing calculated for the reconcile tool?

Pricing is based on the combined row count across both files — File A rows plus File B rows, excluding headers. For example, a 400-row trade show export checked against a 2,000-row CRM export counts as 2,400 rows total. Free for combined totals up to 500 rows with no account required. For larger combinations: $1.99 up to 3,000 rows combined, $4.99 up to 10,000 rows, $9.99 up to 25,000 rows, $19.99 up to 50,000 rows, and $29.99 up to 100,000 rows. You can preview your results before paying — payment is only required to download.
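The tier arithmetic above can be written as a short lookup. The tier limits and prices are taken from this page; the function name and the decision to return None above 100,000 combined rows are assumptions for the example.

```python
# (maximum combined rows, price in USD), as listed on this page.
TIERS = [
    (500, 0.00),
    (3_000, 1.99),
    (10_000, 4.99),
    (25_000, 9.99),
    (50_000, 19.99),
    (100_000, 29.99),
]

def price_for(rows_a: int, rows_b: int):
    """Return the per-run price for two files, counting data rows only (no headers)."""
    combined = rows_a + rows_b
    for limit, price in TIERS:
        if combined <= limit:
            return price
    return None  # Above 100,000 combined rows: not covered by per-run pricing.

print(price_for(400, 2_000))  # 1.99, since 2,400 combined rows fall in the 3,000 tier
```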

Can the reconcile tool match on multiple columns?

Yes. Select matching columns independently in each file (the column names don't need to match across the two files), and the tool combines the similarity across all selected columns into a single match decision. "Jen Walsh at Acme Corp" matching "Jennifer Walsh at Acme Corporation" works because the combined name and company similarity is strong even though neither field is an exact match on its own.
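One simple way to picture "combining similarity across columns" is to average the per-column scores. That averaging rule is an assumption made for this sketch, as are the function names, the column names, and the use of difflib; the page does not say how the tool weights columns.

```python
from difflib import SequenceMatcher

def field_score(a: str, b: str) -> float:
    """Case-insensitive similarity between 0 and 1 for a single column."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def combined_score(row_a, row_b, column_pairs):
    """Average the similarity over (File A column, File B column) pairs."""
    scores = [field_score(row_a[col_a], row_b[col_b]) for col_a, col_b in column_pairs]
    return sum(scores) / len(scores)

row_a = {"Full Name": "Jen Walsh", "Company": "Acme Corp"}
same_person = {"Contact": "Jennifer Walsh", "Org": "Acme Corporation"}
lookalike = {"Contact": "Jen Welsh", "Org": "Globex Ltd"}

pairs = [("Full Name", "Contact"), ("Company", "Org")]
print(combined_score(row_a, same_person, pairs) > combined_score(row_a, lookalike, pairs))  # True
```

On the name column alone, the lookalike "Jen Welsh" scores higher than "Jennifer Walsh" in this sketch. Adding the company column reverses the ranking, which illustrates how a second column reduces false positives.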

Is my data safe to upload?

Both files are processed in memory and deleted immediately after your session. They are never written to permanent storage, never shared, and never used for any purpose other than generating your results. You can verify this in our privacy policy.

What file formats are supported?

CSV, XLSX, and XLS. Maximum 10 MB per file. If your files are larger, contact us — we can run them via the API.