You can't fix CRM data until you know what's actually broken, and most teams have never looked closely. This CRM data quality checklist is the structured look: a run through the accuracy, completeness, and consistency of your records, covering duplicates, email and phone validity, field completeness, data freshness, formatting, and ownership.
Most B2B databases score between 40% and 60% the first time they go through it. The usual reason is mundane: nobody was ever made accountable for checking, so the data quietly drifted.
Go through each item, note your findings, and use the benchmarks to see where you stand. By the end you'll have a clear picture of your data health and know exactly what needs attention.
How to use this checklist: Run the queries or reports needed to answer each question. Record the percentage or count. Compare against the benchmark to see if you're in good shape, need improvement, or have a critical problem.
📧 Email Quality
Filter contacts where Email is blank. Calculate as percentage of total contacts.
Benchmark: 95%+ should have email for B2B databases (per DAMA data quality standards)
Check your marketing platform for hard bounce percentage over the last 90 days.
Benchmark: Under 2% is good, 2-5% needs attention, over 5% is critical (according to Mailchimp's email benchmarks)
Count emails ending in gmail.com, yahoo.com, hotmail.com, etc. (for B2B databases)
Benchmark: Under 10% for B2B. Higher suggests low-quality lead sources.
Search for patterns like test@, fake@, asdf@, noemail@, or repeating characters.
Benchmark: Should be near 0%. Any significant number indicates form abuse or lazy data entry.
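The email checks above can be run against an exported contact list. The following is a minimal sketch, assuming each contact is a dict with an `email` key; the free-mail domain list, the placeholder list, and the function name `email_metrics` are illustrative assumptions, not part of any CRM's API. Bounce rate is left out because it comes from your marketing platform rather than the records themselves.

```python
import re

# Assumption: a short illustrative list of free-mail domains; extend for your market.
FREE_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "outlook.com"}
# Assumption: placeholder local parts, taken from the patterns named in the checklist.
PLACEHOLDER_LOCALS = {"test", "fake", "asdf", "noemail"}
# A local part made of one character repeated four or more times, e.g. "aaaa".
REPEATED_CHARS = re.compile(r"^(.)\1{3,}$")


def email_metrics(contacts):
    """Return missing, personal-domain, and placeholder email rates.

    All percentages use the total number of contacts as the denominator.
    """
    total = len(contacts)
    missing = personal = placeholder = 0
    for contact in contacts:
        email = (contact.get("email") or "").strip().lower()
        if not email:
            missing += 1
            continue
        local, _, domain = email.partition("@")
        if domain in FREE_DOMAINS:
            personal += 1
        if local in PLACEHOLDER_LOCALS or REPEATED_CHARS.match(local):
            placeholder += 1

    def pct(count):
        return round(100 * count / total, 1) if total else 0.0

    return {
        "missing_pct": pct(missing),
        "personal_pct": pct(personal),
        "placeholder_pct": pct(placeholder),
    }
```

Compare `missing_pct` against the 95%+ populated benchmark (so under 5% missing), `personal_pct` against the 10% line, and expect `placeholder_pct` to sit near zero.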
📱 Phone Data
Count contacts where Phone is populated. Include mobile if tracked separately.
Benchmark: 60-80% for most B2B databases. A lower rate limits outbound calling effectiveness.
Check if numbers follow a single format or are mixed (with/without country codes, dashes vs. dots, etc.)
Benchmark: All numbers should follow one standard format for click-to-dial and deduplication.
Look for numbers with wrong digit counts, obviously fake patterns (555-555-5555), or non-numeric characters.
Benchmark: Under 5%. Higher suggests data entry problems or bad list imports.
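One way to check both format consistency and validity is to normalize every number to a single format and count the ones that cannot be normalized. This is a rough sketch under a strong assumption: all numbers are North American (country code 1, ten national digits). The function names are hypothetical, and a production setup would more likely rely on a dedicated phone-parsing library.

```python
import re

COUNTRY_CODE = "1"     # assumption: North American numbers only
NATIONAL_DIGITS = 10   # assumption: ten-digit national numbers


def normalize_phone(raw):
    """Return a '+1XXXXXXXXXX' string, or None if the number looks invalid."""
    if not raw:
        return None
    # Letters count as invalid here; split extensions into their own field first.
    if re.search(r"[A-Za-z]", raw):
        return None
    digits = re.sub(r"\D", "", raw)
    if len(digits) == NATIONAL_DIGITS:
        digits = COUNTRY_CODE + digits
    expected_length = NATIONAL_DIGITS + len(COUNTRY_CODE)
    if len(digits) != expected_length or not digits.startswith(COUNTRY_CODE):
        return None
    national = digits[len(COUNTRY_CODE):]
    # Reject obviously fake numbers made of a single repeated digit.
    if len(set(national)) == 1:
        return None
    return "+" + digits


def invalid_phone_pct(phones):
    """Percentage of populated phone values that fail normalization."""
    populated = [p for p in phones if p and p.strip()]
    if not populated:
        return 0.0
    invalid = sum(normalize_phone(p) is None for p in populated)
    return round(100 * invalid / len(populated), 1)
```

Storing the normalized value gives you the single standard format the benchmark asks for, and `invalid_phone_pct` maps directly onto the under-5% line.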
👤 Contact Completeness
Count contacts where Title/Job Title field is populated.
Benchmark: 80%+ for effective lead scoring and personalization.
Check for variations: VP Sales, VP of Sales, Vice President Sales, Sales VP. How many unique values exist?
Benchmark: Use a controlled vocabulary or mapping. Dozens of variations for the same role break segmentation.
Count contacts with no Account association (orphaned contacts).
Benchmark: 95%+ of contacts should be linked to an account (under 5% orphaned). Orphaned contacts break ABM, routing, and reporting.
Check City, State/Province, Country fields. What percentage are complete?
Benchmark: Depends on use case. Critical if you do territory-based routing or regional marketing.
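For the title standardization check, a mapping from cleaned-up raw titles to a controlled vocabulary shows both how fragmented the field is and which values still need a mapping. A minimal sketch, assuming titles arrive as plain strings; `TITLE_MAP` is a hypothetical mapping seeded with the variations listed above, and a real one would be much larger.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary: cleaned raw title -> canonical title.
TITLE_MAP = {
    "vp sales": "VP of Sales",
    "vp of sales": "VP of Sales",
    "vice president sales": "VP of Sales",
    "vice president of sales": "VP of Sales",
    "sales vp": "VP of Sales",
}


def canonical_title(raw):
    """Lowercase, drop punctuation, collapse spaces, then look up the mapping."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", (raw or "").lower())
    key = " ".join(cleaned.split())
    return TITLE_MAP.get(key)


def title_report(titles):
    """Return (mapped, unmapped) counters for a list of raw titles."""
    mapped = Counter()
    unmapped = Counter()
    for title in titles:
        canonical = canonical_title(title)
        if canonical:
            mapped[canonical] += 1
        elif title and title.strip():
            unmapped[title.strip()] += 1
    return mapped, unmapped
```

The `unmapped` counter, sorted by frequency, is a practical to-do list: map the most common values first and the long tail shrinks quickly.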
🏢 Account/Company Data
Count accounts where Industry is blank or "Unknown".
Benchmark: 80%+ of accounts should have a real Industry value (under 20% blank or "Unknown"). Effective segmentation and routing depend on it.
Count accounts missing employee count or company size indicator.
Benchmark: 75%+ of accounts should have a size value (under 25% missing). Proper lead routing and scoring depend on it.
Find accounts with zero contacts linked. These are often junk or incomplete records.
Benchmark: Under 10%. High numbers suggest account creation without follow-through.
Search for variations: Acme, Acme Inc, Acme Inc., Acme Corporation, ACME. How fragmented is your data?
Benchmark: Each company should appear once with a consistent name format.
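To see how fragmented company names are, you can reduce each name to a comparison key and group the names that share one. The sketch below assumes names are plain strings; the suffix list and the function names are illustrative, and the approach only catches casing, punctuation, and legal-suffix differences, not misspellings or abbreviations.

```python
import re
from collections import defaultdict

# Assumption: a short illustrative list of legal suffixes to ignore.
LEGAL_SUFFIXES = {
    "inc", "incorporated", "corp", "corporation", "llc", "ltd", "co", "company",
}


def company_key(name):
    """Lowercase, strip punctuation, and drop trailing legal suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", (name or "").lower()).split()
    while words and words[-1] in LEGAL_SUFFIXES:
        words.pop()
    return " ".join(words)


def group_variants(names):
    """Return {key: sorted distinct names} for keys with more than one spelling."""
    groups = defaultdict(set)
    for name in names:
        key = company_key(name)
        if key:
            groups[key].add(name)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}
```

Each entry in the result is one company that currently appears under several names, which is the fragmentation this check is looking for.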
👥 Duplicates
Run duplicate detection on Email (exact match) and Name + Company (fuzzy match).
Benchmark: Under 5% duplicate rate. Over 15% indicates serious problems (per Gartner data quality research).
Run duplicate detection on Company Name (with fuzzy matching for variations) and Website domain.
Benchmark: Under 5%. Duplicate accounts cause rep conflicts and broken reporting.
Compare Lead emails against Contact emails. How many leads exist as contacts?
Benchmark: Leads should convert to Contacts, not coexist. Any matches are process failures.
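The three duplicate checks can be approximated outside the CRM as well. This is a sketch, assuming records are dicts with `id`, `name`, `company`, and `email` keys; the function names and the 0.85 similarity threshold are assumptions to tune against your own data. Fuzzy matching here uses `difflib.SequenceMatcher` from the Python standard library, which is one option among several.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def exact_email_duplicate_rate(records):
    """Percentage of records that are surplus copies of an existing email."""
    emails = [r["email"].strip().lower() for r in records if r.get("email")]
    surplus = sum(count - 1 for count in Counter(emails).values())
    return round(100 * surplus / len(records), 1) if records else 0.0


def fuzzy_pairs(records, threshold=0.85):
    """Return id pairs whose name + company strings are nearly identical."""

    def key(record):
        return f'{record.get("name", "")} {record.get("company", "")}'.lower().strip()

    pairs = []
    # Pairwise comparison is quadratic; on a large database, compare only
    # within blocks (for example, records sharing an email domain).
    for a, b in combinations(records, 2):
        if SequenceMatcher(None, key(a), key(b)).ratio() >= threshold:
            pairs.append((a["id"], b["id"]))
    return pairs


def leads_existing_as_contacts(leads, contacts):
    """Return ids of leads whose email already belongs to a contact."""
    contact_emails = {
        c["email"].strip().lower() for c in contacts if c.get("email")
    }
    return [
        lead["id"]
        for lead in leads
        if (lead.get("email") or "").strip().lower() in contact_emails
    ]
```

Treat fuzzy pairs as candidates for human review rather than automatic merges, since a similarity score cannot tell two different people with similar names apart.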
📅 Data Freshness
Filter by Last Modified Date or Last Activity Date older than 12 months.
Benchmark: Under 30%. A higher share suggests stale data that has likely decayed.
Filter by Last Activity Date older than 24 months. These records are likely 50%+ inaccurate (B2B contact data decays at approximately 22-30% annually).
Benchmark: Under 15%. These should be flagged for verification or archiving.
Check if you track enrichment dates. If not, assume firmographic data is decaying.
Benchmark: Company data should be refreshed at least annually (according to DAMA data freshness guidelines).
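The freshness checks reduce to counting records whose last activity falls before a cutoff. A minimal sketch, assuming each record is a dict holding a `datetime.date` (or `None`) under a hypothetical `last_activity` key; records without a date are reported separately rather than guessed at. The `decayed_share` helper works through the decay arithmetic under the assumption that the annual rate compounds.

```python
from datetime import date, timedelta


def staleness_report(records, today, field="last_activity"):
    """Return the share of records older than 12 and 24 months, plus undated ones."""
    cutoff_12 = today - timedelta(days=365)
    cutoff_24 = today - timedelta(days=730)
    older_12 = older_24 = undated = 0
    for record in records:
        last = record.get(field)
        if last is None:
            undated += 1
            continue
        if last < cutoff_12:
            older_12 += 1  # includes the records that are also older than 24 months
        if last < cutoff_24:
            older_24 += 1
    total = len(records)

    def pct(count):
        return round(100 * count / total, 1) if total else 0.0

    return {
        "older_12_pct": pct(older_12),
        "older_24_pct": pct(older_24),
        "undated_pct": pct(undated),
    }


def decayed_share(annual_rate, years):
    """Share of records expected to be outdated, assuming compounding decay."""
    return 1 - (1 - annual_rate) ** years
```

With the 22-30% annual figure cited above, `decayed_share(0.22, 2)` is about 0.39 and `decayed_share(0.30, 2)` is about 0.51 for a record exactly 24 months old; records older than that sit higher.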
⚙️ Process Health
Check if your CRM enforces email format, required fields, picklist values.
Benchmark: At minimum, validate email format and require key fields on record creation.
Check if your CRM warns or blocks when creating potential duplicates.
Benchmark: Duplicate rules should be active for Contacts, Leads, and Accounts.
Is there a person or team accountable for data quality metrics and maintenance?
Benchmark: Someone should own data quality as part of their defined responsibilities.
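Most CRMs configure validation through their own admin settings, but the rules themselves are simple to state. The sketch below illustrates the minimum described above (email format, required fields, picklist values) as a plain function; the field names, the industry picklist, and the deliberately loose email pattern are all assumptions for illustration, not any CRM's actual rule syntax.

```python
import re

# Loose format check: something@something.tld with no spaces. It does not
# prove the mailbox exists.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
# Hypothetical required fields and picklist values.
REQUIRED_FIELDS = ("email", "last_name", "company")
ALLOWED_INDUSTRIES = {"Software", "Manufacturing", "Healthcare", "Finance"}


def validate_new_contact(record):
    """Return a list of validation errors; an empty list means the record passes."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not str(record.get(field) or "").strip():
            errors.append(f"{field} is required")
    email = str(record.get("email") or "").strip()
    if email and not EMAIL_PATTERN.match(email):
        errors.append("email format is invalid")
    industry = record.get("industry")
    if industry is not None and industry not in ALLOWED_INDUSTRIES:
        errors.append(f"industry '{industry}' is not an allowed picklist value")
    return errors
```

Running a function like this over existing records is also a quick way to estimate how many would fail if the rules were switched on today.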
Count how many items fall outside the benchmarks. Focus on Critical items first, then Warning, then Info.
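Tallying the results can be sketched as follows. Each check carries a measured percentage where lower is better, a "good below" threshold, and optionally a "critical above" threshold; the bounce-rate and duplicate thresholds come from the benchmarks above, while the `Check` class, the status labels, and the score formula (share of checks within benchmark) are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Check:
    name: str
    value: float                            # measured percentage, lower is better
    good_below: float                       # benchmark for "good"
    critical_above: Optional[float] = None  # only set where a critical line is given

    def status(self):
        if self.value < self.good_below:
            return "good"
        if self.critical_above is not None and self.value > self.critical_above:
            return "critical"
        return "warning"


def summarize(checks):
    """Return (score, ranked) with critical items first and good items last."""
    order = {"critical": 0, "warning": 1, "good": 2}
    ranked = sorted(checks, key=lambda check: order[check.status()])
    passed = sum(check.status() == "good" for check in checks)
    score = round(100 * passed / len(checks)) if checks else 0
    return score, [(check.name, check.status()) for check in ranked]


# Example values are made up for illustration.
example = [
    Check("hard bounce rate", 3.1, good_below=2, critical_above=5),
    Check("contact duplicates", 17.0, good_below=5, critical_above=15),
    Check("invalid phones", 1.2, good_below=5),
    Check("stale over 12 months", 42.0, good_below=30),
]
score, ranked = summarize(example)
```

Completeness checks, where higher is better, can be fed in as the missing percentage (for example, 100 minus the populated rate) so that every check points the same direction.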
What To Do Next
Now that you know where your problems are, prioritize based on business impact:
Fix first: Issues affecting revenue operations, such as lead routing failures, duplicate accounts causing rep conflicts, and bounce rates damaging sender reputation.
Fix second: Completeness issues, meaning missing data that limits segmentation, scoring, or personalization.
Fix third: Standardization and formatting, which matter for long-term maintenance but are less urgent than accuracy issues.
For most companies with significant issues, a one-time cleanup project makes sense before establishing ongoing CRM hygiene. Trying to maintain a fundamentally broken database is like mopping while the faucet's running.
About the Author
Rome Thorndike is the founder of Verum. He led sales at Datajoy (acquired by Databricks) and Snapdocs and built ML algorithms at Microsoft, so most of his career has been spent either relying on CRM data or cleaning up after it.