How to Audit Your Salesforce Data Quality in 5 Steps
Feb 17, 2026
Most teams assume their Salesforce data is "pretty good." The audit usually proves otherwise.
This is not a judgment — it is a structural reality. Salesforce was built to store data. It was not built to keep that data accurate, fresh, or consistent over time. The moment records are created, they start degrading. Job titles change. Contacts switch companies. Emails go stale. Duplicate accounts accumulate because two reps entered the same company with slightly different names. Fields that were required at import get bypassed by reps in a hurry.
The gap between what leadership assumes about CRM quality and what the data actually shows is almost always significant. The audit is not about assigning blame. It is about getting a number you can act on.
Here is how to run a complete Salesforce data quality audit in a day or two, and what to do with what you find.
Why Salesforce Data Degrades Faster Than You Think
A degradation rate of roughly 2% per month is not theoretical. According to research from data providers including Dun & Bradstreet and Salesforce's own published estimates, B2B contact data decays at roughly 25–30% per year when left unmanaged. That rate accelerates during periods of economic uncertainty, layoffs, or rapid hiring — exactly the conditions that have characterized the last several years of B2B markets.
At a 25% annual decay rate, a 20,000-record CRM that was perfectly accurate on January 1 has 5,000 degraded records by December 31. Not gradually obvious — quietly broken.
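As a back-of-envelope illustration of that arithmetic (a deliberately simple, non-compounding model; the function is hypothetical, not a standard formula):

```python
# Illustrative decay projection. Assumes decay accrues as a flat
# (non-compounding) monthly fraction of the annual rate.

def degraded_records(total: int, annual_decay: float, months: int) -> int:
    """Records expected to have degraded after `months` months."""
    monthly = annual_decay / 12
    return round(total * min(monthly * months, 1.0))

print(degraded_records(20_000, 0.25, 12))  # one year at 25% -> 5000
print(degraded_records(20_000, 0.25, 6))   # six months -> 2500
```

Real decay is lumpier than this: it clusters around layoffs, reorgs, and job-change season, which is why point-in-time accuracy alone understates the problem.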
Five structural factors drive most of the degradation:
1. Rep non-compliance with data standards. Reps create records under time pressure. Required fields get entered with placeholder values ("N/A", "Unknown", "123-456-7890"). Fields that are not required get left blank entirely. Over time, a CRM that was designed with a clean data model accumulates thousands of records that technically exist but functionally do not.
2. No enrichment layer. Without an ongoing enrichment process, records only reflect what was known at the moment of creation. A contact imported from a list three years ago still has the title, company, and phone number from that list — regardless of what has changed since.
3. No deduplication rules in place. Salesforce's native duplicate detection is limited. It flags obvious matches — exact name and email — but misses records that share a domain and phone number under different name spellings. Without active deduplication logic, every import and every rep-created record adds entropy.
4. Stale enrichment from one-time imports. Many teams run a one-time enrichment — buying a ZoomInfo or Apollo batch export and importing it into Salesforce. The data is accurate at import. Within six months, it degrades to the same state as before. One-time enrichment buys time. It does not solve the problem.
5. No governance policy. Without defined field ownership, required standards, and regular review cycles, CRM hygiene defaults to nobody's job. Every team assumes someone else is managing it. Nobody is.
Understanding these root causes matters because the audit's final output is not just a score — it is a diagnosis. Knowing which of these five factors is primarily responsible for your data quality state shapes the remediation strategy.
Before You Start: What You Are Auditing For
A useful data quality audit measures four distinct dimensions. Each has its own failure modes and remediation approach, so conflating them produces an average that obscures more than it reveals.
The Four Dimensions of CRM Data Quality
The four dimensions are completeness (are the fields filled?), accuracy (are the filled values correct?), duplication (does each company and contact exist exactly once?), and staleness (how recently was the record verified?). You need numbers on all four. A CRM can be complete (all fields filled) and inaccurate (all fields wrong). It can be accurate at a point in time and stale (accurate 18 months ago, unknown since). The full picture requires all four measurements.
The 5-Step Audit
Step 1: Run a Completeness Report
Start with what Salesforce can tell you natively. Build a report — or a series of reports — that shows field population rates for the fields that matter most to your go-to-market operation.
The critical fields to measure:
Email address (primary)
Phone number (direct or mobile preferred)
Job title
Account name (associated account)
Lead source or account source
Last activity date
For each field, pull the percentage of contact records where the field is populated with a non-null, non-placeholder value. Placeholder detection requires a filter: exclude records where the field contains "N/A", "Unknown", "TBD", "000-", or similar patterns your team uses as workarounds.
How to build this in Salesforce: Go to Reports > New Report > Contacts. Add each field as a column. Use a summary report grouped by the presence or absence of each field. Alternatively, use a third-party field-analysis tool to generate a completeness matrix across your full Contact object.
What you are looking for: Any field that is below 80% populated is a material gap. Email below 90% is a serious problem. Title below 70% means your segmentation and personalization are working from guesswork.
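If you export contacts to CSV, the placeholder-aware population check can be sketched in a few lines of pandas. The column names and placeholder list below are assumptions; match them to your org's fields and your reps' actual workaround patterns.

```python
import pandas as pd

# Placeholder values to treat as "not really filled" (extend for your org).
PLACEHOLDERS = {"n/a", "unknown", "tbd", "none", "-"}

def population_rate(series: pd.Series) -> float:
    """Share of records with a non-null, non-placeholder value."""
    filled = series.dropna().astype(str).str.strip()
    real = filled[~filled.str.lower().isin(PLACEHOLDERS)]
    # Also exclude obvious fake phone patterns like "000-" prefixes.
    real = real[~real.str.startswith("000-")]
    return len(real) / len(series)

# Tiny synthetic example; replace with pd.read_csv("contacts_export.csv").
contacts = pd.DataFrame({
    "Email": ["a@acme.com", None, "N/A"],
    "Title": ["VP Sales", "Unknown", "CFO"],
})
for field in ["Email", "Title"]:
    print(f"{field}: {population_rate(contacts[field]):.0%}")
```

The point of scripting this rather than eyeballing a report is the placeholder filter: a report counts "N/A" as a populated field, and this check does not.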
Step 2: Check Accuracy
Completeness tells you what fields are filled in. Accuracy tells you whether those values are correct. This step cannot be fully automated — it requires human verification against an external source.
The method is straightforward, if time-consuming: pull a random sample of 100 contact records from your CRM. For each, open their LinkedIn profile and compare the following:
Current title (does it match the CRM record?)
Current employer (are they still at the company listed?)
Is the person still at the company at all?
Record the results: accurate, inaccurate, or no longer at company. Tally the three categories. This gives you a directional accuracy rate.
Sampling considerations:
Pull from across your record age distribution — not just recent records
Include records from different lead sources (trade show lists, web form captures, purchased lists, rep-entered data)
Weight toward records that have been in the CRM for 12+ months, where degradation is most likely
A sample of 100 is sufficient for a directional read. For a formal audit with statistical confidence, 300–500 records gives you a tighter margin. The manual work is real — this step takes three to four hours — but the accuracy rate it produces is the most important single number in the audit.
Benchmark: If more than 20% of your sampled records are inaccurate or have departed the company, your data quality problem is significant and growing.
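One way to implement the weighted sampling above is with pandas, sketched here on synthetic data. The column names and the 3x weighting toward 12-month-old records are illustrative choices, not a standard.

```python
import pandas as pd

# Synthetic contact list; swap in your contacts export with a record-age
# column derived from CreatedDate.
contacts = pd.DataFrame({
    "Id": range(1000),
    "AgeMonths": [i % 36 for i in range(1000)],  # 0-35 months old
})

# Records 12+ months old get triple the chance of being sampled,
# per the guidance to weight toward older records.
weights = contacts["AgeMonths"].apply(lambda m: 3 if m >= 12 else 1)
sample = contacts.sample(n=100, weights=weights, random_state=42)

share_old = (sample["AgeMonths"] >= 12).mean()
print(f"Sampled records 12+ months old: {share_old:.0%}")
```

Export the sampled IDs, verify each against LinkedIn by hand, and tally the three outcome categories; the script only handles the draw, not the verification.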
Step 3: Find Duplicates
Duplicate records are one of the most operationally damaging data quality issues — and one of the most systematically undercounted. Most teams know they have some duplicates. Few know how many.
Two methods to run simultaneously:
Method A: Salesforce native duplicate detection. Go to Setup > Duplicate Management > Duplicate Rules. If you do not have rules configured, configure them now for both Contacts and Accounts, using email (for contacts) and website/domain (for accounts) as matching criteria. Then build a report on Duplicate Record Sets to see the matches your rules have flagged.
Limitation: Salesforce's native detection only catches exact or near-exact matches. It misses fuzzy duplicates — records where names are spelled differently but email domains match, or where phone numbers match across records with variant company name spellings.
Method B: Domain and name matching report. For accounts, pull a report showing all account records with their associated website domain. Export to Excel or Google Sheets. Sort by domain. Any domain that appears more than once has at least one duplicate account. Investigate each cluster manually.
For contacts, pull all contacts with the same email domain and similar names. Cross-reference against LinkedIn where ambiguous.
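The account-side domain clustering can be sketched with pandas on made-up rows. The normalization rules below are a starting point, not a complete solution; real exports need more edge cases handled (subdomains, country TLDs, missing websites).

```python
import pandas as pd

# Synthetic account rows; replace with your account export.
accounts = pd.DataFrame({
    "Name": ["Acme Corp", "Acme Corporation", "Globex", "Acme, Inc."],
    "Website": ["acme.com", "www.acme.com", "globex.com", "http://acme.com/"],
})

def normalize_domain(url: str) -> str:
    """Strip scheme, 'www.', and trailing path so variants compare equal."""
    d = url.lower().removeprefix("http://").removeprefix("https://")
    return d.removeprefix("www.").split("/")[0]

accounts["Domain"] = accounts["Website"].map(normalize_domain)
# Keep only domains that appear on more than one account record.
clusters = accounts.groupby("Domain").filter(lambda g: len(g) > 1)
print(clusters.sort_values("Domain"))
```

In this toy data, all three "Acme" spellings normalize to the same domain and surface as one cluster to review, which is exactly the class of fuzzy duplicate that exact-match detection misses.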
What to look for:
Accounts with the same domain listed under different names ("Acme Corp", "Acme Corporation", "Acme, Inc.")
Contacts with the same email address on separate records (common after list imports)
Opportunities linked to duplicate accounts — these will corrupt pipeline reporting
Benchmark: A duplicate rate above 5% on accounts is a significant problem. Above 10% means your territory assignments, pipeline reporting, and forecasting are all compromised.
Step 4: Measure Staleness
Completeness and accuracy measure the quality of the data in your records. Staleness measures how recently that quality was verified. A record that was accurate 18 months ago and has not been touched since is a liability — you do not know whether it is still accurate.
How to measure staleness in Salesforce:
Build two reports:
Contacts not modified in 6+ months: Filter contacts where "Last Modified Date" is before [today minus 180 days]. Calculate the percentage of your total contact database.
Contacts not modified in 12+ months: Same filter with [today minus 365 days].
Also run this for the "Last Activity Date" field — which captures the last logged call, email, or meeting. A contact can be "modified" because a field was programmatically updated while having no actual rep engagement for years.
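Both staleness reports reduce to the same calculation, sketched here with pandas on synthetic dates. The field names assume a standard contacts export; yours may differ.

```python
import pandas as pd

# Fixed "today" so the example is reproducible; use pd.Timestamp.now() live.
today = pd.Timestamp("2026-02-17")

# Synthetic records; replace with your contacts export.
contacts = pd.DataFrame({
    "LastModifiedDate": pd.to_datetime(
        ["2026-01-10", "2025-06-01", "2024-11-20", "2025-12-05"]),
    "LastActivityDate": pd.to_datetime(
        ["2025-02-01", "2024-03-15", "2024-01-10", "2026-02-01"]),
})

def stale_rate(dates: pd.Series, days: int) -> float:
    """Share of records whose date is older than `days` days ago."""
    return (dates < today - pd.Timedelta(days=days)).mean()

for field in ["LastModifiedDate", "LastActivityDate"]:
    print(field,
          f"6mo+: {stale_rate(contacts[field], 180):.0%}",
          f"12mo+: {stale_rate(contacts[field], 365):.0%}")
```

Running the same function over both date fields makes the gap visible: records can look fresh by LastModifiedDate (automation touches them) while LastActivityDate shows no rep engagement for years.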
Reading the results:
Pay particular attention to accounts in your ICP that fall into the stale category. A stale record on a company that is not in your ICP is low priority. A stale record on a 500-person SaaS company that should be a target account is a missed opportunity.
Step 5: Identify the Source of Bad Data
The four previous steps give you a score. This step gives you a diagnosis. You cannot fix a data quality problem permanently without understanding where it originates.
Review your findings against the five root causes from earlier in this article. The pattern in your data tells you which factor is dominant.
Document the primary driver. This determines which part of the remediation strategy matters most. If rep compliance is the main problem, workflow enforcement and training matter. If stale enrichment is the main problem, you need an ongoing enrichment layer. If duplicates are concentrated around import events, you need pre-import deduplication logic.
What a "Good" Audit Result Looks Like
Not every organization is starting from the same baseline. Drawing on the thresholds from the steps above, these are the benchmarks that indicate a CRM in reasonable operational health:
Critical fields at least 80% populated, with email at 90% or higher and title at 70% or higher
Fewer than 20% of sampled records inaccurate or no longer at the company
An account duplicate rate below 5%
If you are hitting all of these benchmarks, your CRM data quality is above average and your remediation priorities are maintenance rather than transformation.
Most teams are not hitting all of these benchmarks. If you are below benchmark on accuracy and staleness — the two most consequential dimensions — and your database is more than 18 months old without ongoing enrichment, you are likely operating with a materially degraded CRM. The cost implications of that are covered in detail in our companion article on calculating CRM data quality ROI.
The Three Paths After the Audit
Once you have the numbers, you have three options. They are not equally effective.
Path 1: Manual Cleanup
The RevOps team or a data contractor goes through the CRM and corrects records. This is the right choice for very small databases (under 5,000 records) or as a one-time remediation before a major campaign launch. It is not a sustainable strategy for a database of any meaningful size. Manual cleanup treats data quality as a project, and projects end. Data degradation does not.
Path 2: Point-Solution Enrichment
You run an enrichment import through a tool like ZoomInfo, Clearbit, or Apollo. Accuracy improves significantly at the moment of import. Staleness resets to zero. Then degradation begins again. Within six months, you are back to a meaningful percentage of stale or inaccurate records — especially for contacts in high-turnover roles (SDRs, BDRs, entry-level ops).
Point solutions also do not solve the deduplication problem. They add cleaner data on top of existing records without resolving whether those records should be merged. And they require a human to initiate the refresh — they do not run autonomously.
Path 3: Continuous Automated Enrichment
The only approach that keeps data quality above the operational threshold permanently is one where enrichment, deduplication, and field updates run as an ongoing automated process — not a quarterly project. This requires an agent-based architecture where the enrichment layer is always on, not periodic.
This is the approach that matches the physics of the problem. Data degrades continuously. The system that manages it needs to run continuously.
What Lantern's CRM Cleaning Agents Do Differently
Lantern's CRM cleaning agents are built on the continuous enrichment model. Here is specifically what that means in practice:
Multi-source enrichment without vendor management. Lantern pulls from 100+ enrichment sources simultaneously. Rather than requiring you to manage separate subscriptions to ZoomInfo, Clearbit, Bombora, and LinkedIn Sales Navigator, a single agent resolves the best available data across all sources using waterfall logic — filling fields in priority order based on source confidence and recency.
Scheduled, autonomous operation. Agents run on a configured schedule — daily, weekly, or triggered by specific events (a contact's email bounces, a company changes domain, a rep logs an activity on a stale record). No human intervention required. No ticket to open. No analyst to task.
Deduplication built into the enrichment cycle. Every enrichment run includes a deduplication pass. The agent does not just update fields on existing records — it identifies merge candidates using multi-field fuzzy matching and resolves them according to configured business rules (which record is master, how to handle conflicting field values, how to reassign opportunities and activities).
Real-time write-back to Salesforce. Updated fields, merged records, corrected ownership assignments — all changes flow back into Salesforce automatically. There is no export-import cycle. Reps see current data without taking any action.
Forward-deployed engineers, not a support queue. Lantern's engineers configure the initial agent setup and ongoing optimization in a dedicated Slack channel with your team. When your territory logic changes or a new enrichment use case emerges, the configuration is updated within hours — not weeks.
The practical result: the audit you run today produces a different result in 90 days with a Lantern agent running continuously than it does without one. The numbers improve and stay improved.
Run This Audit This Week
The audit described here takes one to two days for a RevOps analyst with Salesforce report access. The output — completeness rates, accuracy rate, duplicate count, staleness rate, and root cause diagnosis — is everything you need to have an intelligent conversation about data quality investment with your leadership team.
Most teams that run this audit are surprised by what they find. The completeness numbers are usually lower than expected. The accuracy rate from the manual sample is almost always lower than expected. The staleness rate is often higher than expected, especially on contacts associated with ICP accounts that have not been actively worked.
Run the audit. Get the numbers. Then decide what they justify.
If your numbers are above benchmark across all four dimensions, congratulations — you have a data quality program worth preserving. If they are not, the question is not whether to fix it. It is whether to fix it once or fix it permanently.
Talk to Lantern About Your Results
Run this audit this week. If you do not like what you find, let's talk about what a Lantern agent would do with those records.
We will show you specifically — using your data — what continuous enrichment, deduplication, and write-back would produce over 90 days. No generic demos. No hypothetical case studies. Your CRM, your records, your numbers.
