Clean HubSpot data with Sanka

Use Claude or Codex to detect duplicate HubSpot companies and contacts, missing fields, lifecycle mismatches, and broken associations, then review safe fixes before they sync back to HubSpot.

Last updated: June 1, 2026

This guide shows how to keep HubSpot as the CRM while using Sanka to run continuous data cleansing. Start by asking Claude or Codex to scan HubSpot and prepare recommendations, then review and approve fixes before any record is merged, normalized, reassigned, or synced back.
Example prompt (Claude/Codex)
Scan HubSpot for duplicate companies and contacts, missing domains and key fields, wrong lifecycle stages, and broken associations. Prepare recommendations only; do not merge or update records yet.
Example response
Preparing HubSpot cleansing queue. I found duplicate candidates, records missing key fields, and ownership and lifecycle mismatches in HubSpot. Please review the recommended fixes before applying any changes.

Before you start

Check that you have the following ready.
  • HubSpot is connected to Sanka
  • HubSpot companies, contacts, and associations are visible in Sanka
  • You have decided which fields HubSpot is the source of truth for, and which come from the back office or enrichment
  • Your team has rules for merge thresholds, normalization, owner reassignment, and HubSpot writeback
  • You have permission to merge, edit, and sync HubSpot records
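
A source-of-truth decision is easier to review when it is written down as data. The following is a minimal sketch of how such a policy could be expressed and applied to one conflicting field. The field names, the system names (`hubspot`, `back_office`, `enrichment`), and the `resolve_field` helper are all hypothetical and are not Sanka or HubSpot settings.

```python
from typing import Optional

# Hypothetical policy: which system wins for each field.
# Field and system names are illustrative only.
SOURCE_OF_TRUTH = {
    "lifecycle_stage": "hubspot",
    "owner": "hubspot",
    "billing_address": "back_office",
    "employee_count": "enrichment",
}


def resolve_field(field: str, values: dict) -> tuple:
    """Return (value, decision) for one field, given each system's value."""
    winner: Optional[str] = SOURCE_OF_TRUTH.get(field)
    if winner is None:
        return None, "needs_review"  # no policy set: leave it to a reviewer
    value = values.get(winner)
    if value in (None, ""):
        return None, "needs_review"  # source of truth is empty: do not guess
    return value, "use_" + winner


decision = resolve_field(
    "billing_address",
    {"hubspot": "1 Main St", "back_office": "1 Main Street"},
)
```

Fields without a policy, or where the winning system has no value, fall through to review rather than being filled from another system.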

Ask AI to scan HubSpot

Open Claude or Codex, point it at HubSpot, and ask for a cleansing scan first. Keep this step as recommendations only so RevOps can review before anything changes.
Sample prompt
/sanka Scan HubSpot companies and contacts for duplicates, missing domains, missing key fields, wrong lifecycle stages, and broken associations. Group duplicate candidates and score each issue by revenue impact and recency. Do not change anything yet.
Ask the AI to flag:
  • Duplicate companies and contacts, grouped into candidate sets
  • Missing domains, billing fields, owners, or other required properties
  • Wrong or stale lifecycle stages and owner mismatches
  • Broken or missing associations between companies, contacts, and deals
  • Conflicts where HubSpot and the back office disagree on the same field
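
To make the first two checks in this list concrete, here is a rough sketch of how duplicate candidates and missing fields could be found in exported company records. It is an illustration of the idea, not how Sanka, Claude, or Codex detect issues. The record shape, `REQUIRED_FIELDS`, and both function names are assumptions, and matching on domain alone is deliberately simple.

```python
from collections import defaultdict

# Assumed required properties; replace with your own list.
REQUIRED_FIELDS = ("domain", "owner")


def domain_key(raw):
    """Lowercase a domain and strip the scheme, 'www.', and any path."""
    if not raw:
        return None
    d = raw.strip().lower()
    for prefix in ("https://", "http://"):
        if d.startswith(prefix):
            d = d[len(prefix):]
    d = d.split("/")[0]
    if d.startswith("www."):
        d = d[4:]
    return d or None


def scan(companies):
    """Return (duplicate candidate sets, records with missing fields)."""
    by_domain = defaultdict(list)
    missing = {}
    for c in companies:
        key = domain_key(c.get("domain"))
        if key:
            by_domain[key].append(c["id"])
        gaps = [f for f in REQUIRED_FIELDS if not c.get(f)]
        if gaps:
            missing[c["id"]] = gaps
    # Only domains shared by two or more records are duplicate candidates.
    duplicates = {k: ids for k, ids in by_domain.items() if len(ids) > 1}
    return duplicates, missing


companies = [
    {"id": "c1", "name": "Acme Inc", "domain": "https://www.acme.com/", "owner": "ana"},
    {"id": "c2", "name": "ACME", "domain": "acme.com", "owner": None},
    {"id": "c3", "name": "Globex", "domain": "", "owner": "ben"},
]
duplicates, missing = scan(companies)
```

The output is a set of candidates only. Nothing is merged or edited, which matches the recommendations-only rule for this step.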

Review the cleansing queue

Issues land in a queue scored by impact. Review each group before you let any change through.
  • Confirm the duplicate sets are genuinely the same company or contact
  • Decide which record is the survivor for each merge
  • Check that the normalization rules (company name casing, domain format, phone format) match your standard
  • Set the source-of-truth field policy for any conflicts
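
When checking normalization rules, it helps to see each rule as a before/after pair. The sketch below shows one possible form for the name and phone rules. The specific rules are assumptions: title-casing only names typed in a single case, and treating 10-digit phone numbers as national numbers with a default country code of 1. Your own standard may differ, and acronyms such as "IBM" would need an exception list.

```python
import re


def normalize_name(name):
    """Collapse whitespace; title-case names typed in all caps or all lowercase."""
    cleaned = " ".join(name.split())
    if cleaned.isupper() or cleaned.islower():
        cleaned = cleaned.title()
    return cleaned


def normalize_phone(phone, default_country_code="1"):
    """Return '+<digits>', or None when the digit count looks implausible."""
    digits = re.sub(r"\D", "", phone)
    if len(digits) == 10:  # assumed national number: prepend the default code
        digits = default_country_code + digits
    if not 11 <= len(digits) <= 15:  # rough plausibility check (E.164 max is 15)
        return None
    return "+" + digits


def propose_changes(record):
    """Return {field: (before, after)} for values a rule would change."""
    rules = {"name": normalize_name, "phone": normalize_phone}
    changes = {}
    for field, rule in rules.items():
        before = record.get(field)
        if not before:
            continue
        after = rule(before)
        if after is not None and after != before:
            changes[field] = (before, after)
    return changes


proposal = propose_changes({"name": "  ACME   CORP ", "phone": "(415) 555-0100"})
```

Returning the original value next to the proposed one lets a reviewer confirm the rule matches the team standard before anything is written.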

Resolve by rule

Once every item in the queue has been reviewed, ask the AI to apply the fixes using the rules you set, either one record at a time or in bulk.
Sample prompt
/sanka Merge the approved duplicate sets, keeping the survivor I selected, normalize company names and domains, and reassign owners by the routing rule. Keep a snapshot of each original record.
Apply these guardrails to automatic fixes:
  • Only auto-merge inside the threshold you set (for example, exact domain match plus matching billing address)
  • Anything below the threshold stays in the queue as a recommendation, not an action
  • Every merge keeps the original record snapshots so it can be reversed within the retention window
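
These three guardrails can be sketched in a few lines. The example uses the threshold mentioned above (exact domain match plus matching billing address). It is an illustration under assumed record shapes, not Sanka's merge logic; `meets_threshold`, `merge_or_recommend`, and the in-memory `snapshots` list are hypothetical.

```python
import copy


def _norm(value):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join((value or "").lower().split())


def meets_threshold(a, b):
    """Example threshold: same domain and same billing address, both non-empty."""
    for field in ("domain", "billing_address"):
        left, right = _norm(a.get(field)), _norm(b.get(field))
        if not left or left != right:
            return False
    return True


def merge_or_recommend(survivor, duplicate, snapshots):
    """Merge inside the threshold; otherwise return a recommendation only."""
    if not meets_threshold(survivor, duplicate):
        return {"action": "recommend", "ids": [survivor["id"], duplicate["id"]]}
    # Snapshot both originals first so the merge can be reversed later.
    snapshots.append(copy.deepcopy(survivor))
    snapshots.append(copy.deepcopy(duplicate))
    merged = dict(survivor)
    for field, value in duplicate.items():
        if field != "id" and not merged.get(field):
            merged[field] = value  # fill gaps only; survivor values win
    return {"action": "merged", "record": merged}


snapshots = []
a = {"id": "c1", "domain": "acme.com", "billing_address": "1 Main St", "phone": ""}
b = {"id": "c2", "domain": "ACME.com", "billing_address": "1  main st", "phone": "+14155550100"}
c = {"id": "c3", "domain": "acme.com", "billing_address": "9 Other Rd"}
merged_result = merge_or_recommend(a, b, snapshots)
queued_result = merge_or_recommend(a, c, snapshots)
```

Here the pair that shares a domain but not a billing address stays a recommendation, and no snapshot is taken for it because nothing changed.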

Write back to HubSpot with an audit trail

Approved changes sync back to HubSpot so the CRM stays clean and finance stays aligned.
Sample prompt
/sanka Apply the approved fixes and update HubSpot. Write back the merged records, normalized fields, and reassigned owners, and log the reason, reviewer, and timestamp for each change.
Check after writeback:
  • Survivor records carry the right associations and properties
  • Merged records are no longer surfaced as duplicates
  • Each change has a reason, reviewer, and timestamp in the audit trail
  • HubSpot reflects the normalized values and reassigned owners
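
The audit-trail check in this list can be automated once each change is logged in a fixed shape. The sketch below assumes a simple in-memory trail. The `AuditEntry` fields mirror the reason, reviewer, and timestamp named above, but the class and helper names are hypothetical and do not describe Sanka's audit format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    record_id: str
    field: str
    before: str
    after: str
    reason: str
    reviewer: str
    timestamp: str  # ISO 8601, UTC


def log_change(trail, record_id, field, before, after, reason, reviewer):
    """Append one change to the trail, stamped with the current UTC time."""
    entry = AuditEntry(
        record_id, field, before, after, reason, reviewer,
        datetime.now(timezone.utc).isoformat(),
    )
    trail.append(entry)
    return entry


def incomplete_entries(trail):
    """Return entries missing a reason, reviewer, or timestamp."""
    return [e for e in trail if not (e.reason and e.reviewer and e.timestamp)]


trail = []
log_change(trail, "c1", "name", "ACME CORP", "Acme Corp", "normalize casing", "ana")
log_change(trail, "c2", "owner", "ben", "ana", "routing rule", "")
flagged = incomplete_entries(trail)
```

Any entry returned by `incomplete_entries` points at a change that was written without a full reason, reviewer, and timestamp, and is worth checking by hand.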

Run it continuously

Set the cleansing loop to run on a schedule so duplicates and gaps are caught as they appear, instead of in a once-a-quarter cleanup. RevOps reviews the queue; Claude or Codex does the detection and the approved writeback.