How to Merge Duplicates across HubSpot and Salesforce While the Sync Is Active
Your sales team is using both HubSpot and Salesforce and is running into problems with duplicate records in one CRM or the other, and sometimes both.
You have a data sync set up between the two systems, but don't know how to deduplicate effectively to ensure the cleanup effort is consistent across CRMs. In addition, HubSpot doesn't allow you to deduplicate companies while the sync is active from within the HubSpot app.
With Insycle's Merge Duplicates module, you can merge duplicate people and companies in bulk and automatically (including through workflow automation right when visitors fill out a form), even when the HubSpot-Salesforce sync is active, and keep the master records syncing after the merge. You can also control the merge process by defining rules for picking the master record and the master field values (for example, retain the owner from the contact that was created first).
- Set up your HubSpot-Salesforce sync settings.
- Create a custom field in each CRM to identify the master record.
- Deduplicate your Salesforce records.
- Deduplicate your HubSpot records.
First, make sure your HubSpot and Salesforce sync settings are set up for this process to work.
In HubSpot, navigate to Settings > Integrations > Connected Apps > Salesforce "Actions" Button > Go to settings.
On the app settings page, click the Sync Settings tab, then the Salesforce → HubSpot sub-tab.
To ensure that duplicates are not automatically deleted when you merge in Salesforce, make sure the When a Salesforce [object] is deleted settings are as follows:
- When a Salesforce lead is deleted → Do nothing In HubSpot
- When a Salesforce contact is deleted → Do nothing In HubSpot
Later, you can use Insycle to merge the corresponding records in HubSpot so that the sync remains active.
To label records that are deemed the master for each set of duplicates, you'll need to create a custom field in both platforms. In each CRM, Salesforce and HubSpot, add a custom field named “Deduplication Master Record.” This needs to be added to any synced record/object type that you plan to deduplicate.
Insycle will automatically populate this field with the correct value. To prevent users from accidentally changing its value, you may want to hide this field from the default layout or make it non-editable from the view.
Add the Custom Field in Salesforce for Each Object Type
In Salesforce navigate to Setup > Objects and Fields > Object Manager. Select the object type, click Fields & Relationships, then click the New button.
Enter the following properties:
- Field Label: Deduplication Master Record
- Field Name: Deduplication_Master_Record
- API name: Deduplication_Master_Record__c
- Data type: Checkbox
Repeat these steps to add the Deduplication Master Record field to each object type synced with HubSpot that you'll need to deduplicate.
Add the Custom Field in HubSpot for Each Object Type
In HubSpot, navigate to Settings > Objects > select the object type > Manage [object] properties, and click the Create property button.
Enter the following properties:
- Label: Deduplication Master Record
- API name: deduplication_master_record
- Data type: Single checkbox
Repeat these steps to add the Deduplication Master Record field to each object type synced with Salesforce that you'll need to deduplicate.
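If you would rather script this step than click through the settings, the same property can be described as a request body. The sketch below is a minimal illustration, assuming HubSpot's v3 CRM properties endpoint (POST /crm/v3/properties/{objectType}) and its bool / booleancheckbox pairing for single checkboxes. The function name and the group name are hypothetical, and the body is only built here, not sent.

```python
import json

def master_flag_property(group_name: str) -> dict:
    """Build a property definition for the Deduplication Master Record checkbox.

    group_name is an assumption: it must name a property group that already
    exists on the target object (for example "contactinformation" on contacts).
    """
    return {
        "name": "deduplication_master_record",  # API name from the steps above
        "label": "Deduplication Master Record",
        "type": "bool",                         # assumed API type for a single checkbox
        "fieldType": "booleancheckbox",
        "groupName": group_name,
        "options": [
            {"label": "Yes", "value": "true"},
            {"label": "No", "value": "false"},
        ],
    }

payload = master_flag_property("contactinformation")
print(json.dumps(payload, indent=2))
```

Check the field names against the current HubSpot API reference before using a body like this against a live portal.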
Set Property Mapping for Deduplication Master Record Field
Next, you need to set the object settings to copy the value of the custom field from Salesforce into HubSpot (one way).
In HubSpot, navigate to Settings > Integrations > Connected Apps > Salesforce, and select the object type tab.
Click the Add new field mapping button and use the dropdown menus to select the "Deduplication Master Record" HubSpot property and the matching Salesforce field.
For the Sync Rule, select Always use Salesforce.
Follow the same process for each synced record/object type.
Now you can start the process of merging Salesforce duplicates with Insycle.
In the Merge Duplicates module, go through the deduplication process as you would if the sync wasn't in place.
In Step 1, choose the Salesforce fields and the criteria their values must meet for records to be considered duplicates.
In the example below, we are looking for Salesforce contacts with the exact same First Name AND Last Name AND Email Domain.
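To make the matching rule concrete, here is a minimal sketch of what "same First Name AND Last Name AND Email Domain" means as a grouping key. It is an illustration only, not Insycle's implementation. The record layout and function names are hypothetical, and lowercasing the domain is a choice made for this sketch.

```python
from collections import defaultdict

def email_domain(email: str) -> str:
    """Return the part after '@', lowercased; empty string if there is none."""
    return email.rsplit("@", 1)[1].lower() if "@" in email else ""

def find_duplicate_groups(records: list[dict]) -> list[list[dict]]:
    """Group records sharing First Name AND Last Name AND Email Domain."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["FirstName"], rec["LastName"], email_domain(rec["Email"]))
        groups[key].append(rec)
    # Only keys shared by more than one record form a duplicate group.
    return [g for g in groups.values() if len(g) > 1]

contacts = [
    {"Id": "003A", "FirstName": "Ana", "LastName": "Diaz", "Email": "ana@acme.com"},
    {"Id": "003B", "FirstName": "Ana", "LastName": "Diaz", "Email": "a.diaz@acme.com"},
    {"Id": "003C", "FirstName": "Ana", "LastName": "Diaz", "Email": "ana@other.org"},
]
dup_groups = find_duplicate_groups(contacts)
```

Here 003A and 003B land in one group because all three values agree, while 003C is left out because its email domain differs.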
Under Step 4, configure the rules that specify which record from each set of duplicates should become the master: the record that remains after the merge and into which all the other duplicate records are merged.
After Insycle has identified the master record, it will use the selection rules from the Field tab to automatically pick which values from a duplicate group will be used in the master record.
For each field whose data retention you want to control, tell Insycle which record the value should be taken from. That value is merged into the master. Any data that is not already in the master or copied to it is lost when the records are merged.
As part of the merge process, Insycle will automatically populate the “Deduplication Master Record” field with the value “TRUE” for the record that is chosen as the master.
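The interplay of master selection, field rules, and the master flag can be sketched in a few lines. This is a simplified model under stated assumptions, not Insycle's code: the record layout, the merge_group function, and the field_sources mapping are hypothetical names introduced for illustration.

```python
from datetime import date

def merge_group(group: list[dict], master_id: str, field_sources: dict) -> dict:
    """Merge a duplicate group into the chosen master.

    field_sources maps a field name to a function that picks the record the
    value should come from. Fields without a rule keep the master's value.
    """
    master = next(r for r in group if r["Id"] == master_id)
    merged = dict(master)
    for field, pick_source in field_sources.items():
        merged[field] = pick_source(group)[field]
    # Flag the surviving record so the other CRM can select the same master.
    merged["Deduplication_Master_Record__c"] = True
    return merged

group = [
    {"Id": "003A", "Owner": "Kim", "Phone": "", "Created": date(2021, 5, 1)},
    {"Id": "003B", "Owner": "Lee", "Phone": "555-0100", "Created": date(2020, 2, 9)},
]
rules = {
    # Retain the owner from the contact that was created first.
    "Owner": lambda recs: min(recs, key=lambda r: r["Created"]),
}
merged = merge_group(group, master_id="003A", field_sources=rules)
```

In this example the owner comes from 003B, which was created first, but the phone number from 003B is lost because no rule copies it into the master.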
For further instructions on configuring your deduplication, see the Bulk Merge Duplicate People, Companies article.
To finish deduplicating your Salesforce records, continue with the Preview Merges in CSV Report and Apply Changes to Your CRM Records steps below.
After the merge in Salesforce, the “Deduplication Master Record” values will automatically sync from Salesforce to HubSpot. This field can then be used to identify the same record as the master in HubSpot.
Under Step 1, use the same criteria as in the Salesforce deduplication to determine what HubSpot records should be considered duplicates.
Under Step 4, configure one rule: records with a Deduplication Master Record value of "True" (or "Yes," depending on the setup) should be selected as the master.
This will ensure that the master record on HubSpot aligns with the master record on Salesforce. The “Deduplication Master Record” value is available in HubSpot due to the sync.
Preview Merges in CSV Report
After you have the deduplication rules set up for each CRM, you should preview the changes you are making to your data. That way, you can check to ensure your merge configuration is working as expected before those changes are pushed to your live database.
Under Step 5, click the Review button and select Preview mode.
Click the Next button to go to the Notify screen, where you can select recipients for the email report. You can also add additional context to the message.
On the When tab, click the Run Now tab, and select which records to apply the change to (in most cases this will be All), then click the Run Now button.
Insycle will generate a preview CSV and send it to your email. Open the CSV file from your email in a spreadsheet application.
In the CSV, the Result column identifies which records were picked as the master and which were identified as duplicates and merged into the master. You'll see the values:
- Duplicate – The record is part of a duplicate group.
- Master – The master record that was chosen for the duplicate group based on your rules.
- Master (After) – The data the final record in each duplicate group will contain after the merge, based on your master selection and field data retention settings.
- Error – If Insycle was not able to determine which record would be the master, an error message will appear here. See the Troubleshooting section below for more detail.
When a field value in the CSV says "(Default)," it means that the CRM will be using its default processes for dealing with the field. This is typically done for blank fields, system IDs, and other specific situations.
If everything in the Result column looks correct, return to Insycle and move forward with applying the changes.
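For large reports, a quick tally of the Result column can help you spot problems before reading row by row. The sketch below assumes only that the CSV has a Result column with the values listed above and that any other value is an error message. The sample data, the extra Email column, and the function name are made up for illustration.

```python
import csv
import io
from collections import Counter

KNOWN_RESULTS = {"Duplicate", "Master", "Master (After)"}

def summarize_results(csv_text: str) -> Counter:
    """Count rows per Result value; anything unrecognized is counted as Error."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        result = row["Result"].strip()
        counts[result if result in KNOWN_RESULTS else "Error"] += 1
    return counts

# Tiny stand-in for a preview report; real reports have many more columns.
sample = """Result,Email
Master,ana@acme.com
Duplicate,a.diaz@acme.com
Master (After),ana@acme.com
Cannot determine master record,joe@example.com
"""
counts = summarize_results(sample)
needs_attention = counts["Error"] > 0
```

If needs_attention is true, review the error rows and adjust the master selection rules before switching to Update mode.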
Apply Changes to Your CRM Records
When you're satisfied with the results in your preview, you can merge the records in your CRM.
Under Step 5, click the Review button, and this time select Update mode.
On the When tab, you should use Run Now the first time you apply these changes to the CRM.
Save Templates and Set Up Automation to Maintain Formatting
After you've seen the results in the CRM and you are satisfied with how the operation runs, you can set up ongoing automated deduplication for HubSpot and Salesforce records with Insycle templates, or integrate with Workflows on the HubSpot side.
With automation, you'll save time and ensure that HubSpot and Salesforce are consistently deduplicated while keeping the sync active.
Deduplicating across HubSpot and Salesforce while the sync is active is a bit tricky.
When you merge duplicates, you combine the record data into a single master record. If you merge records in two platforms, you must ensure that all duplicates in both platforms are merged into the same master record. When the master record differs, this breaks the sync between the two platforms. At the same time, you need to keep all of those records synced throughout the process.
By creating a custom field synchronized across both CRMs that says, "This is the master record!" you ensure both CRMs use the same master record when the deduplication process is run in Insycle.
The CRM you deduplicate first will set the master, so you'll set up complete master selection rules in Step 4 of the Merge Duplicates module. When you run this merge duplicates operation, Insycle will set the "Deduplication Master Record" field value to "True/Yes" on the record identified as master.
When you run the merge operation on the second CRM, all you need to select the same master record is the "Deduplication Master Record" value of "True/Yes."
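The second pass can be pictured as a single lookup: within each duplicate group, the record carrying the synced flag becomes the master. The following is a minimal sketch under that assumption. The record layout and function name are hypothetical, and the flag is modeled as a Python boolean.

```python
def pick_master_by_flag(group: list[dict], flag: str = "deduplication_master_record") -> dict:
    """Return the one record in the group whose master flag is set."""
    flagged = [r for r in group if r.get(flag) is True]
    if len(flagged) != 1:
        # Zero or several flagged records means the first pass or the sync
        # has not completed for this group yet.
        raise ValueError(f"expected exactly one flagged record, found {len(flagged)}")
    return flagged[0]

hubspot_group = [
    {"id": "101", "email": "ana@acme.com", "deduplication_master_record": True},
    {"id": "102", "email": "a.diaz@acme.com", "deduplication_master_record": False},
]
hubspot_master = pick_master_by_flag(hubspot_group)
```

Because the flag was set by the first CRM's merge and copied over by the sync, both platforms end up keeping the same record.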
Pick a field that you think has some duplicate values.
Running a very simple match operation, such as First Name and Last Name only, can give you an idea of what you have, but it is too broad for reliable analysis and deduplication. There may be legitimate duplicate names: different people with the same first and last name. You need additional, unique criteria to narrow it down.
Choosing Unique Identifiers
Matching duplicates requires unique identifiers—data that is unlikely to be shared by any other record unless it is a duplicate. If you don't use unique identifiers, you are likely to identify unrelated records as duplicates and may accidentally merge them.
Many CRMs match first names, last names, and email addresses. If all of those match, or are similar, you can confidently determine that the record is a duplicate.
Other unique identifying fields that are commonly used in deduplication include:
- Phone number
- Mailing address
- ID numbers
Define what kind of likeness to look for when deciding if field values should be considered a match.
It's a good idea to start with Exact Match, and begin with easy-to-find duplicates. Iterate through fields and rules you know will surface duplicates, then look for edge cases. Similar Match can be helpful for those edge cases.
- Exact Match looks for values that match exactly, with no differences from one record to the next. Any unique identifying fields should use Exact Match.
- Similar Match looks for values that are close but differ by one character (such as a typo, an extra character, or a missing character), which broadens the search. It behaves like Google showing results for a slightly different term or asking “Did you mean...” For example, for a Company Name of “Acme,” records with the Company Name values “Akme,” “acm,” or “Acma” could be included as matches.
Be careful if using Similar Match, as the looser criteria can incorrectly identify non-duplicates as duplicates. It is best to only use Similar Match with very open and generic fields, and only after trying everything else.
Note that the ID field can only be Exact Match, never Similar Match.
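One way to picture the difference between the two modes is a check for "at most one character apart." The sketch below is an assumption about what such a comparison could look like, not Insycle's actual algorithm. It ignores case and treats a single substitution, insertion, or deletion as similar.

```python
def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one inserted, deleted, or changed character."""
    a, b = a.lower(), b.lower()
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = j = edits = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i += 1
            j += 1
            continue
        edits += 1
        if edits > 1:
            return False
        if len(a) == len(b):
            i += 1  # substitution: advance in both strings
        j += 1      # extra character in b: advance only in b
    # A leftover character at the end of b counts as one more edit.
    return edits + (len(b) - j) <= 1

similar = [name for name in ["Akme", "acm", "Acma", "Apex"] if within_one_edit("Acme", name)]
```

With these inputs, "Akme," "acm," and "Acma" pass the check while "Apex" does not, which mirrors the example above and also shows how loosely related short values can start to match.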
Specify parts of a field value to ignore, such as specific text, whitespace, or characters. These won’t be considered as part of the matching process.
- Ignore Symbols and Whitespace when comparing phone numbers.
- Ignore HTTP, www, subdomain, or top-level domain (.com vs. .co.uk) when comparing websites or email domains. This is a great way to catch more advanced duplicates.
- Insycle comes preloaded with terms to ignore. If you select Common Terms, click the Terms button to view and edit this list on the Common Terms tab.
- If you select Text (substrings), click the Terms button, then the Ignored Text tab, and enter text to be ignored. Separate multiple substrings (or phrases) with a new line.
Note: If you’ve set up Ignored terms or strings, don’t forget to also enable them. Select the Ignored > Common Terms or Text (substrings) checkbox.
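The effect of the Ignore options is easiest to see as a normalization step applied before values are compared. The helpers below are an illustrative sketch with hypothetical names. They approximate the options described above and are not Insycle's own rules.

```python
import re

def normalize_phone(value: str) -> str:
    """Drop symbols and whitespace so only letters and digits are compared."""
    return re.sub(r"[^0-9A-Za-z]+", "", value)

def normalize_website(value: str) -> str:
    """Drop the scheme, a leading 'www.', and any path before comparing."""
    host = re.sub(r"^https?://", "", value.strip().lower())
    host = re.sub(r"^www\.", "", host)
    return host.split("/", 1)[0]

def strip_terms(value: str, terms: list[str]) -> str:
    """Remove ignored words (for example 'Inc') from a company name."""
    ignored = {t.lower() for t in terms}
    words = re.findall(r"[A-Za-z0-9]+", value)
    return " ".join(w for w in words if w.lower() not in ignored).lower()

same_phone = normalize_phone("(555) 010-0199") == normalize_phone("555.010.0199")
same_site = normalize_website("https://www.Acme.com/about") == normalize_website("acme.com")
same_name = strip_terms("Acme, Inc.", ["Inc", "LLC"]) == strip_terms("ACME LLC", ["Inc", "LLC"])
```

All three comparisons come out equal after normalization, even though the raw values differ.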
Define specific portions of the field value to compare.
Compare the entire value, the first word, any two words, just the first five letters, last nine characters, etc.
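As a rough illustration, each of these comparison options amounts to taking a slice of the value before matching. The helper names below are hypothetical, and the sample values are made up.

```python
def first_word(value: str) -> str:
    """Return the first whitespace-separated word, or '' for a blank value."""
    parts = value.split()
    return parts[0] if parts else ""

def first_letters(value: str, n: int) -> str:
    """Return the first n characters."""
    return value[:n]

def last_chars(value: str, n: int) -> str:
    """Return the last n characters."""
    return value[-n:] if n > 0 else ""

# Two phone values that differ only in the country-code prefix
# agree on their last nine characters.
phones_match = last_chars("+1 5550100199", 9) == last_chars("5550100199", 9)
names_match = first_word("Initech Corp") == first_word("Initech Incorporated")
```

Comparing only part of a value widens the net, so it is worth previewing the results before merging.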
Each row in your matching fields setup is cumulative, so records must meet all of the criteria. For example, looking for records that have the same First Name, AND Last Name, AND Phone Number returns only results where all three values are the same.
To match against one field value OR another, you will need to run two different templates. For example, if you want to use fields like Phone Number OR Mobile Phone Number, you’ll run one template for Phone Number, then a second configured the same except with the Mobile Phone Number field.
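The two-template approach can be sketched as two separate AND passes over the same records. This is a simplified model: the groups_by function and the record layout are hypothetical, and skipping records with a blank matching field is an assumption of the sketch.

```python
from collections import defaultdict

def groups_by(records: list[dict], fields: list[str]) -> list[list[int]]:
    """Group record IDs whose values match on every field in `fields` (AND)."""
    buckets = defaultdict(list)
    for rec in records:
        key = tuple(rec.get(f, "") for f in fields)
        if all(key):  # skip records with a blank matching field
            buckets[key].append(rec["Id"])
    return [ids for ids in buckets.values() if len(ids) > 1]

people = [
    {"Id": 1, "LastName": "Diaz", "Phone": "5550100", "MobilePhone": ""},
    {"Id": 2, "LastName": "Diaz", "Phone": "5550100", "MobilePhone": "5550199"},
    {"Id": 3, "LastName": "Diaz", "Phone": "", "MobilePhone": "5550199"},
]
pass_one = groups_by(people, ["LastName", "Phone"])        # first template
pass_two = groups_by(people, ["LastName", "MobilePhone"])  # second template
```

The first pass pairs records 1 and 2, and the second pairs records 2 and 3. A single AND rule across both phone fields would have found neither pair.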
The searched value must have four or more characters. For example, values of “Joe” will be ignored.
When setting up your Salesforce deduplication process for contact records, it's often useful to pick master records based on engagement. For example, the highest number of email clicks, or the most recent email opened. You can also use other statuses to pick a master record such as the furthest along in your sales lifecycle, or the most recently updated record.
For accounts, it's often useful to use associated records to determine the master record. For example, the highest number of associated contacts or deals.
When setting up your HubSpot deduplication, you'll use the Deduplication Master Record field to match the Salesforce master selection.
Priority Match: Looks through the master selection rules in order, one by one. As soon as a record meets one of the criteria, Insycle makes the master selection and skips the rest of the rules on the list. The vast majority of duplicate templates should use Priority Match.
Absolute Match: The master record must meet all of the listed rules in the Record tab in Step 4. If a record does not match every rule listed, no master record will be identified. Absolute Match is appropriate for less flexible master selection.
For example, if a company wanted to ensure the chosen master record is in their sales pipeline and already has a sales rep working the record, they can choose Absolute Match and set the Record rules:
- Lifecycle Stage is lead
- Contact Owner exists
Choosing Absolute Match can often result in no master record being identified since the record has to match every rule listed, so in most cases, you should select Priority Match.
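The difference between the two modes can be sketched as follows. This is one plausible reading of the descriptions above, not Insycle's implementation. In particular, treating a rule matched by several records as a narrowing step is an assumption, and all names are hypothetical.

```python
def priority_match(group, rules):
    """Walk the rules in order; the first rule that singles out one record wins.

    Assumption: a rule matched by several records narrows the candidates,
    and the next rule acts as a tie-breaker.
    """
    candidates = list(group)
    for rule in rules:
        matching = [r for r in candidates if rule(r)]
        if len(matching) == 1:
            return matching[0]
        if matching:
            candidates = matching
    return None  # no rule produced a single master

def absolute_match(group, rules):
    """Return the single record that satisfies every rule, else None."""
    matching = [r for r in group if all(rule(r) for rule in rules)]
    return matching[0] if len(matching) == 1 else None

rules = [
    lambda r: r["lifecycle_stage"] == "lead",  # Lifecycle Stage is lead
    lambda r: bool(r["owner"]),                # Contact Owner exists
]
group = [
    {"id": "A", "lifecycle_stage": "lead", "owner": ""},
    {"id": "B", "lifecycle_stage": "subscriber", "owner": "Kim"},
]
by_priority = priority_match(group, rules)
by_absolute = absolute_match(group, rules)
```

With this group, Priority Match selects record A on the first rule, while Absolute Match finds no master because neither record satisfies both rules.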
Most of the time when Insycle can't find duplicates, it is due to your matching rules in Step 1. To better understand how to set up your rules, it is important to analyze the underlying data. A useful exercise can be to set up a simple filter to look for exact matches of Website, or Company Domain Name.
When you click the Find button, the results can show you a broad overview of what duplicates are potentially in your database, and what fields might be useful to include in your Find setup.
To get more information, click the gear button on the right side of the Step 2 header. Here, you can add any field in your database as a column to the Review Duplicates list to better understand the data inside these records.
If the Result column of the CSV displays an error, read the error text for help figuring out how to resolve the problem.
The most common error is:
Cannot determine master record because multiple records (#) satisfy the master selection rules. In ‘Master Selection’, change/add/reorder the rules such that only one record satisfies them (if cannot determine master based on field values, use ‘ID is lowest’ as the last rule).
This means that based on all the rules, Insycle could not figure out which record in the duplicate group would be the master. None of the records meet more of the rules than others.
There are a couple of things you can try to resolve this:
- Under Step 4, experiment with reordering or adding additional fields that are likely to have unique values.
- In the Step 4 heading, check to ensure that you have Priority Match selected and not Absolute Match.
With Priority Match, your master record only has to match one rule. Using Absolute Match, your master record would have to meet all of the rule criteria. The majority of the time it is best to select Priority Match.
If Priority Match was used, then none of the records in the duplicate group meet any of the criteria on the list more than the others. In this case, you'll need to experiment, reordering or adding additional rules for fields likely to have unique values.
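The "ID is lowest" suggestion in the error message works because IDs are unique, so a final rule based on them can always break a tie. A minimal sketch, assuming a hypothetical clicks field as the first rule and IDs that compare correctly as strings:

```python
def narrow(candidates, key, highest=True):
    """Keep only the records tied for the best value of `key`."""
    values = [key(r) for r in candidates]
    best = max(values) if highest else min(values)
    return [r for r in candidates if key(r) == best]

def pick_master(group):
    # Rule 1: most email clicks. Rule 2 (last): lowest ID, which is unique.
    candidates = narrow(group, key=lambda r: r["clicks"], highest=True)
    candidates = narrow(candidates, key=lambda r: r["Id"], highest=False)
    return candidates[0]

tied = [
    {"Id": "0031", "clicks": 4},
    {"Id": "0032", "clicks": 4},
    {"Id": "0030", "clicks": 1},
]
master = pick_master(tied)
```

Two records tie on clicks, so the first rule alone cannot decide. The final ID rule resolves the tie in favor of 0031.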
Tips for Bulk Merging Duplicates
- Begin with easy-to-find duplicates. Iterate through fields and rules you know will surface duplicates. Don’t expect to resolve all your duplicates by setting up and running this process once. You will need to run this process multiple times for different fields or nuanced variations.
- Each time you get a Merge Duplicates process to run the way you want in your database, save it as a template. When you have a solid set of templates that reliably resolve most of your dupes, you can put them together as a Recipe that can run on a regular, automated schedule.
- You may also need to look for edge cases that fall outside your standard rules. These may be templates you run manually so you can make adjustments based on what you find.
- Do some experimentation. Use the Preview mode and CSV report to analyze patterns in the duplicates. You may learn what is causing the duplicates and learn how to avoid having them in the first place.
Frequently Asked Questions
Yes. HubSpot doesn't let you merge companies when the sync is active. To learn more, see the article Deduplicate HubSpot Companies and Salesforce Accounts.
No. You can deduplicate either HubSpot or Salesforce records first—the "Deduplication Master Record" field will be populated automatically.
Yes. The Deduplication Master Record field is a key requirement for deduplicating across HubSpot and Salesforce without breaking the sync. Keeping the master records consistently labeled across both platforms is how you are able to keep the sync active.
No. While Insycle lets you merge records while the sync is active, within the HubSpot CRM you cannot.
Related Help Articles
- Deduplicate HubSpot Contacts, Companies, and Deals in Bulk
- HubSpot Merge Duplicates Overview
- Salesforce Merge Duplicates Overview
- Deduplicate HubSpot Companies and Salesforce Accounts
- Customize Bulk Deduplication Using Exclusions and Pre-Defined Masters
- Deduplicate Across Salesforce Leads and Contacts
- Deduplication Best Practices
Related Blog Articles
- How to Merge Duplicates in HubSpot and Salesforce and Keep them Syncing
- How Insycle Solves Common Problems with HubSpot and Salesforce Integration
- Salesforce Duplicate Management: How to Automate Salesforce Deduplication
- Data Duplication and HubSpot: Dealing With Duplicates and the Impact They Have on Your Business