HubSpot Merge Duplicates Module Overview

Duplicate data in HubSpot poses serious problems for companies of any size.

Duplicate records prevent your marketing team from effectively segmenting and personalizing your communications. Sales teams step on each other's toes and lack vital context in conversations. Support teams miss important information, and analysis and reporting are skewed.

Insycle's Merge Duplicates module helps you automatically detect redundant contacts, companies, deals, custom objects, or other object types, giving you control over how records are merged, and what field data is retained.

How It Works

The Merge Duplicates module makes it easy to identify duplicates and merge them in bulk.

Powerful matching options look at fields you specify to detect and group redundancies.

With duplicates identified, you set rules for determining the master record the duplicates will merge into—such as the first record created, record with the most email opens, last interacted with, or any other attribute. You can configure data retention rules that copy the most relevant field values into the permanent master record.

These configurations can be saved, automated, and scheduled to run at regular intervals, putting your duplicate cleanup process on autopilot.

Supported HubSpot Object Types

Insycle supports the following HubSpot object types:

  • Companies
  • Contacts
  • Deals
  • Leads
  • Line Items
  • Orders
  • Tickets
  • Custom Objects

You can select the object type you would like to work with in the top menu of the module.

merge-duplicates-hubspot-select-record-type.png

Step-by-Step Instructions

Step 1: Configure Rules to Identify Duplicates

In the Data Management > Merge Duplicates module, pick the record type, and explore the default templates for an existing solution similar to what you need.

Each row in Step 1 is for a field you want to look at for duplicates, along with some parameters on what to look for. You want to choose fields that, in combination, give a high degree of certainty that the matched records are duplicate records. See the Deduplicate HubSpot Contacts, Companies, and Deals in Bulk article for more details.

merge-duplicates-step-1-first-last-edomain.png

If you'd like to look at the data in two different fields (that contain similar data) as if it were one, you can set up Related Fields under the Advanced tab. 

step-1-advanced-tab-related-email.png

The Conditions tab provides rules that one or more of the records in a duplicate group must meet. These options let you choose fields that are required, can be empty, or specify values that must be included.

merge-duplicates-hubspot-contacts-step-1-conditions-all-6.png

Step 2: Analyze the Identified Duplicates

When two or more records represent the same entity (person, company, or other) based on your matching rules, they are clustered together into duplicate groups. Each duplicate group shows the total number of records that were identified as duplicates based on your settings. For example, if you had four records for the same person, it would count as one duplicate group with four duplicate records.

dup2.png
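
The grouping idea can be sketched in a few lines of Python. This is only an illustration of clustering on exact matches, not Insycle's actual implementation; the field names, the sample contacts, and the trim-and-lowercase normalization are all assumptions.

```python
from collections import defaultdict

def group_duplicates(records, match_fields):
    """Cluster records whose normalized match-field values are all equal."""
    groups = defaultdict(list)
    for record in records:
        # Hypothetical normalization: trim whitespace and ignore case.
        key = tuple(str(record.get(f, "")).strip().lower() for f in match_fields)
        groups[key].append(record)
    # Only keys shared by two or more records form a duplicate group.
    return [g for g in groups.values() if len(g) > 1]

# Hypothetical sample data: four records for the same person, one unrelated.
contacts = [
    {"id": 1, "first": "Ana", "last": "Diaz", "domain": "acme.com"},
    {"id": 2, "first": "ana", "last": "Diaz ", "domain": "acme.com"},
    {"id": 3, "first": "Ana", "last": "Diaz", "domain": "ACME.com"},
    {"id": 4, "first": "Ana", "last": "Diaz", "domain": "acme.com"},
    {"id": 5, "first": "Bo", "last": "Lee", "domain": "acme.com"},
]
duplicate_groups = group_duplicates(contacts, ["first", "last", "domain"])
print(len(duplicate_groups), "group with", len(duplicate_groups[0]), "records")
```

Here the four records for the same person count as one duplicate group with four duplicate records, and the fifth contact is left out because nothing else shares its key.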

Step 3: Choose Whether to Merge in Bulk or Manually

The most efficient and sustainable way to merge duplicates is in Bulk mode. This allows you to set rules for determining the master record automatically across all records in your database. You'll be able to use saved templates and recipes to repeat the process on a regular basis. 

In Manual mode, you have complete control over which records are merged together by selecting them from the Record Viewer. Manual mode should be reserved only for cases where you need a careful, controlled process. Learn more about merging duplicates in Manual mode.

dup3.png

Step 4: Set Rules for Master Record Selection and Data Retention

After selecting Bulk mode in Step 3, you need to define how all of the matching duplicate groups should be merged at scale in Step 4.

Configure Rules to Automatically Select the Master Record

First, select the matching method: Priority Match or Absolute Match. Most de-duplication operations should use Priority Match. Learn more about these options in the Deduplicate HubSpot Contacts, Companies, and Deals in Bulk article.

step-4-priority-match-no-arrow-2023-06-01.png

On the Records tab of Step 4, you define how the duplicate groups should be merged at scale by creating rules that tell Insycle how to select the record from each group to become the master. The master is the record that will remain after the merge.

For example, if you had four records representing the same contact, they would all be merged into one master record. The other three records would not exist anymore.

step-4-record-mktg-eml-high-low-w-blank-rule.png
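
To make ordered master selection rules more concrete, here is a minimal Python sketch. It assumes that rules are applied top to bottom and that each rule narrows the remaining candidates until one record is left; that reading, along with the field names and the "highest"/"lowest" rule format, is an assumption for illustration and not a description of Insycle's internals.

```python
def pick_master(group, rules):
    """Return the single master record, or None if the rules leave a tie.

    Each rule is a (field, prefer) pair where prefer is "highest" or "lowest".
    Rules are applied in order, and each one narrows the remaining candidates.
    """
    candidates = list(group)
    for field, prefer in rules:
        # Records with an empty value cannot win on this rule.
        valued = [r for r in candidates if r.get(field) is not None]
        if not valued:
            continue
        best = max if prefer == "highest" else min
        target = best(r[field] for r in valued)
        candidates = [r for r in valued if r[field] == target]
        if len(candidates) == 1:
            return candidates[0]
    # More than one record still qualifies: no master can be chosen.
    return candidates[0] if len(candidates) == 1 else None

# Hypothetical duplicate group of four contact records.
group = [
    {"id": 101, "email_opens": 12, "created": "2021-03-01"},
    {"id": 102, "email_opens": 12, "created": "2020-07-15"},
    {"id": 103, "email_opens": 3, "created": "2019-01-10"},
    {"id": 104, "email_opens": None, "created": "2018-05-02"},
]
master = pick_master(group, [("email_opens", "highest"), ("created", "lowest")])
tied = pick_master(group, [("email_opens", "highest")])
print(master["id"], tied)
```

With both rules, records 101 and 102 tie on email opens and the earlier create date breaks the tie. With only the first rule the tie remains, which is why a final rule on a value that is unique per record, such as the record ID, is a useful fallback.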

Configure Rules That Determine Values to Keep

Duplicates may be exact match versions of another record, but often there is only partial data overlap between them. When data is split between two records, both may contain unique and important information you want to keep.

Under Step 4, click the Fields tab. For each field you want to control the data retention for, you need to tell Insycle which record the data should be taken from. This is merged into the master. Any data that is not in the master or not copied to the master is removed.

merge-duplicates-hubspot-contacts-step-4-field-rules-9.png

Learn more about configuring data retention and master record selection.

Preview Report and Update CRM

Preview Merged Changes in CSV Report

Now, with the filters and master record set up, you can preview the changes you are making to your data in a CSV. That way, you can check to ensure your deduplication configuration is working as expected before those changes are pushed to your live database.

Under Step 5, click the Review button and select Preview mode.

dup7.png

On the Notify tab, select recipients and add context to the report email.

On the When tab, click the Run Now tab and select which records to apply the change to. You can choose All, but if you have a large number of records, you may want to preview a smaller batch. Then click the Run Now button.

dup8.png

Insycle will generate a preview CSV and send it to your email. Open the CSV file from your email in a spreadsheet application.

merge-duplicates-hubspot-contacts-csv.png

The Duplicate Group ID indicates which records will be merged together. 

The Status column indicates:

  • Duplicate – The record is part of a duplicate group.
  • Master – The master record chosen for the duplicate group based on default behavior and your Record rules. Review the selections in this row to determine whether the appropriate records are being chosen.
  • Master (After) – This appears only if one or more fields have been specified in Step 4 on the Fields tab. For each duplicate group, the Master (After) row shows the values the final record will contain based on your Field rules and the default behavior. 
  • Error – If Insycle is not able to determine which record would be the master, an error message will appear here. See the Troubleshooting section below for more details.
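
For large reports, a short script can do a first pass before you review the rows by hand. The sketch below uses Python's standard csv module. The column headers "Duplicate Group ID", "Status", and "Message" follow the names used in this article, but the exact header text, the extra columns, and the sample rows are assumptions, so adjust them to match your own file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical preview rows; a real report contains more columns.
sample_csv = """Duplicate Group ID,Status,Record ID,Email,Message
1,Master,101,ana@acme.com,
1,Duplicate,102,ana.diaz@acme.com,
1,Master (After),101,ana@acme.com,
2,Error,201,bo@acme.com,Change rules in Step 4 'Master Selection'.
2,Error,202,bo@acme.com,Change rules in Step 4 'Master Selection'.
"""

def summarize_preview(csv_text):
    """Count the Status values in each duplicate group of a preview report."""
    summary = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(csv_text)):
        summary[row["Duplicate Group ID"]][row["Status"]] += 1
    return summary

def groups_needing_attention(summary):
    """Return IDs of groups with an Error row or without exactly one Master."""
    return [gid for gid, counts in summary.items()
            if counts["Error"] > 0 or counts["Master"] != 1]

flagged = groups_needing_attention(summarize_preview(sample_csv))
print(flagged)
```

To run this against a downloaded report, read the file's text and pass it to summarize_preview in place of sample_csv.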

If everything looks good, return to Insycle and move forward with applying the changes.

Apply Changes to Your HubSpot Records

When you're satisfied with the results in your preview, you can apply the merge changes to HubSpot.

Under Step 5, click the Review button again, and this time select Update mode.

On the When tab, you should use Run Now the first time you apply these changes to HubSpot. If you have a large number of records, you may want to do a smaller batch to review the results in HubSpot.

merge-duplicates-step-5-review-update-run-now-selected.png

Save Template and Set Up Automation

After you've seen the results in HubSpot and you are satisfied with how the operation runs, you can save your configuration as a template and set up automation so this merge operation runs on a set schedule. If you have several templates you'd like to run together automatically, you can create a Recipe and integrate it into HubSpot Workflows.

Return to the Template menu at the top of the page and click Copy to save your configurations as a new version of the template you started with. Then click the pencil to edit your new template name.

save-template-copy-and-rename.png

Under Step 5, click the Review button, and select Update mode.

On the Notify tab, select the send option appropriate for your automation: Always send, Send when errors, or Do not email.

Add any additional recipients who should receive the CSV (hitting Enter after each address) and provide context in the message subject or body.

merge-duplicates-step-5-update-notify-tab-always-send.png

On the When tab, select Automate, and configure the frequency you'd like the template to run. When finished, click Schedule.

merge-duplicates-step-5-review-update-automate-daily.png

You can view all your scheduled automations at any time on the Operations > Automations page.

Create a Recipe and Integrate with HubSpot Workflows

When you have a solid set of templates that reliably merge your records, you can put them together into a longer, ordered sequence as a Recipe. Then, you can schedule that Recipe to run on a consistent, set schedule. Your templates will run one after another in the order that you set.

recipe-merge-duplicate-contacts.png

To add your Recipe to a HubSpot Workflow, it must be automated and set to Execute as HubSpot Workflow Action.

Learn more about integrating Insycle Recipes with HubSpot workflows.

recipe-review-update-automate-hubspot-workflow.png

Audit Trail and History

With the Activity Tracker, you have a complete audit trail and history of changes made through Insycle, including processes run in Preview mode or data syncs. At any time, you can download a CSV report that shows all the changes made in a given operation run.

Navigate to Operations > Activity Tracker, search by module, app, or template name, then click the Run ID for the operation.

activity-tracker-merge-duplicate-operation-run-id-w-arrow.png

Advanced Use Cases

Merge Duplicate Salesforce Accounts and the Corresponding HubSpot Companies When Sync is Active

Having your HubSpot and Salesforce CRMs set up to sync can make cleaning up duplicates tricky. You need to determine the appropriate “master record” to use across both CRMs and consider the merging process. Often, your settings in each platform impact how the merge takes place.

When you deduplicate accounts in Salesforce, the master is kept in sync with the original HubSpot record, indicated by a Salesforce Account ID value. However, the deduplication only takes place on Salesforce, leaving duplicate companies in HubSpot. Since HubSpot doesn't allow you to deduplicate companies while the sync is active from within the HubSpot app, you need another option.

Insycle allows you to merge duplicate HubSpot companies and Salesforce accounts while keeping things simple and your sync intact.

To learn more, see Deduplicate HubSpot Companies and Salesforce Accounts.

Deduplicate HubSpot and Salesforce Simultaneously

Your sales team is using both HubSpot and Salesforce and is running into problems with duplicate records in one CRM or the other, and sometimes both.

You have a data sync set up between the two systems but don't know how to deduplicate effectively to ensure the cleanup effort is consistent across CRMs. In addition, HubSpot doesn't allow you to deduplicate companies while the sync is active from within the HubSpot app.

With the Merge Duplicates module, you can merge duplicate people and companies in bulk and automatically, even when the HubSpot-Salesforce sync is active, and keep the master records syncing after the merge. This includes workflow automation that runs as soon as visitors fill out a form. You can also control the merge process by defining rules for picking the master record and master field values (for example, retain the owner from the contact that was created first).

To learn more, see Deduplicate HubSpot and Salesforce While Keeping the Sync Active.

Granular Control for Picking Duplicate Records

For situations where there are no common rules you can apply to identify duplicates for all or some of the records, you may need more granular control for picking records to include or exclude from the process. In these cases, you can use CSV files to customize your bulk merging, designate master records, and exclude records from deduplication. Then you can import the CSV with the Magical Import module and use the Merge Duplicates module for complete control over the final merge operation. Learn how to customize merging duplicates in bulk using a CSV.

Merging Child or Parent Companies While Retaining Associations

When deduplicating child/parent companies in HubSpot, Insycle is able to detect even the most complex company hierarchy associations, ensuring that the correct child company master records are associated with the correct parent company master records after the companies are merged.

HubSpot Merge Logic

How Data is Consolidated When Bulk Merging HubSpot Duplicates in Insycle

Contacts

  • Email: The email address from the master record becomes the primary, and the duplicate email addresses are added as additional email addresses.
  • Activities (notes, emails, tasks, etc.): Reassigned from the duplicates to the master.
  • Deals: Reassigned from the duplicates to the master.
  • Associations: Reassigned from the duplicates to the master.
  • Attachments: Reassigned from the duplicates to the master. (Note that there may be a short delay before the attachment appears in the merged record.)
  • Fields: Use the Field tab under Step 4: Master Selection to determine what data is retained in the master record on a field-by-field basis. By default, the most recently updated value becomes the present value; all other values are available in the history. See HubSpot's merge contacts help article to learn about HubSpot's default contact merging behavior.

Companies, Deals, Tickets, Custom Objects, and Line Items

  • Contacts: Reassigned from the duplicates to the master.
  • Deals: Reassigned from the duplicates to the master.
  • Associations: Reassigned from the duplicates to the master.
  • Activities (notes, emails, tasks, etc): Reassigned from the duplicates to the master.
  • Domains (applies only to Companies): Copied from the duplicates into the master and appended as secondary domains to avoid future duplicates with the same domain.
  • Attachments: Reassigned from the duplicates to the master. (Note that there may be a short delay before the attachment appears in the merged record.)
  • Fields: Use the Field tab under Step 4 to determine what data is retained in the master record on a field-by-field basis. By default, the value is retained from the master. When a value is empty in the master, it picks a non-empty value from the most recently updated duplicate.
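
The default field behavior described for companies, deals, tickets, custom objects, and line items can be sketched as follows. This is one reading of the rule above (keep the master's value, otherwise take the first non-empty value from the duplicates, newest first), written under assumptions: the field names and the ISO-formatted "updated" timestamp are hypothetical, and the code is not Insycle's or HubSpot's implementation.

```python
def merge_fields(master, duplicates, fields):
    """Keep master values; fill empty ones from the most recently updated duplicate."""
    merged = dict(master)
    # Newest first, using a hypothetical ISO-formatted "updated" timestamp.
    newest_first = sorted(duplicates, key=lambda r: r["updated"], reverse=True)
    for field in fields:
        if merged.get(field) in (None, ""):
            for dup in newest_first:
                if dup.get(field) not in (None, ""):
                    merged[field] = dup[field]
                    break
    return merged

# Hypothetical company records.
master = {"id": 1, "name": "Acme", "phone": "", "city": None, "updated": "2023-01-01"}
duplicates = [
    {"id": 2, "name": "ACME Inc", "phone": "555-0100", "city": "", "updated": "2023-05-01"},
    {"id": 3, "name": "Acme Co", "phone": "555-0199", "city": "Austin", "updated": "2023-02-01"},
]
merged = merge_fields(master, duplicates, ["name", "phone", "city"])
print(merged["name"], merged["phone"], merged["city"])
```

The name stays as it was in the master, the phone comes from the newest duplicate, and the city comes from the older duplicate because the newest one has no city.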

merge-duplicates-hubspot-contacts-step-4-field-rules.png

When in doubt about conflicting field values, include those fields in the CSV report by adding them to the Record tab in Step 4.

Adding fields to master selection rules

Troubleshooting

If you're not seeing the results you expect when merging duplicates, consider these issues:

Not all identified duplicates are merging into the master

You have duplicate records that have been identified by Insycle, but not all of them are merging into the master. Check to see how many duplicates are in the affected duplicate groups. If you have duplicate groups that contain more than five records, you may want to change the value in Skip duplicate groups with more than 5 records per group under Step 3 to make sure you can get them all.

merge-duplicates-step-3-bulk.png

This setting is intended to protect against the accidental merging of non-duplicate records if the filter in Step 1 is too broad.

"Change rules in Step 4 'Master Selection'" Message in CSV

If the Message column of the CSV report displays this text:

Change rules in Step 4 'Master Selection'. Failed to pick master record because multiple records (X) meet the selection criteria. In 'Master Selection', change, add, or reorder the rules such that only one record matches (if cannot determine master based on field values, use 'Record ID is lowest' as the last rule).

merge-duplicates-salesforce-accounts-csv-w-error.png

This means that based on all the rules, Insycle could not figure out which record in the duplicate group would be the master. None of the records meet more of the rules than others.

There are a few things you can try to resolve this:

  1. Under Step 4, on the Record tab, experiment with reordering or adding additional fields that are likely to have unique values.
  2. In the Step 4 heading, check to ensure that you have Priority Match selected and not Absolute Match.

    step-4-priority-match-w-arrow-606px.png
    With Priority Match, your master record only has to match one rule. Using Absolute Match, your master record would have to meet all of the rule criteria. The majority of the time, it is best to select Priority Match.

    If Priority Match was used, then none of the records in the duplicate group meet any of the criteria on the list more than the others. In this case, you'll need to experiment with the Record tab, reordering or adding additional rules for fields likely to have unique values.

  3. As a last resort, you can add a rule on the Record tab of Step 4 that says Record ID is lowest, or Create Date is earliest.
    merge-duplicates-hubspot-contacts-step-4-record-tab-last-resort-rules.png

Non-duplicate records are being merged together

There are a couple of things to look at that may be misidentifying records as duplicates.

First, you may need a better unique identifier. Under Step 1, if you only use fields that could correctly contain the same values in multiple records, these aren't unique identifiers. In this case, you are likely to identify unrelated records as duplicates and may accidentally merge them.

merge-duplicates-intercom-step-1-name-only.png

Unique identifiers are data that is unlikely to be shared by any other record unless it represents the same underlying entity. Fields that are commonly used in deduplication include phone numbers, email, mailing addresses, or ID numbers.

Second, this may indicate the Comparison Rule under Step 1 is too broad. Try using the Exact Match comparison rule instead of Similar Match. Similar Match looks for values that may be close but with a one-character difference (maybe a typo), broadening the search. 

Always remember to run your deduplication in Preview Mode to confirm things are working as expected before running it in Update Mode and applying the changes to your HubSpot records.

Insycle isn't finding any duplicates

Most of the time, when Insycle can't find duplicates, it is due to your matching rules in Step 1. It is important to analyze the underlying data to better understand how to set up your rules. A useful exercise can be to set up your matching filters to look for exact matches of just First Name and Last Name.

step-1-fname-Lname-only.png

When you click Find, these rules can show you a broad overview of what duplicates are potentially in your database and what fields might be useful to include in your matching fields. These settings are just for discovery and should not be used for a final merge operation; many people can have the same first and last names and are not duplicates. 

To get further context, on Step 2, click the layout gear button on the right side of the title bar. Here, you can add any field in your database as a column to the duplicate group review to better understand the data inside these records. 

dup15.png

It's taking a long time for Insycle to find duplicates

It can take a while for Insycle to find and match duplicates if the fields being used to identify them have very long values. The longer the values, the longer it takes Insycle to process the data and generate the results. This might come up when looking for matches based on long ID numbers, LinkedIn bio links, or other URLs with long strings attached (e.g., https://www.linkedin.com/in/svadin%C3%ADr-n%C4%9Bmec-1234b31a3/).

You can speed this up by limiting how much of the value Insycle looks at.

If the beginning or ending portions of the values are all unique, you can limit the comparison to the first or last several characters using the Match Parts parameter under Step 1.

merge-duplicates-linkedin-bio-step-1-match-parts-last-9-chars.png

merge-duplicates-linkedin-bio-step-2-last-9-chars.png

Or use the Ignore Text (Substrings) parameter, then click the Terms button.

merge-duplicates-linkedin-bio-step-1-ignored-text-terms-button.png

On the Ignored Text tab of the popup, add the common portion of the URL or text string.

merge-duplicates-linkedin-bio-step-1-ignored-text-popup.png
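
Both approaches shorten the text that has to be compared. As a rough illustration in Python (the URLs, the list of ignored terms, and the helper names are hypothetical, and this is not how Insycle implements Match Parts or Ignore Text):

```python
def last_chars(value, n):
    """Comparison key made of the final n characters of the value."""
    return value[-n:]

def ignore_text(value, terms):
    """Comparison key with each ignored term removed from the value."""
    for term in terms:
        value = value.replace(term, "")
    return value

# Hypothetical LinkedIn URLs that point to the same profile.
url_a = "https://www.linkedin.com/in/jane-doe-1234b31a3"
url_b = "http://linkedin.com/in/jane-doe-1234b31a3"
terms = ["https://", "http://", "www.", "linkedin.com/in/"]

print(last_chars(url_a, 9), last_chars(url_b, 9))
print(ignore_text(url_a, terms), ignore_text(url_b, terms))
```

The two URLs differ as full strings, but both shortened keys are identical, so the records can be compared on a much shorter value.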

For more help troubleshooting issues with Insycle, refer to our Troubleshooting Issues article.

Frequently Asked Questions

Can I find duplicates when one field is empty?

Yes. When using two or more fields to identify duplicates, records can still be considered matches even if one of the field values is blank. You just need to specify which field(s) allow a blank value.

Under Step 1, configure your matching rules in the Simple tab, then click the Conditions tab.

step-1-allow-empty_1.png

All the matching fields you included will automatically appear with the Value Required in All Records condition selected. Change the condition to Empty Allowed in Any Record to allow empty values for certain fields. You can also use the At Least One Record with Non-Empty condition to help you determine which is the master record. Make sure at least one field remains required and is a reliable unique identifier to ensure the records are really duplicates.

step-1-conditions-empty-not-empty.png

For example, on the Simple tab, you may have the matching fields: First Name, Last Name, and Phone Number. However, the Phone Number field may be empty on some of your records. Using the Empty Allowed in Any Record or At Least One Record with Non-Empty condition, records with the same name are considered duplicates when their phone numbers match or when the phone number is empty.

step-1-allow-empty-review.png
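
The effect of allowing empty values can be illustrated with a pairwise comparison in Python. This is a simplified sketch under assumptions: the field names, the rules dictionary, and the trim-and-lowercase comparison are hypothetical, and Insycle's actual matching works on whole duplicate groups rather than pairs.

```python
def values_match(a, b, empty_allowed):
    """Compare two field values, optionally treating an empty value as a match."""
    a, b = (a or "").strip().lower(), (b or "").strip().lower()
    if not a or not b:
        return empty_allowed
    return a == b

def is_duplicate_pair(r1, r2, rules):
    """rules maps each matching field to True if empty values are allowed."""
    return all(values_match(r1.get(f), r2.get(f), allowed)
               for f, allowed in rules.items())

# Names are required; the phone number may be empty.
rules = {"first": False, "last": False, "phone": True}
a = {"first": "Ana", "last": "Diaz", "phone": "555-0100"}
b = {"first": "Ana", "last": "Diaz", "phone": ""}
c = {"first": "Ana", "last": "Diaz", "phone": "555-0199"}

print(is_duplicate_pair(a, b, rules))  # same name, one phone number empty
print(is_duplicate_pair(a, c, rules))  # same name, different phone numbers
```

Note that the record with the empty phone number matches both of the others even though those two do not match each other. That ambiguity is the reason to keep at least one required field that is a reliable unique identifier.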

Can I match duplicates using two different fields?

Yes. This can be done, for example, if you want to look at both the Phone Number field values and Mobile Phone Number field values as a single pool of values to compare between records and identify duplicates.

Using the Related Fields feature, you can use two different fields (that contain similar data) as matching fields to catch more duplicates. You can set up Related Fields in the Advanced tab.

step-1_related-field.png
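
Conceptually, related fields turn two columns into one pool of values per record, and two records match when their pools overlap. A minimal Python sketch of that idea, assuming hypothetical field names and a digits-only normalization that is not part of the description above:

```python
def phone_pool(record, fields=("phone", "mobile_phone")):
    """All non-empty values from the related fields, reduced to digits only."""
    pool = set()
    for field in fields:
        digits = "".join(ch for ch in str(record.get(field) or "") if ch.isdigit())
        if digits:
            pool.add(digits)
    return pool

def share_related_value(r1, r2):
    """True when the two records have at least one pooled value in common."""
    return bool(phone_pool(r1) & phone_pool(r2))

# Hypothetical contacts: the same number is stored in different fields.
a = {"phone": "(555) 010-0199", "mobile_phone": ""}
b = {"phone": "", "mobile_phone": "555-010-0199"}
c = {"phone": "555-010-0000", "mobile_phone": None}

print(share_related_value(a, b), share_related_value(a, c))
```

Comparing only Phone Number to Phone Number would miss the first pair, because the shared number sits in a different field on each record.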

How do I ensure that I am not merging non-duplicate records together?

There are two ways to make sure that the records that you are merging are indeed duplicate records.

First, always run your deduplication templates in Preview Mode before running them in Update Mode. This produces a CSV that shows you how your records would have been merged. Then you can ensure that your Merge Duplicates template is working as expected and not merging non-duplicate records together.

Additionally, to ensure a smooth merge process, consider narrowing down the matching settings in Step 1. Try the Exact Match Comparison Rule instead of Similar Match. Then make sure that you are using actual uniquely identifying fields—first name, last name, email, and phone number are popular choices. The more tightly defined your filter is, the less likely you are to merge non-duplicate records.

Insycle is having trouble determining a master record. What could be causing this issue?

If the Message column of the CSV report displays this text:

Change rules in Step 4 'Master Selection'. Failed to pick master record because multiple records (X) meet the selection criteria. In 'Master Selection', change, add, or reorder the rules such that only one record matches (if cannot determine master based on field values, use 'Record ID is lowest' as the last rule).

merge-duplicates-salesforce-accounts-csv-w-error.png

This means that based on all the rules, Insycle could not figure out which record in the duplicate group would be the master. None of the records meet more of the rules than others.

There are a few things you can try to resolve this:

  1. Under Step 4, on the Record tab, experiment with reordering or adding additional fields that are likely to have unique values.
  2. In the Step 4 heading, check to ensure that you have Priority Match selected and not Absolute Match.

    step-4-priority-match-w-arrow-606px.png
    With Priority Match, your master record only has to match one rule. Using Absolute Match, your master record would have to meet all of the rule criteria. The majority of the time, it is best to select Priority Match.

    If Priority Match was used, then none of the records in the duplicate group meet any of the criteria on the list more than the others. In this case, you'll need to experiment with the Record tab, reordering or adding additional rules for fields likely to have unique values.

  3. As a last resort, you can add a rule on the Record tab of Step 4 that says Record ID is lowest, or Create Date is earliest.
    merge-duplicates-hubspot-contacts-step-4-record-tab-last-resort-rules.png

My merged records are not being enrolled in a HubSpot Workflow. Is this intentional? How can I change this?

By default, when two contacts are merged in HubSpot, Workflows will not enroll merged contacts. However, merged contacts can enroll in the future if re-enrollment is enabled and they meet the enrollment triggers. 

In contact-based workflows, you can manage the enrollment of merged contacts, remove contacts that no longer meet enrollment criteria, and prevent enrollment of contacts in specific lists. To learn more, see HubSpot's workflow documentation.

I already have a list of duplicates. Can Insycle bulk merge them?

Yes. You can merge specific records using a CSV file containing the records you want to combine. Here's how:

  • Prepare a CSV file with columns for the record IDs and a "Merge Master" column. In the "Merge Master" column, mark which record should be kept after merging.
  • Create a custom field called "Merge Master" in your CRM.
  • Use the Magical Import module to import your CSV file into the CRM, updating the "Merge Master" field for the relevant records.
  • Go to the Merge Duplicates module and set up a filter to select records based on the "Merge Master" field.

Learn more about customizing bulk deduplication from a CSV.

Can I select which data is retained in my master record on a field-by-field basis?

Yes. Insycle allows you to select which field data is retained in the master record using the Fields tab under Step 4. See the Deduplicate HubSpot Contacts, Companies, and Deals in Bulk article for more details.

step-4-field-selection.png

I need to exclude some records from deduplication. Can I do that?

Yes. You can exclude records from deduplication by creating a CSV with a "Deduplication Exclude" field.

First, you'll export a Preview CSV from the Merge Duplicates module, add an exclude column, and specify which records should be excluded from the merge process. Next, create a custom field in your CRM to facilitate the merging. Use the Magical Import module to import the edited CSV file into the CRM, populating the new custom field. Finally, utilize this custom field to merge the remaining duplicate records in the Merge Duplicates module.

Learn how to customize bulk deduplication using exclusions.

Some of my duplicates have attachments. Will these be preserved?

Yes, if your HubSpot objects have attachments, they will be merged into the master record. Note that there may be a short delay before the attachment appears in the merged record.

I used the “From master record (even empty)” retention rule. Why does HubSpot say Insycle deleted it?

When merging HubSpot contact records using the “From master record (even empty)” data retention rule, the property history in HubSpot shows that Insycle set the value to “empty.” This is a nuance of how HubSpot manages the history of empty values. You can verify that the master record value before the merge was indeed empty by reviewing the Activity Tracker report in Insycle.

My team needs to review and approve the master. Can I accommodate that with Insycle?

Yes, there are several ways to share details and get approval before merging duplicates.

You can manually approve master records and mark them in a CSV, then use Insycle to bulk deduplicate down to those master records. See the Customize Bulk Deduplication Using Exclusions and Pre-Defined Masters article to learn more.

You can also run the Merge Duplicates module in Preview Mode and then deliver the preview CSV that Insycle generates. The CSV report includes your entire merge operation down to individual duplicate groups but does not update your live data. Then your team can approve the merge based on this report before running Merge Duplicates in Update Mode.

Additionally, team members can review duplicates and manually select the master for each record under Step 4. Review the Manually Merge Duplicates article for more detail.

step-4-manual-select.png

Do my matching field values have to be exactly the same?

No, your field data does not need to match exactly. The Similar Match found in Step 1 looks for values that may be close but with a one-character difference (maybe a typo) and broadens the search.

step-1-email-only.png

This search behaves like when Google shows results for a slightly different term, or says “Did you mean...” For example, if an Email of “huey@coahulldu.co” is found, it could include records with the values “hueyy@coahulldu.co” or “hue.y@coahulldu.co” as a match.

step-2-group-w-similar-match.png
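
A one-character difference can be checked with a small function like the one below. This is a generic sketch of an "at most one edit" comparison in Python. It is offered as an assumption about what a one-character difference means here, not as the algorithm Similar Match actually uses.

```python
def within_one_edit(a, b):
    """True if a and b are equal or differ by one inserted, deleted, or replaced character."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one replaced character
    return a[i:] == b[i + 1:]  # one extra character in b

base = "huey@coahulldu.co"
print(within_one_edit(base, "hueyy@coahulldu.co"))
print(within_one_edit(base, "hue.y@coahulldu.co"))
print(within_one_edit(base, "hugh@coahulldu.co"))
```

The first two values each add a single character to the original and count as similar. The third differs in two positions and does not.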

Pay close attention when using Similar Match, as the looser criteria can incorrectly identify non-duplicates as duplicates. 

Review the Similar Matching best practices for more detail.

Can Insycle help me deduplicate while syncing with Salesforce?

Yes, Insycle solves numerous deduplication-related issues when Salesforce and HubSpot are syncing. See the Deduplicate Salesforce and HubSpot While Keeping the Sync Active article to learn more.

Why can I only process 50 duplicate groups at a time?

Insycle shows 50 records on the module screen as a preview. This isn't the entire list of records. To see everything, include All records when you set up the Preview CSV report.

Insycle can process thousands of duplicate groups in one operation. Potentially, you could deduplicate your entire database in one operation. 

How many duplicates can I merge into one master record?

You can merge up to 100 duplicate records into a single master record. 

If you have duplicate groups that contain more than five records, you may want to change the value in Skip duplicate groups with more than 5 records per group under Step 3 to make sure you can get them all.

merge-duplicates-step-3-bulk.png

This is a precaution to ensure that if you use a duplicate matching filter that is too broad in Step 1, you do not accidentally merge many non-duplicate records together. If you are going to set this number at a high level, it is a good idea to run Preview Mode first to ensure your deduplication template is operating as you intend.

Are there any limits on the number of records that can be identified and merged with my paid subscription?

All paid plans include unlimited usage, users, and operations. During the free trial, there is a cap of 500 records updated, cleansed, or merged. See the pricing page for more details. 
