Salesforce Merge Duplicates Overview

 

Duplicate data in Salesforce poses serious problems for companies of any size.

Duplicate records inhibit your marketing team from effectively segmenting and personalizing your communications. Sales teams step on each other's toes and lack vital context in conversations. Support teams miss important information, and analysis and reporting are skewed.

Insycle helps you to merge duplicate leads, contacts, accounts, opportunities, and custom objects—flexibly and powerfully with the Merge Duplicates module.

Insycle uses the underlying Salesforce Apex merge API, which requires Insycle's Salesforce AppExchange app to be installed.

Use Cases

Salesforce Record Types Supported

Insycle supports the following Salesforce record types:

  • Contacts
  • Accounts
  • Leads
  • Opportunities
  • Custom Record Types

You can select the record type you would like to deduplicate at the top of the module screen.

How It Works

Insycle analyzes your database and identifies duplicates with flexible matching rules, using any field in your database, to help you identify and merge more duplicates.

Once Insycle identifies the duplicate records, you set rules for determining the master record that other duplicates will be merged into—such as the first record created, the record with the most email opens, or any other attribute that would be relevant. You can also set merging logic on a field-by-field basis.

You can merge duplicates in bulk, and Insycle provides a complete report of what was identified as a duplicate, what was merged, and what the outcome was in your master record.

Deduplicate Across Leads and Contacts

With the Merge Duplicates module, you can deduplicate across both leads and contacts in Salesforce.

Open the Merge Duplicates module, and pick the "Contacts" record type.

Then, in Step 1, check the "Include Leads" checkbox.

step-1-salesforce-include-leads.png

Deduplicate Opportunities and Other Standard or Custom Objects

When merging objects that do not have a native merge API, Insycle performs a synthetic merge. Synthetic merge is supported for opportunities, and any other standard or custom object.

  • Fields (for example, phone number): Data from the master record is kept. When a field value is empty in the master, Insycle automatically picks a non-empty value from the most recently updated duplicate. When in doubt about conflicting field values, include those fields in the CSV report by adding them to the Master Selection section. Their values will also show in the audit trail.
  • Relationships: Insycle inspects the schema metadata for relationships to the duplicate records and reparents those relationships to point to the master record instead of the duplicate. For example, this is how "Notes" on the duplicate records are re-linked to the master record.
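
The two behaviors above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Insycle's implementation: the record dictionaries, the `synthetic_merge` function, the `LastModifiedDate` ordering key, and the `ParentId` child layout are all hypothetical. The sketch also assumes that older duplicates are consulted when the most recent one has no value.

```python
from datetime import datetime

def synthetic_merge(master, duplicates, children, parent_field="ParentId"):
    """Fill empty master fields and re-parent child records (illustrative only)."""
    merged = dict(master)
    # Consult the most recently updated duplicates first.
    by_recency = sorted(duplicates, key=lambda r: r["LastModifiedDate"], reverse=True)
    for dup in by_recency:
        for field, value in dup.items():
            # Master values win; only empty master fields are filled.
            if merged.get(field) in (None, "") and value not in (None, ""):
                merged[field] = value
    duplicate_ids = {d["Id"] for d in duplicates}
    # Point every child that referenced a duplicate at the master instead.
    reparented = [
        {**child, parent_field: master["Id"]}
        if child[parent_field] in duplicate_ids else child
        for child in children
    ]
    return merged, reparented

master = {"Id": "A1", "Name": "Acme", "Phone": "",
          "LastModifiedDate": datetime(2023, 1, 1)}
duplicates = [
    {"Id": "A2", "Name": "ACME Inc", "Phone": "555-0100",
     "LastModifiedDate": datetime(2023, 3, 1)},
    {"Id": "A3", "Name": "Acme", "Phone": "555-0199",
     "LastModifiedDate": datetime(2023, 6, 1)},
]
notes = [{"Id": "N1", "ParentId": "A2"}, {"Id": "N2", "ParentId": "A1"}]

merged, reparented = synthetic_merge(master, duplicates, notes)
```

Here the master keeps its own name, takes the phone number from the most recently updated duplicate (A3), and the note that pointed at A2 now points at A1.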

Deduplicate Salesforce Accounts and HubSpot Companies

Fixing duplicate HubSpot companies and Salesforce accounts while syncing has several nuanced issues that need to be accounted for. There are specific data issues that can break the sync and require you to merge records manually. You also need to determine the appropriate “master record” to use across both HubSpot and Salesforce.

Then you have to consider the merging process. If two records are merged on Salesforce, are they merged on HubSpot as well?

Insycle lets you merge duplicate HubSpot companies and Salesforce accounts while keeping your sync intact.

To learn more, see Deduplicate HubSpot Companies and Salesforce Accounts.

Salesforce Merge Logic

Insycle uses the underlying Salesforce Apex merge API. For more information, see Salesforce's Apex Developer Guide.

In addition to the default merge logic, you have two ways of controlling how your records are merged:

  1. Master record selection – Create a series of rules to automatically select the master record for each of the duplicate groups in the Record tab. The first record that is the only one matching a rule will be selected as the master. For example, if only one record in the group has an "Account Owner," that record could be selected as the master. Or the master could be the only record where the "Is Email Bounced" value is "False."
  2. Data retention – Values from different records in the duplicate group can be saved to the master record based on rules that you set in the Fields tab. For example, you could choose to keep your "First Name" and "Last Name" fields from the earliest created record in the duplicate group, while keeping the "Owner" from the most recently updated record.
step-4-field-select-salesforce.png

If you don't specify merge logic for a field and its value is empty in the master record, Insycle automatically picks a non-empty value from the most recently updated duplicate. When in doubt about conflicting field values, include those fields in the CSV report by adding them to the Record tab.

Merging Contacts Related to Multiple Accounts

When you have enabled Salesforce's setting for relating a contact to multiple accounts, Insycle will automatically reassign the relationships to the master record and remove any redundant ones. This applies to both direct (primary) and indirect (non-primary) relationships.

Customized Merge Logic

When you need more granular control over which duplicate records to include in or exclude from the deduplication process, or over which record becomes the master, and no common rules apply to all or some of the records, you can customize bulk deduplication using exclusions and pre-defined masters via a CSV file.

Automation

You can schedule your Salesforce deduplication templates to run on an automated, set schedule.

You can schedule your template directly in the Merge Duplicates module by clicking the Review button at the bottom of the module page.

Then, you go through a three-step process to run the operation. In the third step, you can choose the "Automate" tab, and schedule your template to run on a set schedule.

merge duplicates salesforce automation

You can also schedule Salesforce deduplication automation using Recipes. You can view all scheduled automations on the “Automations” page on your dashboard.

Preview Changes Before They Go Live

You can preview the changes that you are making to your data before those changes are pushed to your live database. That way, you can check to ensure your deduplication operation is working as expected.

Advanced How-Tos

Step 1: Setting Up the Fields
Field Name    Comparison Rule    Ignored    Match Parts

Pick a field that you think has some duplicate values.

Running a very simple match operation, such as just First Name and Last Name, can give you an idea of what you have, but it is too broad to use for reliable analysis and deduplication. There may be legitimate duplicate names: different people with the same first and last name. You need additional, unique criteria to narrow it down.

Choosing Unique Identifiers

Matching duplicates requires unique identifiers—data that is unlikely to be shared by any other record unless it is a duplicate. If you don't use unique identifiers, you are likely to identify unrelated records as duplicates and may accidentally merge them.

Many CRMs match first names, last names, and email addresses. If all of those match, or are similar, you can confidently determine that the record is a duplicate.

Other unique identifying fields that are commonly used in deduplication include:

    • Phone number
    • Mailing address
    • ID numbers

Each row in your matching fields setup is cumulative, so records must meet all of the criteria. For example, looking for records that have the same First Name, AND Last Name, AND Phone Number returns only results where all three values are the same.

To match against one field value OR another, you will need to run two different templates. For example, if you want to use fields like Phone Number OR Mobile Phone Number, you’ll run one template for Phone Number, then a second configured the same except with the Mobile Phone Number field.

The searched value must have four or more characters. For example, values of “Joe” will be ignored.
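
As a rough mental model, the cumulative (AND) matching described above behaves like grouping records on a combined key. The sketch below is an assumption-laden illustration rather than Insycle's actual logic: the `find_duplicate_groups` function and the field names are hypothetical, values are compared case-insensitively, and a record is skipped entirely if any of its matching values is shorter than four characters.

```python
from collections import defaultdict

MIN_LENGTH = 4  # per the text above, shorter values such as "Joe" are ignored

def find_duplicate_groups(records, fields):
    """Group record IDs whose values match on ALL of the given fields."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(str(record.get(f, "")).strip().lower() for f in fields)
        # Assumption: skip the record if any matching value is too short.
        if any(len(part) < MIN_LENGTH for part in key):
            continue
        groups[key].append(record["Id"])
    return [ids for ids in groups.values() if len(ids) > 1]

records = [
    {"Id": 1, "FirstName": "Maria", "LastName": "Lopez", "Phone": "555-0100"},
    {"Id": 2, "FirstName": "maria", "LastName": "Lopez", "Phone": "555-0100"},
    {"Id": 3, "FirstName": "Maria", "LastName": "Lopez", "Phone": "555-0111"},
    {"Id": 4, "FirstName": "Joe", "LastName": "Smith", "Phone": "555-0122"},
    {"Id": 5, "FirstName": "Joe", "LastName": "Smith", "Phone": "555-0122"},
]

groups = find_duplicate_groups(records, ["FirstName", "LastName", "Phone"])
```

Only records 1 and 2 are grouped: record 3 differs on Phone, and the "Joe" records are ignored because the first name is too short. To emulate OR matching, you would call the function once per field set, just as you would run two templates.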

Step 1: Narrowing Down the Records with a Filter 

Use the filter to work with a segment or smaller pool of records. Then Insycle will only analyze the remaining records for duplicates. To add filters, click the Filter button, then choose the field to look at, select the condition, and set the value to look for. The filter is applied before the matching step runs. 

Step 1 filter button

You may want to use a filter if:

  • You know you only want to work with a subset of your data. In this case, there’s no need to run the operation on your whole database.
  • There are an overwhelming number of duplicate results. Add a filter to work with a reasonably sized subset while you work to get the configuration right. 
  • You want the operation to run faster. A refined segment can speed things up since there are fewer records to analyze.

Most of the options in the Field dropdown match the fields that are found in your CRM, and for Contact records, there are three additional options related to the Email value: 

  • Email Username: The portion of the email address before the “@.” For example, if the email address were “maria@acmewidgets.com,” the username value would be “maria.” 
  • Free Email Provider Domain: Choose True to filter out records where the email domain is Gmail, Hotmail, Yahoo, or one of about 10,000 other free email providers. This filter helps you focus on business contacts, or decide which record is the legitimate one, since customer companies are unlikely to use free email accounts (though a contact may have emailed from one at some point).
  • Email Top-Level Domain: The top-level domain (TLD) is everything that follows the final dot of a domain name. For example, in the domain name 'acmewidgets.com', '.com' is the TLD. Some other popular TLDs include '.org', '.uk', and '.edu'.

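
To make the three email-derived values concrete, here is a small sketch of how they could be computed from an address. The `email_parts` function is hypothetical, and `FREE_PROVIDERS` is a tiny stand-in for the much longer list of free providers mentioned above.

```python
# Tiny illustrative sample; the real list is described as about 10,000 providers.
FREE_PROVIDERS = {"gmail.com", "hotmail.com", "yahoo.com"}

def email_parts(email):
    """Split an email address into the three values described above."""
    username, _, domain = email.strip().lower().rpartition("@")
    return {
        "username": username,                       # portion before the "@"
        "free_provider": domain in FREE_PROVIDERS,  # True for free email domains
        "tld": "." + domain.rsplit(".", 1)[-1],     # everything after the final dot
    }

parts = email_parts("maria@acmewidgets.com")
```

For "maria@acmewidgets.com" this yields the username "maria", a free-provider flag of False, and the TLD ".com".
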
Step 1: Matching Using Two Different Fields

Sometimes, you might want to match duplicates using data in two separate fields. For example, you might want to compare the Business Phone field to a Mobile Phone field to identify duplicates.

Using the Related Fields feature, you can use two different fields (that contain similar data) as matching fields to catch more duplicates.

You can set up Related Fields on the Advanced tab of Step 1.

step-1-advanced-tab-salesforce-related-phone.png

Common Examples of Related Field Matching

Matching Field     Related Fields
Business Phone     Mobile Phone, Other Phone
Email Domain       Website, Company Domain
Mailing Address    Other Address

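
Conceptually, Related Fields treats the values of several fields as one pool per record, and two records match when their pools share a value. The following is a simplified sketch of that idea, assuming hypothetical field names and a hypothetical `related_field_groups` function. It does not reflect how Insycle implements the feature, and it does not chain groups together transitively.

```python
from collections import defaultdict

def related_field_groups(records, field, related_fields):
    """Group record IDs that share a value across a field and its related fields."""
    by_value = defaultdict(set)
    for record in records:
        for name in [field, *related_fields]:
            value = (record.get(name) or "").strip()
            if value:  # empty values never create a match
                by_value[value].add(record["Id"])
    return sorted(sorted(ids) for ids in by_value.values() if len(ids) > 1)

records = [
    {"Id": 1, "Phone": "555-0100", "MobilePhone": ""},
    {"Id": 2, "Phone": "", "MobilePhone": "555-0100"},
    {"Id": 3, "Phone": "555-0133", "MobilePhone": "555-0144"},
]

groups = related_field_groups(records, "Phone", ["MobilePhone"])
```

Records 1 and 2 are grouped because the Phone value of one equals the Mobile Phone value of the other, a match that a plain Phone-to-Phone comparison would miss.
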
Step 1: Allowing Empty Values When Matching

When using two or more fields to identify duplicates, records can still be considered matches even if one of the field values is blank. You just need to specify which field(s) allow a blank value.

Under Step 1, configure your matching rules in the Simple tab, then click the Conditions tab.

step-1-conditions-tab-arrow.png

All the matching fields you included will automatically appear with the Value Required in All Records condition selected. Change the condition to Empty Allowed in Any Record to allow empty values for certain fields. You can also use the At Least One Record with Non-Empty condition to help you determine which is the master record. Make sure at least one field remains required and is a reliable unique identifier to ensure the records are really duplicates.

step-1-conditions-empty-not-empty.png

For example, on the Simple tab, you may have the matching fields First Name, Last Name, and Phone Number, but on some of your records the Phone Number field may be empty. With the Empty Allowed in Any Record or At Least One Record with Non-Empty condition on Phone Number, records with the same name are considered duplicates when their phone numbers either match or are empty.
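
One way to reason about the three conditions is as a check applied to a single matching field across all records in a candidate group. The sketch below encodes one plausible reading of the condition names. The `field_condition_met` function and its string constants are hypothetical shorthand for the UI labels, and the exact semantics in Insycle may differ.

```python
def field_condition_met(values, condition):
    """Check one matching field across the records of a candidate group."""
    non_empty = {v for v in values if v}
    if condition == "value_required_in_all":
        # Every record has a value, and all values agree.
        return all(values) and len(non_empty) == 1
    if condition == "empty_allowed_in_any":
        # Blanks are fine; any values that are present must agree.
        return len(non_empty) <= 1
    if condition == "at_least_one_non_empty":
        # Blanks are fine, but some record must carry the (single) value.
        return len(non_empty) == 1
    raise ValueError(f"unknown condition: {condition}")

phones = ["555-0100", "", "555-0100"]  # same name, one record has no phone
```

With these values, the required condition rejects the group, while both relaxed conditions accept it. A group in which every phone number is blank passes only the Empty Allowed in Any Record reading.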

step-2-group-w-empty-saleforce-contacts&leads.png

Step 1: Compare Number of Duplicates with and without Leads

As an experiment, you could uncheck the Include Leads checkbox in Step 1.

Number of duplicates with leads

You should notice a difference in the number of duplicates found.

Number of duplicates without leads

Step 4: Considerations When Picking a Master Record

For contacts, it's often useful to pick master records based on engagement, such as the highest number of email clicks or the most recently opened email. You can also use other statuses to pick a master record, such as the record furthest along in your sales lifecycle or the most recently updated record.

For companies, it's often useful to use associated records to determine the master record. For example, the highest number of associated contacts or deals. 

If you have a connected app, like HubSpot or an ERP system, pick the master record that is syncing with the other apps.

Step 4: Selecting Priority Match vs Absolute Match

step-4-priority-match-no-arrow-2023-06-01.png

Priority Match: Looks through the master selection rules in order, one by one. As soon as a record meets one of the criteria, Insycle makes the master selection and skips the rest of the rules on the list. The vast majority of duplicate templates should use Priority Match.

Absolute Match: The master record must meet all of the listed rules in the Record tab in Step 4. If a record does not match every rule listed, no master record will be identified. Absolute Match is appropriate for less flexible master selection.

For example, if a company wanted to ensure the chosen master record is in their sales pipeline and already has a sales rep working the record, they can choose Absolute Match and set the Record rules:

  • Lifecycle Stage is lead
  • Contact Owner exists

Choosing Absolute Match can often result in no master record being identified since the record has to match every rule listed, so in most cases, you should select Priority Match.
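
The difference between the two modes can be illustrated with a short sketch. It follows the description above, where Priority Match picks the first rule that exactly one record satisfies and Absolute Match requires a single record to satisfy every rule. The `select_master` function, the rule lambdas, and the field names are hypothetical and do not represent Insycle's code.

```python
def select_master(group, rules, mode="priority"):
    """Return the master record, or None if the rules do not single one out."""
    if mode == "absolute":
        # The master must satisfy every rule.
        candidates = [r for r in group if all(rule(r) for rule in rules)]
        return candidates[0] if len(candidates) == 1 else None
    for rule in rules:  # priority: evaluate the rules in order
        candidates = [r for r in group if rule(r)]
        if len(candidates) == 1:
            return candidates[0]  # remaining rules are skipped
    return None

group = [
    {"Id": 1, "LifecycleStage": "lead", "Owner": None},
    {"Id": 2, "LifecycleStage": "subscriber", "Owner": "dana"},
    {"Id": 3, "LifecycleStage": "subscriber", "Owner": "lee"},
]
rules = [
    lambda r: r["LifecycleStage"] == "lead",  # Lifecycle Stage is lead
    lambda r: r["Owner"] is not None,         # Contact Owner exists
]

priority_master = select_master(group, rules, mode="priority")
absolute_master = select_master(group, rules, mode="absolute")
```

In this group, Priority Match selects record 1 because it is the only lead, while Absolute Match finds no master because no record is both a lead and owned.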

Step 4: Control What Field Data is Retained

The Merge Duplicates module allows you to control the values saved in the master record after the merge regardless of the default merge behavior. By adding each field you want to control the data retention for and selecting a Condition, you can tell Insycle where the data for the field should be taken from and how to handle it.

For example, if merging Salesforce accounts, you may want to save all of the Account IDs from records that are merged together and deleted. You can add a new custom field, “Merged Account IDs” to your CRM.

salesforce-field-merged-account-ids.png

Then in the Merge Duplicates module under the Fields tab of Step 4, add a rule to override the default merge behavior. Select the "Merged Account IDs" field, the "Collect non-master values from other field" criteria, and "Account ID" as the other field. 

step-4-fields-salesforce-accounts-2.png

You can use the Preview to see how this will preserve the Account IDs of all the duplicates in each duplicate group.
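
The effect of the "Collect non-master values from other field" rule can be pictured as follows. This is a hedged sketch: the `collect_non_master_values` function, the `Merged_Account_IDs__c` field name, the sample IDs, and the semicolon separator are all assumptions made for illustration.

```python
def collect_non_master_values(master, duplicates, source_field, target_field, sep=";"):
    """Copy source_field values of the non-master records into target_field on the master."""
    values = [d[source_field] for d in duplicates if d.get(source_field)]
    merged = dict(master)
    merged[target_field] = sep.join(values)  # separator is an assumption
    return merged

master = {"Id": "001A", "Name": "Acme", "Merged_Account_IDs__c": ""}
duplicates = [
    {"Id": "001B", "Name": "Acme Inc"},
    {"Id": "001C", "Name": "ACME"},
]

merged = collect_non_master_values(master, duplicates, "Id", "Merged_Account_IDs__c")
```

After the merge, the master keeps its own ID and records the IDs of the two merged-away accounts in the custom field.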

step-4-collect-all-values-CSV-salesforce.png

Step 4: Customizing Merge Logic

For situations where you need more granular customization for picking duplicate records to include—or exclude—from the deduplication process, you can customize bulk deduplication using exclusions and pre-defined masters via CSV file. Additionally, you can use this process when there are no common rules you can apply to choose the master record.

Troubleshooting

If you're not seeing the results you expect when merging duplicates, consider these issues:

Not all identified duplicates are merging into the master

You have duplicate records that have been identified by Insycle but not all of them are merging into the master. Check to see how many duplicates are in the affected duplicate groups. If you have duplicate groups that contain more than five records, you may want to change the value in Skip duplicate groups with more than 5 records per group to make sure you can get them all.

Step 3 Bulk option

This setting is intended to protect against the accidental merging of non-duplicate records if the filter in Step 1 is too broad.
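
In effect, the setting acts as a size cutoff on duplicate groups. A minimal sketch, assuming the limit counts every record in the group and using a hypothetical `split_by_group_size` helper:

```python
def split_by_group_size(groups, max_records=5):
    """Separate groups that will be merged from groups that exceed the limit."""
    to_merge = [g for g in groups if len(g) <= max_records]
    skipped = [g for g in groups if len(g) > max_records]
    return to_merge, skipped

groups = [["a1", "a2"], ["b1", "b2", "b3", "b4", "b5", "b6"]]
to_merge, skipped = split_by_group_size(groups, max_records=5)
```

With the default of 5, the six-record group is skipped. Raising the limit to 6 or more would include it in the merge.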

Insycle was unable to determine the master record

If the Result column of the CSV report displays this error:

Cannot determine master record because multiple records (#) satisfy the master selection rules. In ‘Master Selection’, change/add/reorder the rules such that only one record satisfies them (if cannot determine master based on field values, use ‘ID is lowest’ as the last rule).

This error means that based on the master rules you set, Insycle could not figure out which would be the master.

Check Step 4 to ensure that you have Priority Match selected and not Absolute Match.

step-4-priority-match-w-arrow-2023-06-01.png

With Priority Match, the rules configured in the Records tab of Step 4 are processed in order and your master record only has to match one rule. Using Absolute Match, your master record would have to meet all of the rule criteria. The majority of the time it is best to select Priority Match.

If Priority Match was used, then none of the records meet any of the criteria on the list more than the others. In this case, you'll need to experiment with Step 4, reordering or adding additional rules for fields likely to have unique values.

Non-duplicate records are being merged together

There are a couple of things to look at that may be misidentifying records as duplicates.

First, you may need a better unique identifier. Under Step 1, if you only use fields that could correctly contain the same values in multiple records, these aren't unique identifiers. In this case, you are likely to identify unrelated records as duplicates and may accidentally merge them.

Unique identifiers are data that is unlikely to be shared by any other record unless it represents the same underlying entity. Fields that are commonly used in deduplication include phone numbers, email, mailing addresses, or ID numbers.

Second, this may indicate the Comparison Rule under Step 1 is too broad. Try using the Exact Match comparison rule instead of Similar Match. Similar Match looks for values that may be close but with a one-character difference (maybe a typo) which broadens the search. 

Remember, always run your deduplication in Preview Mode to confirm things are working as expected before running them in Update Mode and applying the changes to your CRM records.

Insycle isn't finding any duplicates

Most of the time when Insycle can't find duplicates, it is due to your matching rules in Step 1. To better understand how to set up your rules, it is important to analyze the underlying data. A useful exercise can be to set up your matching filters to look for exact matches of just First Name and Last Name. 

step-1-first-last-exact.png

When you click the Find button, these rules can show you a broad overview of what duplicates are potentially in your database, and what fields might be useful to include in your matching fields. These settings are just for discovery and should not be used for a final merge operation; many people can have the same first and last names and are not duplicates. 

To get further context, click the gear button on the right side of the Step 2: Review Duplicates pane. Here, you can add any field in your database as a column to the view to better understand the data inside of these records.

Step 2 columns layout gear

Resolving issues identified in CSV

If the Result column of the CSV displays an error, read the error text for help figuring out how to resolve the problem.

Error shown in CSV

The most common error is:

Cannot determine master record because multiple records (#) satisfy the master selection rules. In ‘Master Selection’, change/add/reorder the rules such that only one record satisfies them (if cannot determine master based on field values, use ‘ID is lowest’ as the last rule).

This means that based on all the rules, Insycle could not figure out which would be the master. None of the records meet more of the rules than others. In this case, you'll need to experiment with reordering or adding additional fields likely to have unique values under Step 4.

It's taking a long time for Insycle to find duplicates

It can take a while for Insycle to find and match duplicates if the fields being used to identify them have very long values. The longer the values, the longer it takes Insycle to process the data and generate the results. This might come up when looking for matches based on long ID numbers, LinkedIn bio links, or other URLs with long strings attached (for example, https://www.linkedin.com/in/svadin%C3%ADr-n%C4%9Bmec-1234b31a3/).

You can speed this up by limiting how much of the value Insycle looks at.

If the beginning or ending portion of each value is unique, you can limit the comparison to the first or last several characters using the Match Parts parameter under Step 1.

merge-duplicates-linkedin-bio-step-1-match-parts-last-9-chars.png

merge-duplicates-linkedin-bio-step-2-last-9-chars.png

Or use the Ignore Text (Substrings) parameter, then click the Terms button.

merge-duplicates-linkedin-bio-step-1-ignored-text-terms-button.png

On the Ignored Text tab of the popup, add the common portion of the URL or text string.
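
Both options shorten the text that has to be compared. The sketch below shows the general idea with a hypothetical `comparison_key` function and made-up profile URLs. It is not a description of how the Match Parts or Ignore Text parameters are implemented.

```python
def comparison_key(value, last_chars=None, ignored_text=()):
    """Reduce a long value to the part that is actually compared."""
    for text in ignored_text:
        value = value.replace(text, "")  # drop the shared, uninformative portion
    if last_chars is not None:
        value = value[-last_chars:]      # keep only the trailing characters
    return value

url_a = "https://www.linkedin.com/in/jane-doe-1234b31a3/"
url_b = "http://linkedin.com/in/jane-doe-1234b31a3/"

key_a = comparison_key(url_a, last_chars=9)
key_b = comparison_key(url_b, last_chars=9)
trimmed = comparison_key(url_a, ignored_text=["https://www.linkedin.com/in/"])
```

Both URLs reduce to the same nine-character key, so they match even though their prefixes differ. With the ignored-text approach, only the profile-specific tail of the URL is compared.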

merge-duplicates-linkedin-bio-step-1-ignored-text-popup.png

For more help troubleshooting issues with Insycle, refer to our Troubleshooting Issues article.

Frequently Asked Questions

How can I find duplicates when one field is empty?

When using two or more fields to identify duplicates, records can still be considered matches even if one of the field values is blank. You just need to specify which field(s) allow a blank value.

Under Step 1, configure your matching rules in the Simple tab, then click the Conditions tab.

allow-empty_2.png

All the matching fields you included will automatically appear with the Value Required in All Records condition selected. Change the condition to Empty Allowed in Any Record to allow empty values for certain fields. You can also use the At Least One Record with Non-Empty condition to help you determine which is the master record. Make sure at least one field remains required and is a reliable unique identifier to ensure the records are really duplicates.

step-1-conditions-empty-not-empty.png

For example, on the Simple tab, you may have the matching fields First Name, Last Name, and Phone Number, but on some of your records the Phone Number field may be empty. With the Empty Allowed in Any Record or At Least One Record with Non-Empty condition on Phone Number, records with the same name are considered duplicates when their phone numbers either match or are empty.

allow-empty-review.png

Can I match duplicates using two different fields?

Yes. This can be done, for example, if you want to look at both the Phone Number field values and Mobile Phone Number field values as a single pool of values to compare between records and identify duplicates.

Using the Related Fields feature, you can use two different fields (that contain similar data) as matching fields to catch more duplicates. You can set up Related Fields in the Advanced tab.

bulk-merge_2.png

How do I ensure that I am not merging non-duplicate records together?

Currently, there are two ways to make sure that the records that you are merging are indeed duplicate records.

First, always run your deduplication templates in Preview Mode before running them in Update Mode. This produces a CSV that shows you how your records would have been merged. Then you can ensure that your Merge Duplicates template is working as expected and not merging non-duplicate records together.

Additionally, you can reduce the risk when merging duplicates by narrowing your duplicate matching settings in Step 1. Try the Exact Match Comparison Rule instead of Similar Match. Then make sure that you are using actual uniquely identifying fields—first name, last name, email, and phone number are popular choices. The more tightly defined your filter is, the less likely you are to merge non-duplicate records.

Insycle is having trouble determining a master record. What could be causing this issue?

If the Result column of the CSV report displays this error:

Cannot determine master record because multiple records (#) satisfy the master selection rules. In ‘Master Selection’, change/add/reorder the rules such that only one record satisfies them (if cannot determine master based on field values, use ‘ID is lowest’ as the last rule).

This error means that based on the master rules you set, Insycle could not figure out which would be the master.

Check Step 4 to ensure that you have Priority Match selected and not Absolute Match.

priority-or-absolute.png

With Priority Match, the rules configured in the Records tab of Step 4 are processed in order and your master record only has to match one rule. Using Absolute Match, your master record would have to meet all of the rule criteria. The majority of the time it is best to select Priority Match.

If Priority Match was used, then none of the records meet any of the criteria on the list more than the others. In this case, you'll need to experiment with Step 4, reordering or adding additional rules for fields likely to have unique values.

I already have a list of duplicates. Can Insycle bulk merge them?

Yes. You can use a customized list of duplicates and use the Magical Import module to tag duplicates in your Salesforce CRM, then use the Merge Duplicates module to deduplicate in bulk. Include ID numbers from Salesforce in your CSV.

Can I select which data is retained in my master record on a field-by-field basis?

Yes, Insycle allows you to control data retention in the master record using the Fields tab under Step 4. See the Bulk Merge Duplicate People, Companies article for more detail.

step-4-fields-salesforce-accounts.png

I need to exclude some records from deduplication. Can I do that?

Yes. You can exclude records from deduplication by including a "Deduplication Exclude" field in your CSV, as detailed in the Customize Bulk Deduplication Using Exclusions and Pre-Defined Masters article.

My team needs to review and approve the master. Can I accommodate that with Insycle?

Yes, there are several ways to share details and get approval before merging duplicates.

You can manually approve master records and mark them in a CSV, then use Insycle to bulk deduplicate down to those master records. See the Customize Bulk Deduplication Using Exclusions and Pre-Defined Masters article to learn more.

Or, you can run the Merge Duplicates module in Preview Mode, then deliver the preview CSV that Insycle generates. The CSV report that Insycle generates includes your entire merge operation down to individual duplicate groups but does not update your live data. Then your team can approve the merge based on this report, before running Merge Duplicates in Update Mode.

Additionally, team members can review duplicates and manually select the master for each record under Step 4. Review the Manually Merge Duplicates article for more detail.

Do my matching fields have to match each other exactly?

No. The Similar Match Comparison Rule found in Step 1 looks for values that may be close but with a one-character difference (maybe a typo) and broadens the search.

step-1-similar-match-email-only.png

This search behaves like Google showing results for a slightly different term or asking “Did you mean...” For example, if an Email value of “huey@coahulldu.co” is found, records with the values “hueyy@coahulldu.co” or “hue.y@coahulldu.co” could be included as matches.
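
A one-character difference can be tested with a simple single-edit check. The function below is a sketch of that idea only. Insycle's actual Similar Match algorithm is not documented here, and `within_one_edit` is a hypothetical name.

```python
def within_one_edit(a, b):
    """True if a and b are equal or differ by one inserted, deleted, or replaced character."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # at most one substitution
    return a[i:] == b[i + 1:]          # one extra character in b

base = "huey@coahulldu.co"
doubled = within_one_edit(base, "hueyy@coahulldu.co")
dotted = within_one_edit(base, "hue.y@coahulldu.co")
different = within_one_edit(base, "hugh@coahulldu.co")
```

The first two comparisons succeed because each variant adds a single character, while the last one fails because two characters differ.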

step-2-group-w-similar-match.png

Do pay close attention when using Similar Match as the looser criteria can incorrectly identify non-duplicates as duplicates. 

Review the Understanding Similar Matching best practices for more detail.

Can I deduplicate across leads and contacts in Salesforce?

Yes, Insycle can analyze leads and contacts together and deduplicate across those record types. See the Deduplicate Across Salesforce Leads and Contacts article to learn more.

Can Insycle help me deduplicate while syncing with HubSpot?

Yes, Insycle solves numerous deduplication-related issues when Salesforce and HubSpot are syncing. See the Deduplicate Salesforce and HubSpot While Keeping the Sync Active article to learn more.

Why can I only process 50 duplicate groups at a time?

Insycle shows 50 records on the module screen as a preview; this isn't the entire list of records. Include All records when you view the Preview CSV report to see everything.

Insycle can process thousands of duplicate groups in one operation. Potentially, you could deduplicate your entire database in one operation. 

How many duplicates can I merge into one master record?

You can merge up to 100 duplicate records into a single master record. 

If you have duplicate groups that contain more than five records, you may want to change the value in Skip duplicate groups with more than 5 records per group under Step 3 to make sure you can get them all.

This is a precaution to ensure that if you use a duplicate matching filter that is too broad in Step 1, you do not accidentally merge many non-duplicate records together. If you are going to set this number at a high level, it is a good idea to run Preview Mode first to make sure your deduplication template is operating as you intend.

Are there any limits on the number of records that can be identified and merged with my paid subscription?

All plans include unlimited usage, unlimited users, and unlimited operations. See the pricing page for more details. During the free trial, there is a cap of 500 records updated, cleansed, or merged.
