How to Analyze and Merge Duplicates Manually for Granular Deduplication Control
You have duplicate records in your CRM but you need a controlled, careful process to merge these records, so bulk merging isn't an option.
With the Merge Duplicates module, you can surface duplicates, analyze the records to determine which have the relevant data, and merge them manually, one at a time.
- Identify duplicates.
- Review and analyze the duplicates.
- Select the records that need merging.
- Choose the master record.
- Individually select values to be retained.
- Merge duplicate records.
Navigate to the Merge Duplicates module, pick the record type, and explore the default templates for a pre-built solution.
To find duplicates, you need to define how to match records. Step 1 looks through the records in your database, examining the fields that you specify for matches. Each row is for a field you want to look at for duplicates.
For example, to find duplicate Contacts you may use the "First Name," "Last Name," and "Email Domain" fields. Contacts with the same first name AND last name AND email domain will show as possible duplicates.
Choose fields that, in combination, give a high degree of certainty that the matched records are duplicate records.
See the Advanced How-Tos for more detail on selecting fields to use and narrowing your results with the filter.
When finished, click the Find button and Insycle will generate a list of duplicates for you to review.
Expand Criteria for Matching Duplicates
If you'd like to look at the data in two different fields (that contain similar data) as if it were one, you can set up Related Fields under the Advanced tab. For example, you might want to look at both the Email and Additional Email fields for duplicate values.
The Conditions tab sets rules that one or more of the records in a duplicate group must meet.
- At Least One Record With Non-Empty - At least one record in the duplicate group must contain a value in this field.
- Value Required in All Records - Each record must contain a value in this field to be considered a duplicate.
- At Least One Record Match - At least one record in the duplicate group must match the specified value, and the other records cannot be blank. If none of the records have the specified value, the duplicate group will not be merged.
- Empty Allowed in Any Record - A record can still be considered a duplicate if this field is blank. Allowing empty values requires using two or more fields to identify duplicates.
Records that have the same values in the fields specified in Step 1 are considered matches. When two or more records represent the same entity (person, company, or other), they are clustered together into duplicate groups. Each duplicate group shows the total number of records that were identified as duplicates. For example, if you had four records for the same person, it would count as one duplicate group with four records.
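The grouping described above can be pictured with a short Python sketch. This is not Insycle's implementation, only an illustration of clustering records that share the same values in every matching field. The field names (`first_name`, `last_name`, `email`) and the case-insensitive comparison are assumptions made for the example.

```python
from collections import defaultdict


def email_domain(email):
    """Return the part of an email address after the '@', lowercased."""
    return email.rsplit("@", 1)[-1].strip().lower()


def find_duplicate_groups(records):
    """Cluster records whose first name, last name, and email domain all match."""
    groups = defaultdict(list)
    for record in records:
        # Assumption: comparison ignores case and surrounding whitespace.
        key = (
            record["first_name"].strip().lower(),
            record["last_name"].strip().lower(),
            email_domain(record["email"]),
        )
        groups[key].append(record)
    # Only keys shared by two or more records form a duplicate group.
    return [group for group in groups.values() if len(group) > 1]


contacts = [
    {"id": 1, "first_name": "Maria", "last_name": "Lopez", "email": "maria@acme.com"},
    {"id": 2, "first_name": "Maria", "last_name": "Lopez", "email": "m.lopez@acme.com"},
    {"id": 3, "first_name": "Maria", "last_name": "Lopez", "email": "mlopez@acme.com"},
    {"id": 4, "first_name": "Maria", "last_name": "Lopez", "email": "maria.l@acme.com"},
    {"id": 5, "first_name": "Maria", "last_name": "Lopez", "email": "maria@other.org"},
]

duplicate_groups = find_duplicate_groups(contacts)
group_ids = [record["id"] for record in duplicate_groups[0]]
```

Here the four records at the same email domain form one duplicate group with four records, while the fifth record is left out because its email domain differs.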
Check the box in a row to expand and see the records in the group.
Explore the record data in the duplicate groups. Double-check to make sure that the fields you set up in Step 1 are showing what you expected.
Add more columns to the view using the gear button on the right to help your analysis.
Select Manual mode to have complete control over which records are merged. You'll work with a single duplicate group through the entire merge process.
When you select Manual, an additional set of checkboxes will appear in Step 2 beside the individual records in each duplicate group. When you check the boxes, you are choosing which records will be merged. The data in unchecked records will not be merged.
Under Step 2, click the checkbox by the duplicate group, then select the individual records you want to merge together.
The master is the record that will remain after the duplicates are merged. If you select three records and merge them, the other two will no longer exist. By default, data from your chosen master is retained, and any blank values in the master are filled in automatically from the other records. If you'd like more control over the data saved in the master record, you can set that in Step 5.
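As a rough illustration of the default behavior described above, the following Python sketch keeps the master's values and fills only its blank fields from the other records. It is a simplified model, not the platform's actual merge logic. The function name, the field names, and the rule that the first non-blank value wins are assumptions.

```python
def merge_into_master(master, others):
    """Keep the master's values and fill its blanks from the other records, in order."""
    merged = dict(master)
    for other in others:
        for field, value in other.items():
            master_is_blank = merged.get(field) in (None, "")
            if master_is_blank and value not in (None, ""):
                # Assumption: the first non-blank value found fills the gap.
                merged[field] = value
    return merged


master = {"email": "maria@acme.com", "phone": "", "company": "Acme"}
others = [
    {"email": "m.lopez@acme.com", "phone": "555-0100", "company": ""},
    {"email": "mlopez@acme.com", "phone": "555-0199", "company": "Acme Inc"},
]

merged = merge_into_master(master, others)
```

In this example the master keeps its own email and company, and only the blank phone field is filled from another record.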
Under Step 4, choose the master record that the other records will be merged into.
If you want to control which values are kept, you can choose specific fields under Step 5. This is an optional step—if you don’t pick specific values, the platform's default merging logic will be followed.
Under Step 5, the Conflicts tab shows only the fields that have differences in the values, making it easy to focus only on fields that need attention. The Read Only tab lists only the non-writable fields from your database, and the Full tab shows all of the record fields, even those without data. Use the search to find specific fields.
Only five fields are initially displayed, so to see all of the fields, change the number of rows shown per page.
On a field-by-field basis, select which values to keep and have merged into the master. For example, you could choose Contact Owner from one and Company Name from another.
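Conceptually, field-by-field selection works like an override applied on top of the master record. The Python sketch below shows the idea under stated assumptions: `merge_with_choices`, the `choices` mapping, and the field names are hypothetical and exist only for illustration.

```python
def merge_with_choices(records, master_index, choices):
    """Start from the master record, then overwrite the fields chosen by hand.

    `choices` maps a field name to the index of the record whose value to keep.
    """
    merged = dict(records[master_index])
    for field, source_index in choices.items():
        merged[field] = records[source_index][field]
    return merged


records = [
    {"contact_owner": "Dana", "company_name": "Acme", "phone": "555-0100"},
    {"contact_owner": "Lee", "company_name": "Acme Widgets Inc.", "phone": "555-0100"},
]

# Keep Contact Owner from the first record and Company Name from the second.
merged = merge_with_choices(
    records,
    master_index=0,
    choices={"contact_owner": 0, "company_name": 1},
)
```

Fields without an explicit choice, such as `phone` here, simply keep the master's value.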
Apply the Merge to Your CRM
When you have all critical fields chosen, click the Merge button at the bottom to merge your selected duplicates.
When To Use Manual Merging
Manual merge is great when you have only a handful of duplicates to address, need to merge records carefully, want to employ a manual review process when merging, or just want to explore a few duplicates to understand what you have and how best to merge them.
In most cases, large datasets are a better candidate for bulk deduplication.
Sometimes you have a large number of records to fix but still need granular control over which records to include in (or exclude from) deduplication, or over which record becomes the master, and no common rules apply to all or some of the records. In that case, you can customize bulk deduplication using exclusions and pre-defined masters via a CSV file.
For each row in your Find Duplicates setup, you'll configure the following:
Pick a field that you think has some duplicate values.
Running a very simple match operation like just First and Last Name can be helpful in giving you an idea of what you have, but it is too broad to use for reliable analysis and deduplication. There may be legitimate people with the same first and last name. You need additional, unique criteria to narrow it down.
Choosing Unique Identifiers
Matching duplicates requires unique identifiers—data that is unlikely to be shared by any other record unless it is a duplicate. If you don't use unique identifiers, you are likely to identify unrelated records as duplicates and may accidentally merge them.
Many CRMs match first names, last names, and email addresses. If all of those match, or are similar, you can confidently determine that the record is a duplicate.
Other unique identifying fields that are commonly used in deduplication include:
- Phone number
- Mailing address
- ID numbers
Define what kind of likeness to look for when deciding if field values should be considered a match.
It's a good idea to start with Exact Match, and begin with easy-to-find duplicates. Iterate through fields and rules you know will surface duplicates, then look for edge cases. Similar Match can be helpful for those edge cases.
- Exact Match looks for values that match exactly, with no differences from one record to the next. Any unique identifying fields should use Exact Match.
- Similar Match looks for values that are close but differ by one character (like a typo, extra character, or missing character), which broadens the search. This behaves like Google showing results for a slightly different term or asking “Did you mean...” For example, a Company Name of “Acme” could match records with the Company Name values “Akme,” “acm,” or “Acma.”
Be careful when using Similar Match, as the looser criteria can incorrectly identify non-duplicates as duplicates. It is best to only use Similar Match with very open and generic fields, and only after trying everything else.
Note: ID fields and phone numbers can only use Exact Match, never Similar Match.
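To make the difference between the two comparison types concrete, here is a minimal Python sketch of a one-character-difference check. The exact algorithm Insycle uses for Similar Match is not described here, so treat this as an approximation. The function name and the case-insensitive comparison are assumptions.

```python
def within_one_edit(a, b):
    """Return True if a and b are equal or differ by one inserted, deleted, or changed character."""
    a, b = a.lower(), b.lower()  # assumption: case differences are ignored
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make `a` the shorter (or equal-length) string
    # Walk past the common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        # Same length: allow at most one substituted character.
        return a[i + 1:] == b[i + 1:]
    # `b` has one extra character: skip it and compare the rest.
    return a[i:] == b[i + 1:]


candidates = ["Akme", "acm", "Acma", "Apex", "Acme Corp"]
similar = [name for name in candidates if within_one_edit("Acme", name)]
```

With these inputs, `similar` contains "Akme", "acm", and "Acma", while "Apex" and "Acme Corp" differ by more than one character and are excluded.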
Specify parts of a field value to ignore, such as specific text, whitespace, or characters. These won’t be considered as part of the matching process.
- Ignore Symbols and Whitespace when comparing phone numbers.
- Ignoring HTTP, www, subdomain, or top-level domain (.com vs. .co.uk) when comparing websites or email domains is a great way to catch more advanced duplicates.
- Insycle comes preloaded with terms to ignore. If you select Common Terms, click the Terms button to view and edit this list on the Common Terms tab.
- If you select Text (substrings), click the Terms button, then the Ignored Text tab, and enter text to be ignored. Separate multiple substrings (or phrases) with a new line.
Note: If you’ve set up Ignored terms or strings, don’t forget to also enable them. Select the Ignored > Common Terms or Text (substrings) checkbox.
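The effect of the Ignore options can be approximated by normalizing values before comparing them. The Python sketch below is illustrative only. The helper names are hypothetical, and the exact characters and terms Insycle strips may differ.

```python
import re


def normalize_phone(value):
    """Drop symbols and whitespace so only the digits are compared."""
    return re.sub(r"\D", "", value)


def normalize_website(value):
    """Drop the scheme and a leading 'www.' before comparing websites."""
    value = value.strip().lower()
    value = re.sub(r"^https?://", "", value)
    value = re.sub(r"^www\.", "", value)
    return value.rstrip("/")


def ignore_text(value, ignored_substrings):
    """Remove each ignored substring, then collapse leftover whitespace."""
    value = value.lower()
    for text in ignored_substrings:
        value = value.replace(text.lower(), "")
    return " ".join(value.split())


same_phone = normalize_phone("(555) 010-0199") == normalize_phone("555.010.0199")
same_site = normalize_website("https://www.acme.com/") == normalize_website("http://acme.com")
same_company = ignore_text("Acme Inc.", ["inc."]) == ignore_text("ACME", ["inc."])
```

All three comparisons evaluate to `True`, because the differences between each pair of values fall entirely within the ignored portions.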
Define specific portions of the field value to compare.
Compare the entire value, the first word, any two words, just the first five letters, last nine characters, etc.
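Comparing only part of a value amounts to slicing it before the comparison. The following Python sketch shows three such slices. The helper names are hypothetical and do not correspond to Insycle option names.

```python
def first_word(value):
    """Return the first whitespace-separated word, or '' for a blank value."""
    words = value.split()
    return words[0] if words else ""


def first_chars(value, count):
    """Return the first `count` characters (count must be positive)."""
    return value[:count]


def last_chars(value, count):
    """Return the last `count` characters (count must be positive)."""
    return value[-count:]


# Two phone values that differ only in how the country code is written.
same_tail = last_chars("15550100199", 9) == last_chars("5550100199", 9)
same_first_word = first_word("Acme Widgets Inc") == first_word("Acme Holdings")
name_prefix = first_chars("Jonathan", 5)
```

Comparing the last nine characters treats the two phone values as a match even though the full strings differ, which is the kind of case partial matching is meant to catch.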
Each row in your matching fields setup is cumulative, so records must meet all of the criteria. For example, looking for records that have the same First Name, AND Last Name, AND Phone Number returns only results where all three values are the same.
To match against one field value OR another, you will need to run two different templates. For example, if you want to use fields like Phone Number OR Mobile Phone Number, you’ll run one template for Phone Number, then a second configured the same except with the Mobile Phone Number field.
The searched value must have four or more characters. For example, values of “Joe” will be ignored.
The following unique identifying fields, in combination, give a high degree of certainty that the matched records are truly duplicates that should be merged:
- First Name + Last Name
- Company Name
- Email Domain
- Company Website
- Phone Number
- ID Numbers
Use the filter to work with a segment or smaller pool of records. Then Insycle will only analyze the remaining records for duplicates. To add filters, click the Filter button, then choose the field to look at, select the condition, and set the value to look for. The filter is applied before the matching step runs.
You may want to use a filter if:
- You know you only want to work with a subset of your data. In this case, there’s no need to run the operation on your whole database.
- There are an overwhelming number of duplicate results. Add a filter to work with a reasonably sized subset while you work to get the configuration right.
- You want the operation to run faster. A refined segment can speed things up since there are fewer records to analyze.
Most of the options in the Field dropdown match the fields that are found in your CRM, and for Contact records, there are three additional options related to the Email value:
- Email Username: The portion of the email address before the “@.” For example, if the email address were “maria@example.com,” the username value would be “maria.”
- Free Email Provider Domain: Choose True to filter out records where the email domain is Gmail, Hotmail, Yahoo, or one of about 10,000 other free email providers. This filter helps confirm that contacts are real clients, or helps determine which record is the legitimate one, since customer companies are unlikely to use free Gmail accounts (though a contact may have emailed from one by accident at some point).
- Email Top-Level Domain: The top-level domain (TLD) is everything that follows the final dot of a domain name. For example, in the domain name 'acmewidgets.com', '.com' is the TLD. Some other popular TLDs include '.org', '.uk', and '.edu'.
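The three email-derived options above can be thought of as values computed from the Email field. The Python sketch below shows one way to derive them. The function name and dictionary keys are hypothetical, and `FREE_PROVIDERS` is a tiny illustrative subset, not the actual list of free providers.

```python
# Assumption: a small sample set standing in for the full free-provider list.
FREE_PROVIDERS = {"gmail.com", "hotmail.com", "yahoo.com"}


def email_parts(email):
    """Split an email address into the derived values used for filtering."""
    username, _, domain = email.strip().lower().rpartition("@")
    return {
        "username": username,
        "domain": domain,
        # Everything after the final dot of the domain name.
        "top_level_domain": "." + domain.rsplit(".", 1)[-1],
        "free_provider": domain in FREE_PROVIDERS,
    }


work = email_parts("maria@acmewidgets.com")
personal = email_parts("Maria.Lopez@gmail.com")
```

For the first address the username is "maria" and the TLD is ".com". The second address is flagged as a free provider domain.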
Sometimes, you might want to match duplicates using data in two separate fields. For example, you might want to compare your Phone Number field to a Mobile Phone Number field to identify duplicates.
Using the Related Fields feature, you can use two different fields (that contain similar data) as matching fields to catch more duplicates.
You can set up Related Fields in the Advanced tab.
Common Examples of Related Field Matching
|Matching Field|Related Fields|
|Phone Number|Mobile Phone Number, Company Phone|
|Email Domain|Website, Company Domain|
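One way to picture Related Fields is to pool the values from the matching field and its related fields, then check whether two records share any value in that pool. The Python sketch below illustrates the idea. The helper names and field names are hypothetical, and Insycle's own handling may differ.

```python
def related_values(record, fields):
    """Collect the non-blank values a record holds across the related fields."""
    return {record[field] for field in fields if record.get(field)}


def share_related_value(a, b, fields):
    """Return True if the two records have any value in common across the fields."""
    return bool(related_values(a, fields) & related_values(b, fields))


phone_fields = ["phone", "mobile_phone"]
record_a = {"name": "Maria Lopez", "phone": "5550100199", "mobile_phone": ""}
record_b = {"name": "Maria Lopez", "phone": "", "mobile_phone": "5550100199"}
record_c = {"name": "Maria Lopez", "phone": "5550177777", "mobile_phone": ""}

a_matches_b = share_related_value(record_a, record_b, phone_fields)
a_matches_c = share_related_value(record_a, record_c, phone_fields)
```

Records A and B match because the same number appears in Phone on one and Mobile Phone on the other. A comparison of the Phone field alone would have missed that pair.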
When using two or more fields to identify duplicates, records can still be considered matches even if one of the field values is blank. You just need to specify which field(s) allow a blank value.
Under Step 1, configure your matching rules in the Simple tab, then click the Conditions tab.
All the matching fields you included will automatically appear with the Value Required in All Records condition selected. Change the condition to Empty Allowed in Any Record to allow empty values for certain fields. You can also use the At Least One Record With Non-Empty condition to help you determine which is the master record. Make sure at least one field remains required and is a reliable unique identifier to ensure the records are really duplicates.
For example, on the Simple tab, you may have the matching fields: First Name, Last Name, and Phone Number. But on some of your records, the Phone Number field may be empty. With Empty Allowed in Any Record or At Least One Record With Non-Empty selected for Phone Number, records with the same name are considered duplicates when their phone numbers are either identical or blank.
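The conditions can be read as checks applied to one field across all records in a candidate group. The Python sketch below models three of them. The condition strings are hypothetical labels, and the logic reflects one reading of the descriptions in this article rather than Insycle's actual rules.

```python
def is_blank(value):
    """Treat None and whitespace-only strings as empty."""
    return value is None or str(value).strip() == ""


def field_passes(values, condition):
    """Check one field's values across a candidate duplicate group."""
    filled = [value for value in values if not is_blank(value)]
    all_filled_agree = len(set(filled)) <= 1
    if condition == "value_required_in_all":
        return len(filled) == len(values) and all_filled_agree
    if condition == "empty_allowed_in_any":
        return all_filled_agree
    if condition == "at_least_one_non_empty":
        return len(filled) >= 1 and all_filled_agree
    raise ValueError(f"unknown condition: {condition}")


# Phone Number values for three records that share a first and last name.
phones = ["555-0100", "", "555-0100"]

strict = field_passes(phones, "value_required_in_all")
lenient = field_passes(phones, "empty_allowed_in_any")
one_filled = field_passes(phones, "at_least_one_non_empty")
```

With one blank phone number in the group, the strict condition rejects the group, while the other two accept it because the filled values agree.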
Most of the time when Insycle can't find duplicates, it is due to your matching rules in Step 1. To better understand how to set up your rules, it is important to analyze the underlying data. A useful exercise is to set up your matching fields to look for exact matches of just First Name and Last Name.
When you click the Find button, these rules can show you a broad overview of what duplicates are potentially in your database, and what fields might be useful to include in your matching fields. These settings are just for discovery and should not be used for a final merge operation; many people can have the same first and last names and are not duplicates.
To get further context, click the gear button on the right side of the Record Viewer pane. Here, you can add any field in your database as a column to the Record Viewer to better understand the data inside of these records.
It can take a while for Insycle to find and match duplicates if the fields being used to identify them have very long values. The longer the values, the longer it takes Insycle to process the data and generate the results. This might come up when looking for matches based on long ID numbers, LinkedIn bio links, or other URLs with long strings attached (for example, https://www.linkedin.com/in/svadin%C3%ADr-n%C4%9Bmec-1234b31a3/).
You can speed this up by limiting how much of the value Insycle looks at.
If the beginning or ending portion of each value is unique, you can limit the comparison to the first or last several characters using the Match Parts parameter under Step 1.
Or use the Ignore Text (Substrings) parameter, then click the Terms button.
On the Ignored Text tab of the popup, add the common portion of the URL or text string.
For a complete guide to troubleshooting issues with Insycle, please refer to our article on Troubleshooting Issues.
Frequently Asked Questions
Yes. You can select individual records within a duplicate group for manual merging. Under Step 2, select the duplicate group. Then, select the records that you would like to merge.
To help you analyze and determine which records are the right ones to merge, you can change the fields that show up in this preview under Step 2 by clicking on the gear button to alter the layout.
Related Help Articles
- Module Overview: Merge Duplicates
- Deduplication Best Practices
- Customize Bulk Deduplication Using Exclusions and Pre-defined Masters
- Deduplicate Salesforce Contacts, Leads, Accounts, and Other Objects in Bulk
- Deduplicate HubSpot Contacts, Companies, and Deals in Bulk
Related Blog Articles