How to Analyze and Merge Duplicates Manually for Granular Deduplication Control
You have duplicate records in your CRM but you need a controlled, careful process to merge these records, so bulk merging isn't an option.
With the Merge Duplicates module, you can surface duplicates, analyze the records to determine which have the relevant data, and merge them manually, one at a time.
Process Summary
- Identify duplicates.
- Review and analyze the duplicates.
- Select the records that need merging.
- Choose the master record.
- Individually select or modify values to be retained.
- Merge duplicate records.
Step-by-Step Instructions
Navigate to the Merge Duplicates module, pick the record type, and explore the default templates for a pre-built solution.
To find duplicates, you need to define how to match records. Step 1 looks through the records in your database, examining the fields that you specify for matches. Each row is for a field you want to look at for duplicates.
For example, to find duplicate Contacts you may use the "First Name," "Last Name," and "Email Domain" fields. Contacts with the same first name AND last name AND email domain will show as possible duplicates.
Choose fields that, in combination, give a high degree of certainty that the matched records are duplicate records.
See the Advanced How-Tos for more details on selecting fields to use and narrowing your results with the filter.
When finished, click the Find button, and Insycle will generate a list of duplicates for you to review.
Expand Criteria for Matching Duplicates
If you'd like to look at the data in two different fields (that contain similar data) as if it were one, you can set up Related Fields under the Advanced tab. For example, you might want to look at both the Email and Additional Email fields for duplicate values.
The Conditions tab provides rules one or more of the records in a duplicate group will need to meet.
- Value Required in All Records - Each record must contain a value in this field to be considered a duplicate.
- Empty Allowed in Any Record - A record can still be considered a duplicate if this field is blank. Allowing empty values requires using two or more fields to identify duplicates.
- At Least One Record With Non-Empty - At least one record in the duplicate group must contain a value.
- At Least One Record Match - At least one record in the duplicate group must match the specified value, and the other records cannot be blank. If none of the records have the specified value, the duplicate group will not be merged.
- Only One Record Match - If more than one record in a duplicate group contains the specified field value, the duplicate group is skipped (not merged).
- Within Timeframe - Set a time parameter that can find duplicates created or modified within a specific timeframe, such as the last 30 minutes.
Records that have the same values in the fields specified in Step 1 are considered matches. When two or more records represent the same entity (person, company, or other), they are clustered together into duplicate groups. Each duplicate group shows the total number of records that were identified as duplicates. For example, if you had four records for the same person, it would count as one duplicate group with four records.
Check the box in a row to expand and see the records in the group.
Explore the record data in the duplicate groups. Double-check to make sure that the fields you set up in Step 1 are showing what you expected.
Add more columns to the view using the gear button on the right to help your analysis.
Select Manual mode to have complete control over which records are merged. You'll work with a single duplicate group through the entire merge process.
When you select Manual, an additional set of checkboxes will appear in Step 2 beside the individual records in each duplicate group. When you check the boxes, you are choosing which records will be merged. The data in unchecked records will not be merged.
Under Step 2, click the checkbox by the duplicate group, then select the individual records you want to merge together.
The master is the record that will remain after the duplicates are merged. If you select three records and merge them, the other two will not exist anymore. By default, data from your chosen master will be retained, and if there are any blank values, this data will automatically be filled in from the other records. If you'd like more control over the data saved in the master record, that is done in Step 5.
Under Step 4, choose the master record that the other records will be merged into.
If you want to control which values are kept, you can choose or edit specific fields under Step 5. This is an optional step—if you don’t pick specific values, the platform's default merging logic will be followed.
Under Step 5, only five fields are initially displayed, so to see all of the fields, change the number of rows shown per page.
There are three tabs showing different field sets:
- The Conflicts tab shows only the fields that have differences in the values, making it easy to focus only on fields that need attention
- The Read Only tab lists only the non-writable fields from your database
- The Full tab shows all of the record fields, even those without data
Use the search to find specific fields.
On a field-by-field basis, select which values to keep and merge into the master. For example, you could choose the Buying Role from one and Company Name from another.
To edit the individual field values one by one, hover over a value you want to update and click the pencil icon.
In the Edit popup, update the value and click Save.
Apply the Merge to Your CRM
When you have all critical fields chosen, click the Merge button at the bottom to merge your selected duplicates.
The Merge Result popup will give you a summary of the merge operation results.
Once you've run the merge operation, Insycle will generate a CSV report and send it to your email. You can use this to review the changes made through Insycle.
With the Activity Tracker, you can download a CSV report at any time that shows all the changes made in a given operation run.
Navigate to Operations > Activity Tracker, search by module, app, or template name. The operations run in Manual Mode are indicated by the Manual checkbox. Click the Run ID for the operation to download the CSV report.
When To Use Manual Merging
Manual merge is great when you have only a handful of duplicates to address, need to merge records carefully, want to employ a manual review process when merging, or just want to explore a few duplicates to understand what you have and how best to merge them.
In most cases, large datasets are a better candidate for bulk deduplication.
For situations where you have a large number of records to fix, but there are no common rules you can apply for identifying duplicates, you can create a CSV and use the Magical Import module.
In these cases, you can use CSV files to customize your bulk merging, designate master records, and exclude records from deduplication. Then you can import the CSV from the Magical Import, and use the Merge Duplicates module for complete control over the final merge operation. Learn how to customize merging Duplicates in bulk using a CSV.
Advanced How-Tos
Each row in your matching fields setup is cumulative, so records must meet all of the criteria. For example, looking for records that have the same First Name and Last Name and Phone Number returns only results where all three values are the same.
The minimum required length for the matching values is four characters or more. Values such as "Joe" or "Ace" will be disregarded.
Pick a field that you think has some duplicate values.
Running a very simple match operation like just First and Last Name is okay for giving you an idea of what you have, but it is too broad to use for reliable analysis and deduplication. There may be legitimate duplicate names–different people with the same first and last name. You need additional, unique criteria to narrow it down.
Choosing Unique Identifiers
Matching duplicates requires unique identifiers—data that is unlikely to be shared by any other record unless it is a duplicate. If you don't use unique identifiers, you are likely to identify unrelated records as duplicates and may accidentally merge them.
Many CRMs match first names, last names, and email addresses. If all of those match, or are similar, you can confidently determine that the record is a duplicate.
Other unique identifying fields that are commonly used in deduplication include:
-
- Phone number
- Domain name
- Mailing address
- ID number
Define what kind of likeness to look for when deciding if field values should be considered a match.
It's a good idea to start with Exact Match and easy-to-find duplicates. Iterate through fields and rules you know will surface duplicates, then look for edge cases. Similar Match can be helpful for finding those.
- Exact Match looks for values that match exactly, with no differences from one record to the next. Any unique identifying fields should use Exact Match.
-
Similar Match looks for values that may be close but with a one-character difference (like a typo, extra character, or missing character) and broadens the search. This search behaves like when Google shows results for a slightly different term, or says “Did you mean...”
For example, if a Company Name of, “Acme” is found, it could include records with the Company Name values “Akme, acm, Acma,” etc., as a match.
Similar Match uses looser criteria that cast a wider net for what can be considered duplicates. It is best to try Similar Match with very open and generic fields after trying everything else. When you do use it, make sure to carefully review the results to ensure the duplicates being identified are what you're expecting.
If using ID fields to identify duplicates, note that they will only work with Exact Match, not Similar Match.
Specify parts of a field value to ignore, such as specific text, whitespace, or characters. These will not be considered part of the matching process.
- Ignore Symbols and Whitespace when comparing phone numbers.
- Ignoring HTTP, www, subdomain, or top-level domain (.com vs co.uk) when comparing websites or email domains is a great way to catch more advanced duplicates.
- Insycle comes preloaded with terms to ignore. If you select Common Terms, click the Terms button to view and edit this list on the Common Terms tab.
- If you select Text (substrings), click the Terms button, then the Ignored Text tab, and enter text to be ignored. Separate multiple substrings (or phrases) with a new line.
Note: If you’ve set up Ignored terms or strings, don’t forget to also enable them. Select the Ignored > Common Terms or Text (substrings) checkbox.
Define specific portions of the field value to compare.
Compare the entire value, the first word, any two words, just the first five characters, last nine characters, etc.
Sometimes, you might want to match duplicates using data in two separate fields. For example, you might want to compare your Phone Number field to a Mobile Phone Number field to identify duplicates.
Using the Related Fields feature, you can use two different fields (that contain similar data) as matching fields to catch more duplicates.
You can set up Related Fields in the Advanced tab.
Common Examples of Related Field Matching
Matching Field | Related Fields |
---|---|
Phone Number | Mobile Phone Number, Company Phone |
Email Domain | Website, Company Domain |
Additional Email Addresses | |
Address | Company Address |
The following unique identifying fields, in combination, give a high degree of certainty that the matched records are truly duplicates that should be merged:
- First Name + Last Name
- Company Name
- Email Domain
- Company Website
- Phone Number
- ID Numbers
Use the filter to work with a segment or smaller pool of records. Then Insycle will only analyze the remaining records for duplicates. To add filters, click the Filter button on Step 1, then choose the field to look at, select the condition, and set the value to look for. The filter is applied before the matching step runs.
You may want to use a filter if:
- You know you only want to work with a subset of your data. In this case, there’s no need to run the operation on your whole database.
- There are an overwhelming number of duplicate results. Add a filter to work with a reasonably sized subset while you work to get the configuration right.
- You want the operation to run faster. A refined segment can speed things up since there are fewer records to analyze.
Most of the options in the Field dropdown match the fields that are found in your CRM, and for Contact records, there are three additional options related to the Email value:
- Email Username: The portion of the email address before the “@.” For example, if the email address were “maria@acmewidgets.com,” the username value would be “maria.”
- Free Email Provider Domain: Choose True to filter out records where the email domain is Gmail, Hotmail, Yahoo, and about 10,000 other free email providers. This filter helps ensure these are real clients, or can determine which record is the legitimate one because it’s most likely customer companies aren't using free Gmail accounts (though a contact may have accidentally emailed us from it at some point).
- Email Top-Level Domain: The top-level domain (TLD) is everything that follows the final dot of a domain name. For example, in the domain name acmewidgets.com', '.com' is the TLD. Some other popular TLDs include '.org', '.uk', and '.edu'.
When customers encounter an issue when trying to make a transaction, they often seek help from one of your support channels. However, whenever a contact is created from a chat, like Facebook Messenger, Hubspot Chat, and others, very little information is provided—usually just a name and timestamp. This makes finding other instances of the same contact, such as their customer record, difficult.
With the Merge Duplicates module, under Step 1, you can use the Conditions tab to match contacts with the same name that were created or modified within the same period of time.
First, select the fields in the Simple tab. Then, on the Conditions tab, select the Within Timeframe condition and set the Minutes, Hours, or Days criteria.
When using two or more fields to identify duplicates, records can still be considered matches even if one of the field values is blank. You just need to specify which field(s) allow a blank value.
Under Step 1, configure your matching rules in the Simple tab, then click the Conditions tab.
All the matching fields you included will automatically appear with the Value Required in All Records condition selected. Change the condition to Empty Allowed in Any Record to allow empty values for certain fields. You can also use the At Least One Record with Non-Empty condition to help you determine which is the master record. Make sure at least one field remains required and is a reliable unique identifier to ensure the records are really duplicates.
For example, on the Simple tab, you may have the matching fields: First Name, Last Name, and Phone Number. But on some of your records, the Phone Number field may be empty. Using the Empty Allowed in Any Record or At Least One Record with Non-Empty, all records with the same name, same phone number, and no phone number will be considered duplicates.
Troubleshooting
Most of the time, when Insycle can't find duplicates, it is due to your matching rules in Step 1. It is important to analyze the underlying data to better understand how to set up your rules. A useful exercise can be to set up your matching filters to look for exact matches of just First Name and Last Name.
When you click Find, these rules can show you a broad overview of what duplicates are potentially in your database and what fields might be useful to include in your matching fields. These settings are just for discovery and should not be used for a final merge operation; many people can have the same first and last names and are not duplicates.
To get further context, on Step 2, click the layout gear button on the right side of the title bar. Here, you can add any field in your database as a column to the duplicate group review to better understand the data inside these records.
It can take a while for Insycle to find and match duplicates if the fields being used to identify them have very long values. The longer the values, the longer it takes Insycle to process the data and generate the results. This might come up when looking for matches based on long ID numbers, LinkedIn bio links, or other URLs with long strings attached (ex, https://www.linkedin.com/in/svadin%C3%ADr-n%C4%9Bmec-1234b31a3/).
You can speed this up by limiting how much of the value Insycle looks at.
If the beginning or ending portion of the values are all unique, you can limit the comparison to the first or last several characters using the Match Parts parameter under Step 1.
Or use the Ignore Text (Substrings) parameter, then click the Terms button.
On the Ignored Text tab of the popup, add the common portion of the URL or text string.
For a complete guide to troubleshooting issues with Insycle, please refer to our article on Troubleshooting Issues.
Frequently Asked Questions
Yes. You can select individual records within a duplicate group for manual merging. Under Step 2, select the duplicate group. Then, select the records that you would like to merge.
To help you analyze and determine which records are the right ones to merge, you can change the fields that show up in this preview under Step 2 by clicking on the gear button to alter the layout.
Additional Resources
Related Help Articles
- Module Overview: Merge Duplicates
- Deduplication Best Practices
- Customize Bulk Deduplication Using Exclusions and Pre-defined Masters
- Deduplicate Salesforce Contacts, Leads, Accounts, and Other Objects in Bulk
- Deduplicate HubSpot Contacts, Companies, and Deals in Bulk
Related Blog Posts