Specialized Strategies to Find and Merge More Duplicates
Clean, accurate customer data is the foundation of effective sales, marketing, and customer service operations. Yet, duplicate records remain one of the most persistent challenges in CRM data management. While basic deduplication may catch obvious duplicates, not everything in your database will be so straightforward.
Insycle's advanced deduplication capabilities go beyond simple matching to tackle these complex scenarios. This article explores deduplication techniques that address everyday challenges like handling incomplete data, working with records from multiple systems, processing specific record segments, and identifying duplicates with inconsistent field values.
In real-world CRM data, information is rarely complete. Customer records may lack phone numbers, have incomplete addresses, or contain empty fields due to data being collected from various sources, by different people, or through different processes. Without flexible record matching, you might miss obvious duplicates if one field is empty.
Insycle's Empty Allowed in Any Record condition gives you the flexibility to identify these partial matches while still maintaining enough matching criteria to ensure accuracy.
In the Merge Duplicates module, on the Simple tab of Step 1, set up your matching fields to identify potential duplicates. To allow one field to be blank, you must have at least two matching fields. One field must remain required and serve as a reliable, unique identifier to confirm that the records are indeed duplicates.
On the Conditions tab, for the flexible field, set the condition to Empty Allowed in Any Record. Alternatively, you can use the At Least One Record with Non-Empty condition.
For example, on the Simple tab, you may have set up the matching fields:
- First Name
- Last Name
- Phone Number
However, the Phone Number field may be empty on some of your records. With either Empty Allowed in Any Record or At Least One Record with Non-Empty set on Phone Number, records that share the same name are treated as duplicates when their phone numbers match or when a phone number is missing.
For more detailed steps, see Merge Duplicates with Blank Fields.
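The matching rule in this example can be sketched in a few lines of Python. This is only an illustration of the idea, not Insycle's implementation: the `is_match` function, the record layout, and the field names are all assumptions made for the sketch.

```python
from itertools import combinations

def is_match(a, b, required, empty_allowed):
    """Hypothetical matcher: required fields must be filled and equal;
    empty-allowed fields may be blank on either side, but two filled
    values must still agree."""
    for field in required:
        if not a.get(field) or a.get(field) != b.get(field):
            return False
    for field in empty_allowed:
        va, vb = a.get(field), b.get(field)
        if va and vb and va != vb:
            return False
    return True

# Illustrative records, not real CRM data.
records = [
    {"id": 1, "first": "Ana", "last": "Diaz", "phone": "555-0100"},
    {"id": 2, "first": "Ana", "last": "Diaz", "phone": ""},
    {"id": 3, "first": "Ana", "last": "Diaz", "phone": "555-0199"},
]

pairs = [
    (a["id"], b["id"])
    for a, b in combinations(records, 2)
    if is_match(a, b, required=["first", "last"], empty_allowed=["phone"])
]
```

In this sketch, the record with the blank phone number pairs with each of the other two, while the two records with different phone numbers do not pair with each other. That is why keeping at least one reliable required field matters when a field is allowed to be empty.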
Insycle’s Related Fields matching is useful whenever important customer data might exist across different field types in your CRM. This approach is particularly valuable for phone numbers stored in multiple fields, email addresses captured through different channels, and physical addresses that might be split between billing and shipping records.
In the Merge Duplicates module, on the Advanced tab of Step 1, set up your matching fields to identify potential duplicates.
Select your primary matching field (e.g., "Phone Number"), and in the Related Fields dropdown, select one or more related fields you want to include in the matching (e.g., "Mobile Phone Number," "WhatsApp Phone Number").
When you run the deduplication process, Insycle will now check for matches across both your primary field and all related fields you've specified.
Common Examples of Related Field Matching
Matching Field | Related Fields |
Email | Additional Email Addresses, Company Email Address |
Phone Number | Mobile Phone Number, Company Phone |
Email Domain | Website, Company Domain |
Billing Address | Shipping Address |
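Conceptually, related-field matching compares the pooled values of a primary field and its related fields, so a number stored under Phone Number on one record can match the same number stored under Mobile Phone Number on another. The sketch below illustrates that idea only. The function names and field keys are hypothetical and do not reflect how Insycle works internally.

```python
def pooled_values(record, fields):
    """Collect the non-empty values a record holds across the given fields."""
    return {record[f] for f in fields if record.get(f)}

def related_match(a, b, fields):
    """Hypothetical rule: two records match if they share any pooled value."""
    return bool(pooled_values(a, fields) & pooled_values(b, fields))

# Primary field first, followed by the related fields (assumed key names).
PHONE_FIELDS = ["phone", "mobile_phone", "whatsapp_phone"]

a = {"phone": "555-0100", "mobile_phone": "", "whatsapp_phone": ""}
b = {"phone": "", "mobile_phone": "555-0100", "whatsapp_phone": ""}
c = {"phone": "555-0142", "mobile_phone": "", "whatsapp_phone": ""}

a_matches_b = related_match(a, b, PHONE_FIELDS)
a_matches_c = related_match(a, c, PHONE_FIELDS)
```

Here records `a` and `b` match even though the shared number sits in different fields, while `c` matches neither.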
CRM databases often contain records that represent the same entity despite having inconsistent data. The Values Don't Match condition in Insycle's Merge Duplicates module specifically targets these hard-to-find duplicates by identifying records that match on critical identifiers but have differences in other fields.
This powerful approach surfaces probable duplicates that standard exact-matching methods would miss.
For example, you can identify:
- Company records with different names due to acquisitions but sharing the same domain
- Contact records with varying job titles but matching email addresses
- Product records where naming conventions have changed but share the same product code
To set this up:
In Step 1 of the Merge Duplicates module, set up your matching fields. Then, on the Conditions tab, apply the Values Don't Match condition to specific fields where you expect there might be inconsistencies:
- Field 1: Company Name
- Condition 1: Values Don't Match
- Field 2: Company Domain
- Condition 2: Value Required in All Records
This approach would identify all the probable duplicate companies that require review, even though they have different name values.
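As a rough illustration of this configuration, the following sketch groups companies by domain and keeps only the groups whose names disagree. It treats Values Don't Match as "the group contains more than one distinct value," which is an assumption about the condition's semantics. The function and field names are hypothetical.

```python
from collections import defaultdict

def groups_with_conflicting_names(companies):
    """Group by domain (required on every record) and keep groups
    that contain at least two different company names."""
    by_domain = defaultdict(list)
    for company in companies:
        if company["domain"]:  # stands in for Value Required in All Records
            by_domain[company["domain"].lower()].append(company)
    return {
        domain: [c["name"] for c in group]
        for domain, group in by_domain.items()
        if len(group) > 1 and len({c["name"] for c in group}) > 1
    }

# Illustrative data only.
companies = [
    {"name": "Acme Corp", "domain": "acme.com"},
    {"name": "Acme Holdings", "domain": "acme.com"},
    {"name": "Globex", "domain": "globex.com"},
    {"name": "Globex", "domain": "globex.com"},
    {"name": "Initech", "domain": ""},
]

review_queue = groups_with_conflicting_names(companies)
```

Only the `acme.com` group lands in the review queue. The two Globex records share both name and domain, so an exact-match template would already catch them, and the Initech record has no domain to match on.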
When customers run into a problem while making a transaction, they often turn to one of your support channels. However, when a contact is created from a chat tool such as Facebook Messenger or HubSpot Chat, very little information is captured, usually just a name and a timestamp. This makes it difficult to find other instances of the same contact, such as their customer record.
With the Merge Duplicates module, under Step 1, you can use the Conditions tab to match contacts with the same name that were created or modified within the same period of time.
First, select the fields in the Simple tab. Then, on the Conditions tab, select the Within Timeframe condition and set the Minutes, Hours, or Days criteria.
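The idea behind the Within Timeframe condition can be sketched as a name match combined with a time-window check. This is a simplified illustration, not Insycle's logic: the `within_timeframe` helper, the `created` key, and the case-insensitive name comparison are all assumptions.

```python
from datetime import datetime, timedelta

def within_timeframe(a, b, window):
    """Hypothetical rule: same name (ignoring case and outer spaces)
    and creation times no further apart than the window."""
    same_name = a["name"].strip().lower() == b["name"].strip().lower()
    close_in_time = abs(a["created"] - b["created"]) <= window
    return same_name and close_in_time

# Illustrative records only.
chat_contact = {"name": "Ana Diaz", "created": datetime(2025, 2, 14, 10, 5)}
customer = {"name": "ana diaz", "created": datetime(2025, 2, 14, 10, 40)}
older = {"name": "Ana Diaz", "created": datetime(2024, 11, 2, 9, 0)}

window = timedelta(hours=1)  # corresponds to choosing Hours = 1
recent_pair = within_timeframe(chat_contact, customer, window)
stale_pair = within_timeframe(chat_contact, older, window)
```

The chat contact and the customer record were created 35 minutes apart, so they pair up under a one-hour window. The record from months earlier shares the name but falls outside the window and is left alone.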
When managing your CRM data, you often need to check whether a specific set of records already exists, for example after your marketing team imports 500 leads from a trade show. In that case, you want to determine whether any of these leads are already in your CRM without running a full database deduplication.
Insycle provides a couple of ways for you to target just a specific subset of records and identify their potential matches in your existing database.
Option One: Create a Workflow
If you are working with HubSpot or Salesforce, you can create a workflow that enrolls only specific records into the Insycle deduplication operation. This approach effectively targets records based on dynamic criteria that may exist in your CRM. Note that this option is not suitable for large volumes of records, as workflows can considerably slow down systems.
For example, in HubSpot, you could develop a workflow that enrolls contacts created on a specific date, from a particular form submission, or with a designated lead source value. The workflow would then trigger an Insycle Deduplicate Recipe to check just these records against your existing database.
Option Two: Use the "Only One Record Match" Condition
In Insycle, the Only One Record Match condition requires that exactly one record in each duplicate group meets a specific criterion. If multiple records in a duplicate group have the specified field value, that group will be bypassed and not merged. When the criterion is a tag applied only to your target records, each qualifying group pairs a single tagged record with its untagged matches in the existing database. This approach is effective for large-volume deduplication (hundreds or thousands of records) and for recurring batch deduplication operations.
You can create a custom field and apply it to your target records by following these steps:
- Create a custom field in your CRM. This can be a picklist or text field, such as "Custom Tag," or a true/false field, like "Needs Dedup Check."
- Update this field for your target records to a named identifier, such as "2025.02.14 Expo Leads" or, if true/false, "True."
- In Insycle, under Step 1 on the Conditions tab, set up this field condition:
- Field: [Your field name]
- Condition: Only One Record Match
- Value: [Your custom identifier]
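To make the effect of this condition concrete, here is a minimal sketch that keeps only the duplicate groups in which exactly one record carries the tag. The `only_one_record_match` helper, the `custom_tag` key, and the sample groups are hypothetical; the sketch mirrors the rule described above rather than Insycle's actual code.

```python
def only_one_record_match(group, field, value):
    """True when exactly one record in the group has the given field value."""
    return sum(1 for record in group if record.get(field) == value) == 1

TAG = "2025.02.14 Expo Leads"

# Duplicate groups keyed by the matching email (illustrative data).
groups = {
    "ana@example.com": [{"id": 1, "custom_tag": TAG}, {"id": 2, "custom_tag": ""}],
    "bo@example.com": [{"id": 3, "custom_tag": TAG}, {"id": 4, "custom_tag": TAG}],
    "cy@example.com": [{"id": 5, "custom_tag": ""}, {"id": 6, "custom_tag": ""}],
}

eligible = [
    key for key, group in groups.items()
    if only_one_record_match(group, "custom_tag", TAG)
]
```

Only the first group qualifies. The second is bypassed because two records carry the tag, and the third contains no tagged record at all, so it falls outside the targeted subset.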
When deduplicating your CRM data, it’s important to preserve records that sync with external systems like Salesforce to avoid disrupting integrations. Records that sync with external systems often contain vital integration IDs and relationship data. If these records are merged or deleted, it may disrupt connections between your systems, leading to data sync failures or loss of critical information.
Insycle provides an easy way to exclude these synced records from your deduplication process.
How to Exclude Records with Integration IDs
In the Merge Duplicates module, set up your matching fields in Step 1 to identify potential duplicates.
Before running the operation, click the Filter button at the bottom of Step 1.
In the Advanced Settings popup, select the appropriate integration ID field (e.g., "Salesforce Account ID" or "Shopify Customer ID"), and select the condition doesn't exist to only include records that don’t have integration IDs.
Click Search to apply your filter.
With this filter in place, clicking Find in Step 1 returns only records that have no external system connection, so your integration points are preserved while purely internal duplicates are still cleaned up.
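The filter behaves like a pre-selection step that runs before any matching. The sketch below illustrates it with a hypothetical `without_integration_id` helper and an assumed `salesforce_account_id` key; it is not tied to Insycle's internals.

```python
def without_integration_id(records, id_field):
    """Keep only records whose integration ID field is missing or empty,
    mirroring a "doesn't exist" filter."""
    return [record for record in records if not record.get(id_field)]

# Illustrative contacts only.
contacts = [
    {"id": 1, "email": "ana@example.com", "salesforce_account_id": "001A000001"},
    {"id": 2, "email": "ana@example.com", "salesforce_account_id": ""},
    {"id": 3, "email": "ana@example.com", "salesforce_account_id": None},
]

candidates = without_integration_id(contacts, "salesforce_account_id")
candidate_ids = [record["id"] for record in candidates]
```

Record 1 never enters the duplicate search, so it cannot be merged, while records 2 and 3 remain available for deduplication.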
When deduplicating records, some fields contain critical information essential for your business operations. It is important to ensure that merged records retain this necessary data.
For example, you may want to ensure that when merging customer records, at least one has integrated system IDs, a lead score, an assigned account manager, payment information, or subscription status. Without this condition, you risk accidentally merging records where none contain this essential data.
The At Least One Record With Non-Empty condition in Insycle's Merge Duplicates module ensures that at least one record in a duplicate group contains a value in a specified field. Any duplicate group in which every record has that field empty is skipped during the merge process.
Configuring the Non-Empty Field Requirement
In Step 1 on the Simple tab, set up your matching fields to identify potential duplicates. Additionally, include the field you want to ensure has a value.
On the Conditions tab, for the key field, set the condition to At Least One Record With Non-Empty.
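As a simple illustration of this requirement, the sketch below drops any duplicate group in which no record has a value for the key field. The helper name and the `account_manager` key are assumptions made for the example.

```python
def at_least_one_non_empty(group, field):
    """True when at least one record in the group has a value for the field."""
    return any(record.get(field) for record in group)

# Illustrative duplicate groups.
duplicate_groups = [
    [{"id": 1, "account_manager": "Priya"}, {"id": 2, "account_manager": ""}],
    [{"id": 3, "account_manager": ""}, {"id": 4, "account_manager": None}],
]

mergeable = [
    group for group in duplicate_groups
    if at_least_one_non_empty(group, "account_manager")
]
mergeable_ids = [[record["id"] for record in group] for group in mergeable]
```

The first group is kept because one record has an account manager. The second is skipped because the field is empty on every record.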
You often need to apply multiple templates for different duplicate scenarios, but you don’t want to reprocess records you’ve already addressed. To resolve this, you can use the Values Don’t Match condition in Step 1.
For example, suppose you have already run a template that identifies and merges records with exactly matching email addresses.
To apply different criteria while skipping already processed records, configure matching rules for your next template to address the additional criteria.
In Step 1 on the Simple tab, add the field criteria you want to use to identify duplicates. Additionally, include the field from your previous template, which is Email in this case.
On the Conditions tab, for the Email field, set the condition to Values Don’t Match to exclude records already processed by the first template.
When you need detailed control over which records are merged during deduplication, Insycle enables you to customize the process with CSV files. This approach is especially useful when:
- You need to designate specific records as masters
- Some records need to be excluded from the deduplication process
- Your records don't have common identifiers for automatic matching
Start by generating a preview CSV of duplicates in the Merge Duplicates module. Edit the CSV to include "Deduplication Exclude" and "Deduplication Master" columns, marking TRUE for records that should be excluded or designated as masters. Create custom boolean/checkbox fields in your CRM to store these designations, and then use the Magical Import to tag the records. Lastly, utilize the Merge Duplicates module to finalize the process based on the tags.
This method provides complete control over which records are merged and which are maintained as masters, even in complex deduplication scenarios where standard matching rules may fall short. Discover how to customize the bulk merging of duplicates using a CSV.
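If the preview CSV is large, the two tag columns can be added with a short script rather than by hand. The following is a minimal sketch using Python's standard `csv` module. The "Deduplication Exclude" and "Deduplication Master" column names come from the steps above, while the `Record ID` and `Email` columns, the sample rows, and the `tag_rows` helper are assumptions, since the actual preview layout depends on your CRM and template.

```python
import csv
import io

# Hypothetical preview export; real column names will differ.
preview_csv = """Record ID,Email
101,ana@example.com
102,ana@example.com
103,ana@example.com
"""

def tag_rows(csv_text, masters, excluded):
    """Append the two tag columns and mark TRUE for the chosen record IDs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + [
        "Deduplication Exclude",
        "Deduplication Master",
    ]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["Deduplication Exclude"] = "TRUE" if row["Record ID"] in excluded else ""
        row["Deduplication Master"] = "TRUE" if row["Record ID"] in masters else ""
        writer.writerow(row)
    return out.getvalue()

tagged = tag_rows(preview_csv, masters={"101"}, excluded={"103"})
```

The resulting text can be saved as a CSV file and imported to populate the matching checkbox fields in your CRM.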
Additional Scenarios
Depending on your scenario, there are several ways to exclude records from the deduplication process using Insycle.
Option One: Filter Out Records with Specific Properties
The simplest way to exclude records is to use a filter in Step 1. After you’ve set up your matching fields, click the Filter button. Select the field and set up the condition that identifies records you want to avoid.
For example, with the filter set to Salesforce Account ID doesn't exist, any record that has a Salesforce Account ID is omitted from the deduplication.
Option Two: Manually Label Records to Exclude with a CSV
For situations where there are no common rules to identify duplicates for all or some records, you may need more granular control for picking records to include or exclude from the process.
In these cases, you can use CSV files to customize your bulk merging, designate master records, and exclude records from deduplication. Afterward, you can import the CSV using the Magical Import feature and utilize the Merge Duplicates module for complete control over the final merging operation. Discover how to customize merging duplicates in bulk using a CSV.
Option Three: Skip Records You’ve Already Deduplicated
If you have already run your standard template and now want to target variations without reprocessing previously handled records, you can resolve this by using the Values Don’t Match condition in Step 1.
On your follow-up template, configure matching rules to address the additional criteria and skip the previous one:
In Step 1 on the Simple tab, add the field criteria you want to use to identify duplicates. Additionally, include the field from your previous template, which is Email in the example.
On the Conditions tab, for the Email field, set the condition to Values Don’t Match to exclude records already processed by the first template.
Sometimes, you might want to match duplicates using data in two separate fields. For example, you might want to compare an Email Address field to an Additional Email Addresses field to identify duplicates.
Using the Related Fields feature, you can use two different fields (that contain similar data) as matching fields to catch more duplicates.
You can set up Related Fields in the Advanced tab of Step 1.
When using two or more fields to identify duplicates, records can still be considered matches even if one of the field values is blank. You just need to specify which field(s) allow a blank value.
Under Step 1, configure your matching rules in the Simple tab, then click the Conditions tab.
All the matching fields you included will automatically appear with the Value Required in All Records condition selected. Change the condition to Empty Allowed in Any Record to allow empty values for certain fields. You can also use the At Least One Record with Non-Empty condition to help you determine which is the master record. Make sure at least one field remains required and is a reliable, unique identifier to ensure the records are really duplicates.
For example, on the Simple tab, you may have the matching fields: First Name, Last Name, and Phone Number. But on some of your records, the Phone Number field may be empty. With either Empty Allowed in Any Record or At Least One Record with Non-Empty set on Phone Number, records that share the same name are treated as duplicates when their phone numbers match or when a phone number is missing.
Additional Resources
Related Help Articles
- Module Overview: Merge Duplicates
- Deduplication Best Practices
- Merge Duplicates with Blank Fields
- Customize Bulk Deduplication Using Exclusions and Pre-Defined Masters
- Manually Merge Duplicates