Deduplication Scenarios: Advanced Techniques for Duplicate Management

Specialized Strategies to Find and Merge More Duplicates

Clean, accurate customer data is the foundation of effective sales, marketing, and customer service operations. Yet, duplicate records remain one of the most persistent challenges in CRM data management. While basic deduplication may catch obvious duplicates, not everything in your database will be so straightforward.

Insycle's advanced deduplication capabilities go beyond simple matching to tackle these complex scenarios. This article explores deduplication techniques that address everyday challenges like handling incomplete data, working with records from multiple systems, processing specific record segments, and identifying duplicates with inconsistent field values.

Identify Duplicates Despite Blank Fields

In real-world CRM data, information is rarely complete. Customer records may lack phone numbers, contain incomplete addresses, or include empty fields because data was collected from various sources, by different people, or through different processes. Without flexible record matching, you might miss obvious duplicates if one field is empty.

Insycle's Empty Allowed in Any Record condition gives you the flexibility to identify these partial matches while still maintaining enough matching criteria to ensure accuracy.

To allow one field to be blank, you must have at least two matching fields. One field must remain required and serve as a reliable, unique identifier to confirm that the records are indeed duplicates.

To set the condition on a field:

  1. Open the Merge Duplicates module
  2. On the Simple tab of Step 1, set up your matching fields to identify potential duplicates. 
  3. Click the Conditions tab.
  4. Select the Condition option to configure the flexible field. Set the condition to Empty Allowed in Any Record. Alternatively, you can use the At Least One Record with Non-Empty condition.

Note that you must always have at least one field set to "Value Required in All Records".

step-1-conditions-empty-not-empty.png

For example, on the Simple tab, you may have set up the matching fields:

  • First Name
  • Last Name
  • Phone Number

If the Phone Number field might be empty in some of your records, setting its condition to Empty Allowed in Any Record or At Least One Record with Non-Empty lets Insycle treat records with the same first and last name as duplicates when their phone numbers match or when a phone number is missing.

step-1-allow-empty-review.png
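The name and phone example above can be sketched in a few lines of Python. This is only an illustration of the matching idea, not Insycle's implementation: the helper names (`is_blank`, `fields_match`) and the record keys (`first`, `last`, `phone`) are hypothetical, and the sketch assumes a blank value on either side should not block a match.

```python
from itertools import combinations

def is_blank(value):
    """Treat None and whitespace-only strings as empty."""
    return value is None or str(value).strip() == ""

def fields_match(a, b, required, empty_allowed):
    """Return True when two records agree on every matching field.

    required: fields that must be non-empty and equal in both records.
    empty_allowed: fields that must be equal unless either side is blank.
    """
    for field in required:
        if is_blank(a.get(field)) or is_blank(b.get(field)) or a[field] != b[field]:
            return False
    for field in empty_allowed:
        if is_blank(a.get(field)) or is_blank(b.get(field)):
            continue  # a blank value never blocks the match
        if a[field] != b[field]:
            return False
    return True

# Hypothetical sample records
records = [
    {"id": 1, "first": "Ana", "last": "Diaz", "phone": "555-0100"},
    {"id": 2, "first": "Ana", "last": "Diaz", "phone": ""},
    {"id": 3, "first": "Ana", "last": "Diaz", "phone": "555-0199"},
]

pairs = [
    (a["id"], b["id"])
    for a, b in combinations(records, 2)
    if fields_match(a, b, required=["first", "last"], empty_allowed=["phone"])
]
print(pairs)
```

In this sketch, records 1 and 3 do not pair with each other because their phone numbers differ, but each pairs with record 2, whose phone is blank. That is why a reliable required field matters, and why reviewing the results before merging is worthwhile.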

For more detailed steps, see Merge Duplicates with Blank Fields.

Search Data Across Multiple Fields with Related Matching

Insycle’s Related Fields matching is useful whenever important customer data might exist across different field types in your CRM. This approach is particularly valuable for phone numbers stored in multiple fields, email addresses captured through different channels, and physical addresses that might be split between billing and shipping records.  

To add related fields:

  1. Open the Merge Duplicates module
  2. On the Simple tab of Step 1, set up your matching fields to identify potential duplicates. 
  3. Click the Advanced tab.
  4. In the Related Fields dropdown, select up to 2 fields to group for matching.

For example, if your matching field is "Phone Number," you could select "Mobile Phone Number" and "WhatsApp Phone Number" from the Related Fields dropdown. When you run the deduplication process, Insycle checks for matches across all three phone fields.

step-1_related-field.png
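To picture what matching across related fields means, here is a rough Python sketch. It assumes that two records match when any value in one record's grouped fields equals any value in the other's; the field keys and helper names are hypothetical, and this is not Insycle's code.

```python
def grouped_values(record, fields):
    """Collect the non-empty values a record holds across the grouped fields."""
    return {record[f] for f in fields if record.get(f)}

def related_match(a, b, fields):
    """Match when any grouped field of one record equals any grouped field of the other."""
    return bool(grouped_values(a, fields) & grouped_values(b, fields))

# Hypothetical field keys standing in for Phone Number and its two related fields
PHONE_FIELDS = ["phone", "mobile_phone", "whatsapp_phone"]

a = {"id": 1, "phone": "555-0100", "mobile_phone": "", "whatsapp_phone": ""}
b = {"id": 2, "phone": "", "mobile_phone": "555-0100", "whatsapp_phone": ""}
c = {"id": 3, "phone": "555-0142", "mobile_phone": "", "whatsapp_phone": "555-0177"}

print(related_match(a, b, PHONE_FIELDS))  # same number stored in different fields
print(related_match(a, c, PHONE_FIELDS))  # no number in common
```

Records `a` and `b` would be missed by a comparison of the Phone Number field alone, because the shared number sits in different fields.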

Common Examples of Related Field Matching

Matching Field    Related Fields
Email             Additional Email Addresses, Company Email Address
Phone Number      Mobile Phone Number, Company Phone
Email Domain      Website, Company Domain
Billing Address   Shipping Address

Find Records with Inconsistent Values That Are Probably Duplicates

CRM databases often contain records that represent the same entity despite having inconsistent data. The Values Don't Match condition in Insycle's Merge Duplicates module specifically targets these hard-to-find duplicates by identifying records that match on critical identifiers but have differences in other fields.

This powerful approach surfaces probable duplicates that standard exact-matching methods would miss.

For example, you can identify:

  • Company records with different names due to acquisitions but sharing the same domain
  • Contact records with varying job titles but matching email addresses
  • Product records where naming conventions have changed but share the same product code

To set this up:

  1. Open the Merge Duplicates module
  2. On the Simple tab of Step 1, set up your matching fields to identify potential duplicates. 
  3. Click the Conditions tab.
  4. Select the Condition option, Values Don't Match, for fields where you suspect there might be inconsistencies.

Note that you must always have at least one field set to "Value Required in All Records".

For example, if you are looking for duplicate companies and notice a lot of variation in company names for the same domain, you could configure the conditions as follows:

  • Field 1: Company Name
    • Condition 1: Values Don't Match
  • Field 2: Company Domain
    • Condition 2: Value Required in All Records

This approach would identify all the probable duplicate companies that require review, even though they have different name values.

merge-duplicates-hubspot-companies-step-1-conditions-name-dont-match-646px.png
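The company example above can be approximated with a short Python sketch. It assumes that Value Required in All Records on the domain means "group by a non-empty domain" and that Values Don't Match on the name means "keep only groups whose names differ." The function and key names are hypothetical, and this is an illustration rather than Insycle's actual logic.

```python
from collections import defaultdict

def groups_with_differing_names(companies):
    """Group companies by domain and keep groups whose names are not all identical."""
    by_domain = defaultdict(list)
    for company in companies:
        domain = (company.get("domain") or "").strip().lower()
        if domain:  # the domain is the required field, so blanks are skipped
            by_domain[domain].append(company)
    return {
        domain: group
        for domain, group in by_domain.items()
        if len(group) > 1 and len({c["name"] for c in group}) > 1
    }

# Hypothetical sample data
companies = [
    {"id": 1, "name": "Acme Inc", "domain": "acme.example"},
    {"id": 2, "name": "Acme Holdings", "domain": "acme.example"},
    {"id": 3, "name": "Globex", "domain": "globex.example"},
    {"id": 4, "name": "Globex", "domain": "globex.example"},
]

flagged = groups_with_differing_names(companies)
print(sorted(flagged))
```

Only the Acme group is flagged here. The two Globex records share both name and domain, so an ordinary exact-match template would already catch them.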

Find Duplicates Created Within the Same Timeframe

When customers face an issue while attempting a transaction, they often reach out to your support channels. However, when a contact is created through a chat tool such as Facebook Messenger or HubSpot Chat, limited information is usually provided, typically just a name and a timestamp. This makes it hard to find other instances of the same contact, such as their customer record.

With the Merge Duplicates module, you can use Conditions to match contacts with the same name that were created or modified within the same period of time.

To set this up:

  1. Open the Merge Duplicates module
  2. On the Simple tab of Step 1, set up your matching fields to identify potential duplicates. 
  3. Click the Conditions tab.
  4. Select the Condition option, "Within Timeframe," and set the Minutes, Hours, or Days criteria. 

merge-duplicates-hubspot-contacts-step-1-conditions-created-within-15-minutes.png
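As a rough illustration of a timeframe condition, the Python sketch below pairs contacts that share a name and were created within 15 minutes of each other. The record keys and the `created_within` helper are hypothetical, and the comparison rule is an assumption about how such a condition could behave, not a description of Insycle's internals.

```python
from datetime import datetime, timedelta
from itertools import combinations

def created_within(a, b, window):
    """True when two records share a name and were created within `window` of each other."""
    same_name = a["name"].strip().lower() == b["name"].strip().lower()
    return same_name and abs(a["created"] - b["created"]) <= window

# Hypothetical contacts: two created minutes apart, one created the next day
contacts = [
    {"id": 1, "name": "Sam Lee", "created": datetime(2025, 2, 14, 9, 0)},
    {"id": 2, "name": "Sam Lee", "created": datetime(2025, 2, 14, 9, 10)},
    {"id": 3, "name": "Sam Lee", "created": datetime(2025, 2, 15, 16, 30)},
]

window = timedelta(minutes=15)
matches = [
    (a["id"], b["id"])
    for a, b in combinations(contacts, 2)
    if created_within(a, b, window)
]
print(matches)
```

A name alone is a weak identifier, so the narrow time window is what keeps unrelated people with the same name (contact 3 here) out of the group.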

Deduplicate a Segment Against the Rest of the CRM

When managing your CRM data, you often need to check whether a specific set of records already exists in your database. For example, your marketing team may have imported 500 leads from a trade show, and you want to determine whether any of these leads are already in your CRM without running a full database deduplication.

Insycle provides a couple of ways for you to target just a specific subset of records and identify their potential matches in your existing database. 

Option One: Create a Workflow

If you are working with HubSpot or Salesforce, you can create a workflow that enrolls only specific records into the Insycle deduplication operation. This approach effectively targets records based on dynamic criteria that may exist in your CRM. Note that this option is not suitable for large volumes of records, as workflows can considerably slow down systems.

For example, in HubSpot, you could develop a workflow that enrolls contacts created on a specific date, from a particular form submission, or with a designated lead source value. The workflow would then trigger an Insycle Deduplicate Recipe to check just these records against your existing database.

hubspot-workflow-contact-enroll-only-specific-date-646px.png

Option Two: Use the "Only One Record Match" Condition

In Insycle, the Only One Record Match condition requires that exactly one record in each duplicate group meets a specific criterion. If multiple records in a duplicate group have the specified field value, that group will be bypassed and not merged. This approach is effective for large volume deduplication (hundreds or thousands of records) or recurring batch deduplication operations.

You can create a custom field and apply it to your target records by following these steps:

In your CRM:

  1. Create a custom field in your CRM. This can be a picklist or text field, such as "Custom Tag," or a true/false field, like "Needs Dedup Check."
  2. Update this field for your target records to a named identifier, such as "2025.02.14 Expo Leads" or, if true/false, "True."

In Insycle:

  1. Open the Merge Duplicates module
  2. On the Simple tab of Step 1, set up your matching fields to identify potential duplicates. 
  3. Click the Conditions tab.
  4. Set up this Condition:
    • Field: [Your field name]
    • Condition: Only One Record Match
    • Value: [Your custom identifier]

merge-duplicates-hubspot-contacts-step-1-conditions-only-one match-custom-tag-646px.png
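The effect of requiring exactly one tagged record per group can be sketched in Python as follows. The `custom_tag` key and the helper name are hypothetical, and the duplicate groups are written out by hand here, whereas in Insycle they come from your matching fields.

```python
def exactly_one_tagged(group, field, value):
    """True when exactly one record in the duplicate group carries the tag value."""
    return sum(1 for record in group if record.get(field) == value) == 1

TAG = "2025.02.14 Expo Leads"

# Hypothetical duplicate groups, each a list of records
groups = [
    [{"id": 1, "custom_tag": TAG}, {"id": 2, "custom_tag": ""}],   # new lead plus existing record
    [{"id": 3, "custom_tag": TAG}, {"id": 4, "custom_tag": TAG}],  # two new leads
    [{"id": 5, "custom_tag": ""}, {"id": 6, "custom_tag": ""}],    # no new leads
]

kept = [
    [record["id"] for record in group]
    for group in groups
    if exactly_one_tagged(group, "custom_tag", TAG)
]
print(kept)
```

Only the first group survives: it pairs one tagged trade-show lead with a record that was already in the CRM, which is the case this option is meant to surface.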

Exclude Records that Sync with Other Systems

When deduplicating your CRM data, it’s important to preserve records that sync with external systems like Salesforce to avoid disrupting integrations. Records that sync with external systems often contain vital integration IDs and relationship data. If these records are merged or deleted, it may disrupt connections between your systems, leading to data sync failures or loss of critical information.

Insycle provides an easy way to exclude these synced records from your deduplication process.

How to Exclude Records with Integration IDs

In the Merge Duplicates module, set up your matching fields in Step 1 to identify potential duplicates.

Before running the operation, click the Filter button at the bottom of Step 1.

merge-duplicates-hubspot-companies-step-1-filter-arrow-646px.png

In the Advanced Settings popup, select the appropriate integration ID field (e.g., "Salesforce Account ID" or "Shopify Customer ID"), and select the "doesn't exist" condition so that only records without integration IDs are included.

Click Search to apply your filter.

merge-duplicates-hubspot-companies-step-1-filter-no-salesforce-ID-646px.png

Now, when you click Find in Step 1, this configuration ensures your deduplication process only affects records that don't have external system connections, preserving your important integration points while still cleaning up purely internal duplicates.
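Conceptually, the filter narrows the pool of records before any matching happens. A minimal Python sketch of that idea, using a hypothetical `salesforce_account_id` key and treating a missing or blank value as "doesn't exist," might look like this:

```python
def without_integration_id(records, id_field):
    """Keep only records whose integration ID field is missing or blank."""
    return [r for r in records if not str(r.get(id_field) or "").strip()]

# Hypothetical company records
companies = [
    {"id": 1, "name": "Acme", "salesforce_account_id": "001A000001"},
    {"id": 2, "name": "Acme", "salesforce_account_id": ""},
    {"id": 3, "name": "Acme"},
]

candidates = without_integration_id(companies, "salesforce_account_id")
print([c["id"] for c in candidates])
```

Record 1 never enters duplicate detection in this sketch, so it can be neither merged into another record nor chosen as a master.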

Ensure Critical Data Exists in at Least One Record

When deduplicating records, some fields contain critical information essential for your business operations. It is important to ensure that merged records retain this necessary data.

For example, you may want to ensure that when merging customer records, at least one has integrated system IDs, a lead score, an assigned account manager, payment information, or subscription status. Without this condition, you risk accidentally merging records where none contain this essential data.

The At Least One Record With Non-Empty condition in Insycle's Merge Duplicates module ensures that at least one record in a duplicate group contains a value in a specified field. Any duplicate groups where all records have an empty field will be skipped during the merge process.

To configure the non-empty field requirement:

  1. Open the Merge Duplicates module
  2. On the Simple tab of Step 1, set up your matching fields to identify potential duplicates, including the field you want to ensure has a value.
  3. Click the Conditions tab.
  4. Select the Condition option, At Least One Record With Non-Empty for the key field.

merge-duplicates-hubspot-leads-step-1-conditions-one-non-empty-646px.png
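The skip rule described above, where groups in which every record has the field empty are left alone, can be illustrated with a small Python sketch. The `owner` key and the helper name are hypothetical stand-ins for whichever critical field you choose.

```python
def has_value_somewhere(group, field):
    """True when at least one record in the group has a non-blank value in `field`."""
    return any(str(record.get(field) or "").strip() for record in group)

# Hypothetical duplicate groups
groups = [
    [{"id": 1, "owner": "dana"}, {"id": 2, "owner": ""}],
    [{"id": 3, "owner": ""}, {"id": 4, "owner": None}],
]

mergeable = [
    [record["id"] for record in group]
    for group in groups
    if has_value_somewhere(group, "owner")
]
print(mergeable)
```

The second group is skipped because neither record has an owner, so merging it could not produce a record with that critical value.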

Use Conditional Filters to Prevent Redundant Processing

You often need to apply multiple templates for different duplicate scenarios, but you don't want to reprocess records you've already addressed. To resolve this, you can use the Values Don't Match condition in Step 1.

For example, suppose you have already run a template that identifies and merges records with exactly matching email addresses.

To apply different criteria while skipping already processed records, configure matching rules for your next template to address the additional criteria.

To tell Insycle what records to exclude:

  1. Open the Merge Duplicates module
  2. On the Simple tab of Step 1, set up your matching fields to identify potential duplicates. Additionally, include the field from your previous template. In this example, we're using Email.
  3. Click the Conditions tab.
  4. For the Email field, select the Condition, "Values Don’t Match" to exclude records already processed by the first template.

merge-duplicates-hubspot-contacts-step-1-conditions-email-dont-match-646px.png

Customize Which Records Are Merged Using a CSV

When you need detailed control over which records are merged during deduplication, Insycle gives you two ways to customize the process using CSV files.

Option A: Merge Using a CSV of Record ID Pairs 

If you already know which records are duplicates, you can upload a CSV of record ID pairs directly into the Merge Duplicates module—no custom CRM fields needed. The CSV must include a header row with two columns, ID_1 and ID_2, and each row should contain the IDs of a single duplicate pair.

merge-duplicates-ID_1-ID_2-pairs-merge-csv.png
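If your list of known duplicate pairs lives in a script or spreadsheet export, a file in the described layout (a header row of ID_1 and ID_2, then one pair per row) can be produced with Python's standard `csv` module. The record IDs below are made up for illustration.

```python
import csv
import io

def write_pairs_csv(pairs, stream):
    """Write a header row of ID_1, ID_2 followed by one duplicate pair per row."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["ID_1", "ID_2"])
    for first_id, second_id in pairs:
        writer.writerow([first_id, second_id])

# Hypothetical record IDs; replace them with IDs from your CRM
pairs = [("1001", "1002"), ("2001", "2044")]

buffer = io.StringIO()
write_pairs_csv(pairs, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To save the result to disk instead, pass a file opened with `open("pairs.csv", "w", newline="")` as the `stream` argument.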

In Step 1 of the Merge Duplicates module, click the CSV tab, upload your file, and proceed through the module to set up your merge options and execute the operation. This method is compatible with any supported CRM, including HubSpot and Salesforce.

merge-duplicates-step-1-csv-tab-646w.png

Option B: Merge Using Custom CRM Fields

This approach is especially useful when:

  • You need to exclude specific records from the merge on a record-by-record basis 
  • You need to manually designate which record in a group becomes the master, rather than relying on automated master selection rules

Start by generating a preview CSV of duplicates in the Merge Duplicates module. Edit the CSV to include "Deduplication Exclude" and "Deduplication Master" columns, marking TRUE for records that should be excluded or designated as masters. 

cus2.png
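For a large preview file, the two columns can be added with a script rather than by hand. The Python sketch below assumes the preview CSV has a "Record ID" column, which is a guess about the export layout, and the sample rows are invented; adjust the column name to match your actual file.

```python
import csv
import io

def tag_preview_rows(rows, master_ids, exclude_ids, id_column="Record ID"):
    """Return copies of the preview rows with the two designation columns filled in."""
    tagged = []
    for row in rows:
        tagged_row = dict(row)
        record_id = tagged_row[id_column]
        tagged_row["Deduplication Master"] = "TRUE" if record_id in master_ids else ""
        tagged_row["Deduplication Exclude"] = "TRUE" if record_id in exclude_ids else ""
        tagged.append(tagged_row)
    return tagged

# Stand-in for a downloaded preview CSV; the real file's columns may differ
preview_csv = (
    "Record ID,Email\n"
    "101,ana@example.com\n"
    "102,ana@example.com\n"
    "103,ana@example.com\n"
)

rows = list(csv.DictReader(io.StringIO(preview_csv)))
tagged = tag_preview_rows(rows, master_ids={"101"}, exclude_ids={"103"})

output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=list(tagged[0].keys()), lineterminator="\n")
writer.writeheader()
writer.writerows(tagged)
print(output.getvalue())
```

Here record 101 is marked as the master and record 103 is marked for exclusion, while record 102 is left untagged.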

Next, create custom boolean/checkbox fields in your CRM to store these designations, and then use the Magical Import to tag the records. Lastly, use the Merge Duplicates module to finalize the process based on the tags.

Both methods provide complete control over which records are merged and which are maintained as masters, even in complex deduplication scenarios where standard matching rules may fall short. Discover how to customize the bulk merging of duplicates using a CSV.

Exclude Some Records from Deduplication

Depending on your scenario, there are several ways to exclude records from the deduplication process using Insycle.

Option One: Exclude Duplicate Groups Directly in Step 2

If you've already run duplicate detection and want to permanently prevent specific groups from being merged in future runs, you can exclude them directly in Step 2. This is useful when a group has been reviewed and confirmed as not representing true duplicates.

In Step 2, click the X on a duplicate group row to add it to the Exclusion List. Excluded groups will not appear in duplicate analysis or be included in merges, even when using different matching rules or templates.

merge-duplicates-hubspot-contacts-step-2-exclude-group-button-646px.png

To review and manage your excluded groups, click the Exclusions button in the Step 2 header. From there, you can expand each group to inspect the records, configure which fields are displayed using the Layout tab, and remove groups from the list if needed. Removing a group from the Exclusion List allows those records to be considered again during future duplicate analysis.

Keep in mind that exclusions apply to a specific set of record IDs — if a new duplicate record is later added to your CRM, the system may detect a new group containing that record alongside previously excluded records, and that new group will appear in duplicate analysis.

Learn more about using the Exclude feature.

Option Two: Skip Records You’ve Already Deduplicated

If you have already run your standard template and now want to target variations without reprocessing previously handled records, you can resolve this by using the Values Don’t Match condition in Step 1.

On your follow-up template, configure matching rules to address the additional criteria and skip the previous one:

  1. Open the Merge Duplicates module
  2. On the Simple tab of Step 1, set up your matching fields to identify potential duplicates. Additionally, include the field from your previous template. In this example, we're using Email.
  3. Click the Conditions tab.
  4. On the Conditions tab, for the Email field, set the Condition to "Values Don’t Match" to exclude records already processed by the first template.

merge-duplicates-hubspot-contacts-step-1-conditions-email-dont-match-646px.png

Learn more about configuring Conditions in Step 1.

Option Three: Filter Out Records with Specific Properties

The simplest way to exclude records is to use a filter in Step 1. After you’ve set up your matching fields, click the Filter button. Select the field and set the condition that identifies the records you want to avoid.

In this example, records with a Salesforce Account ID value present will be omitted from deduplication.

merge-duplicates-hubspot-companies-step-1-filter-no-salesforce-ID-646px.png

Option Four: Manually Label Records to Exclude with a CSV

When there are no common rules to identify duplicates across all or some records, you may need more granular control over which records to include or exclude from the process.

In these cases, you can use CSV files to customize your bulk merging, designate master records, and exclude records from deduplication. Discover how to customize merging duplicates in bulk using a CSV.

cus2.png

Additional Scenarios

Matching Using Two Different Fields

Sometimes you might want to match duplicates based on data in two separate fields. For example, you might want to compare an Email Address field to an Additional Email Addresses field to identify duplicates.

Using the Related Fields feature, you can use two different fields (that contain similar data) as matching fields to catch more duplicates.

You can set up Related Fields in the Advanced tab of Step 1.

To add related fields:

  1. Open the Merge Duplicates module
  2. On the Simple tab of Step 1, set up your matching fields to identify potential duplicates. 
  3. Click the Advanced tab.
  4. In the Related Fields dropdown, select up to 2 fields to group for matching.

merge-duplicates-hubspot-contacts-step-1-advanced-tab-additional-email-646px.png

Allowing Empty Values When Matching

When using two or more fields to identify duplicates, records can still be considered matches even if one of the field values is blank. You just need to specify which field(s) allow a blank value.

On the Simple tab of Step 1, set up your matching fields to identify potential duplicates, then click the Conditions tab.

allow-empty_2.png

All the matching fields you included appear automatically, each with the Value Required in All Records condition selected by default. Change the Condition to Empty Allowed in Any Record to allow empty values for certain fields. You can also use the At Least One Record with Non-Empty condition to help you determine which is the master record. Make sure at least one field remains required and is a reliable, unique identifier to ensure the records are really duplicates.

step-1-conditions-empty-not-empty.png

For example, on the Simple tab, you may have the matching fields: First Name, Last Name, and Phone Number. But on some of your records, the Phone Number field may be empty. With the Empty Allowed in Any Record or At Least One Record with Non-Empty condition on Phone Number, records with the same name are considered duplicates when they have the same phone number or when a phone number is missing.

allow-empty-review.png

Find Duplicates Created within X Minutes of Each Other

When customers encounter issues while trying to complete a transaction, they often seek help from one of your support channels. However, whenever a contact is created from a chat, such as Facebook Messenger or HubSpot Chat, very little information is provided, usually just a name and a timestamp. This makes it difficult to find other instances of the same contact, such as their customer record.

To match records within a specific timeframe:

  1. Open the Merge Duplicates module
  2. On the Simple tab of Step 1, set up your matching fields to identify potential duplicates. 
  3. Click the Conditions tab.
  4. Select the Condition option, "Within Timeframe," and set the Minutes, Hours, or Days criteria. 

merge-duplicates-hubspot-contacts-step-1-conditions-created-within-15-minutes.png
