Here are some frequently asked questions about Smart Ingest.
How are duplicate records handled during a sync?
Smart Ingest handles two kinds of duplicate records, and it manages each kind differently during a sync.
- Duplicate records in the data model, based on the table’s primary key.
During a sync run, if Smart Ingest detects that the data model has two records with the same primary key, Iterable doesn’t import either record. Because Smart Ingest has no way of knowing which is the preferred version of a duplicated record, it's safer to accept neither.
This generally doesn’t happen when the selected primary key is a true primary key, because the database then enforces uniqueness.
- Duplicate records in the data model, based on the unique identifier used by Iterable.
The email field is the unique user identifier in email-based projects, but that field isn’t typically used as a primary key in a database table. If your data source doesn’t enforce uniqueness on the field that contains email addresses, two records can have distinct primary key values and still share the same email address.
During a sync run, Smart Ingest sends both records to Iterable, and Iterable updates the single user record that has that email. The same user is then updated multiple times, which can lead to issues such as user profile data being overwritten and fewer user records in Iterable than rows synced.
This behavior might be desirable, for instance, when you’re updating an existing record in Iterable with data that appears as a new row in your data model, such as transactions that add new rows to a table. It’s likely undesirable if each row in your data model is intended to represent a unique user and you don’t want rows to overwrite data written by previous rows.
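The two cases above can be illustrated with a small in-memory simulation. This is only a sketch of the described behavior, not Smart Ingest’s implementation: the function names, the id and plan columns, and the assumption that updates apply in row order are all hypothetical.

```python
from collections import defaultdict


def find_duplicates(rows, key):
    """Group rows by `key` and return only the groups with more than one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return {value: group for value, group in groups.items() if len(group) > 1}


def simulate_sync(rows, primary_key="id", identifier="email"):
    """Mimic the two duplicate-handling cases on a list of row dicts."""
    # Case 1: every row that shares a primary key with another row is skipped.
    duplicated_keys = set(find_duplicates(rows, primary_key))
    accepted = [r for r in rows if r[primary_key] not in duplicated_keys]

    # Case 2: rows that share an identifier update the same profile.
    # Assumption: updates apply in row order, so later rows overwrite earlier ones.
    profiles = {}
    for row in accepted:
        profiles.setdefault(row[identifier], {}).update(row)
    return accepted, profiles


rows = [
    {"id": 1, "email": "a@example.com", "plan": "free"},
    {"id": 2, "email": "a@example.com", "plan": "pro"},  # same email, new key
    {"id": 3, "email": "b@example.com", "plan": "free"},
    {"id": 3, "email": "c@example.com", "plan": "pro"},  # same primary key
]
accepted, profiles = simulate_sync(rows)
print(len(accepted), len(profiles))
```

Running find_duplicates against your own data model, once on the primary key and once on the email column, is one way to spot both kinds of duplicates before a sync.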
Why does Iterable need write access to our data source?
By default, Smart Ingest syncs use the Basic sync engine, which only needs read access to your data source. In these cases, change data capture occurs in Smart Ingest infrastructure.
Iterable only needs write access to your data source if you have large data syncs that require the Lightning sync engine for optimal performance.
When a sync exceeds certain size thresholds, the Lightning sync engine performs better than the Basic engine. In these cases, change data capture occurs in the data source, and resources are distributed in a manner that optimizes your sync performance.
To learn more about what Smart Ingest does with write permissions, read Optimizing Smart Ingest Sync Performance.
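For readers unfamiliar with the term, change data capture means working out which rows were added, changed, or removed since the previous sync. This article doesn’t describe how either sync engine does this, so the following is only a generic snapshot-comparison sketch; the function name and the sample data are hypothetical.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots keyed by primary key and classify the changes."""
    # Rows whose key appears only in the newer snapshot.
    added = {k: v for k, v in current.items() if k not in previous}
    # Rows whose key appears only in the older snapshot.
    removed = {k: v for k, v in previous.items() if k not in current}
    # Rows present in both snapshots whose contents differ.
    changed = {
        k: v for k, v in current.items() if k in previous and previous[k] != v
    }
    return added, changed, removed


previous = {1: {"email": "a@example.com", "plan": "free"},
            2: {"email": "b@example.com", "plan": "free"}}
current = {1: {"email": "a@example.com", "plan": "pro"},
           3: {"email": "c@example.com", "plan": "free"}}

added, changed, removed = diff_snapshots(previous, current)
print(sorted(added), sorted(changed), sorted(removed))
```

Wherever the comparison runs, it needs somewhere to keep the previous state, which is one plausible reason an engine that does this work in the data source would need write access there.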