Smart Ingest lets you pull data stored in your Google BigQuery warehouse and send it to Iterable.
NOTE
Contact your Iterable customer success manager to discuss adding Smart Ingest to your plan.
Smart Ingest can only import data from Google BigQuery.
In this article
Connection requirements
To connect with Smart Ingest, you need the following items:
-
A Google Cloud Platform (GCP) service account with access to the data you’re importing. You can either:
- Bring your own with a JSON key.
- Create one directly from Smart Ingest.
-
Project ID – The unique identifier of the Google Cloud project for your BigQuery warehouse.
- Billing must be enabled.
- BigQuery API must be enabled.
-
Dataset location (the region where your BigQuery dataset is located)
Before you connect Smart Ingest to BigQuery, make sure to allow Smart Ingest IPs in your Google Cloud account.
Connecting Google BigQuery to Smart Ingest
Step 1: Create a service account
To connect Smart Ingest to Google BigQuery, you need a Google Cloud Platform (GCP) service account. This account is used to authenticate Smart Ingest with BigQuery and access your data.
You can create a service account for Google BigQuery in two different ways:
-
In Smart Ingest, you can create a service account that’s managed by our
partner, Hightouch. You still need to visit Google Cloud Platform and
assign roles to the account. -
Bring your own service account
Create your own service account in Google Cloud Platform.
Create a managed service account
When you use Smart Ingest to create a new service account, you can connect to BigQuery with a new service account that’s managed by our partner, Hightouch. The service account is unique to your workspace and this credential.
To create a managed service account:
Log in to Iterable as a user with the Manage Integrations project permission and open the project you’re working on.
Go to Integrations > Smart Ingest.
Click Connect a New Source.
Select Google BigQuery and click Connect.
Under Step 1, find GCP Credentials and click New Credentials.
Enter a Display Name.
-
Select Service account managed by Hightouch.
Click Create a New Service Account.
-
Click the Copy button to copy the service account address.
Example:
ht-some-string@hightouch.iam.gserviceaccount.com
IMPORTANT
You cannot view this service account address later. Make sure you’ve copied the address before leaving this screen.
Click Create
Now you can use this GCP service account to connect BigQuery and Smart Ingest.
Smart Ingest saves these credentials for later, so if you’re not ready to connect then you can leave this page and come back later to finish connecting.
Bring your own service account
To use your own service account to connect Smart Ingest to BigQuery, you need to first create a service account in GCP:
Create the account. For instructions, read Create Service Accounts (Google Cloud).
Generate a JSON key file. For instructions, read Create and Delete Service Account Keys (Google Cloud).
Next, to add your service account to Iterable for use in Smart Ingest:
Log in to Iterable as a user with the Manage Integrations project permission and open the project you’re working on.
Go to Integrations > Smart Ingest.
Click Connect a New Source.
Select Google BigQuery and click Connect.
Under Step 1, find GCP Credentials and click New Credentials.
Enter a Display Name.
-
Select Bring Your Own Service Account.
Drag and drop the service account JSON key file into the import area, or click on the area to select the file from its location.
Smart Ingest then parses the key, and if it's valid, displays the Client ID and Client Email fields. Verify that these are correct.
Click Create.
Now you can use this GCP service account to connect BigQuery and Smart Ingest.
Smart Ingest saves these credentials for later, so if you’re not ready to connect then you can leave this page and come back later to finish connecting.
Step 2: Grant access roles to the service account
Once you create a service account, the account needs to have the correct permissions to access your BigQuery data. To do this, you need to assign roles to the service account in Google Cloud Platform.
By default, your GCP service account doesn't have permission to read data from BigQuery. You can set up your service account to have full access to your project using a predefined role. Otherwise, you can create a custom role and provide limited access according to a user-specified list of permissions.
Grant full access
To grant full access, assign the bigquery.user
and bigquery.dataViewer
roles to your service account. You can do this in the Google Cloud web console,
or by running these snippets, shown below, in the Cloud Shell.
To grant permission to read metadata and list tables:
gcloud projects add-iam-policy-binding <YOUR_PROJECT_NAME> \ --member serviceAccount:<YOUR_SERVICE_ACCOUNT> \ --role roles/bigquery.user
To grant permission to read data from tables and views:
gcloud projects add-iam-policy-binding <YOUR_PROJECT_NAME> \ --member serviceAccount:<YOUR_SERVICE_ACCOUNT> \ --role roles/bigquery.dataViewer
Grant limited access
To provide limited access, assign the bigquery.dataViewer
role only to the
specific datasets, tables, or views you want to use in Smart Ingest.
Because you’re assigning the bigquery.dataViewer
role only to specific
resources, you need to assign the bigquery.user role and grant the bigquery.tables.get
permission at the project level. For this, you can create a custom role
in the Google Cloud web console based on an existing predefined role
(bigquery.user), which you can name custom.bigquery.user
. When setting up the
custom role, click Add permissions to add the bigquery.tables.get
permission
to this custom role. Then, assign this role to your service account at the project
level. Smart Ingest needs to be able to list the schemas and tables in your BigQuery
project when creating a model.
Therefore, the source connection test
fails if you don't grant the bigquery.tables.get permission at the project level.
You can do this in the Google Cloud web console, or by running this snippet in the Cloud Shell:
gcloud projects add-iam-policy-binding <YOUR_PROJECT_NAME> \ --member serviceAccount:<YOUR_SERVICE_ACCOUNT> \ --role roles/custom.bigquery.user
NOTE
If this custom.bigquery.user
role still isn't limited enough, you can try
assigning the bigquery.job
user role and granting the bigquery.dataset.get
, bigquery.tables.get
,
and bigquery.tables.list
permissions at the project level.
You can then decide which datasets, tables, or views your GCP service account
has access to by granting access to a resource
in the Google Cloud web console. For every resource you would like to use in
Smart Ingest, select your BigQuery service account as the Principal and the
bigquery.dataViewer
role as the Role.
Smart Ingest lists all tables in your BigQuery project when creating a model
using the table selector.
However, Smart Ingest can only query data from tables assigned the
bigquery.dataViewer
role. Tables that weren't assigned this role return an
error if you attempt to query them.
Step 3: Connect Smart Ingest to BigQuery
To connect Smart Ingest with BigQuery:
Log in to Iterable as a user with the Manage Integrations project permission and open the project you’re working on.
Go to Integrations > Smart Ingest.
Click Connect a New Source.
Select Google BigQuery and click Connect.
-
In Step 1, find GCP Credentials and either select credentials you’ve already created from the drop-down menu, or click New Credentials.
If this is your first time connecting and you’re adding new credentials, read our instructions to either create a managed service account or to bring your own service account.
-
In Step 2:
- Enter the Project ID for a project with the BigQuery API enabled.
- Select the Dataset location. (This is the project’s Data location in GCP.)
-
In Step 3, choose from the Lightning or Basic sync engine. Lightning is faster and more efficient, but requires the database user to have write access. Basic is slower and less efficient, but doesn’t require write access.
To view the additional permissions required for the database user, select the Lightning sync engine. These permissions are necessary for Smart Ingest to create and manage the sync schema, and are customized based on the other inputs you provide in the source setup form.
To learn more about sync engines, read Optimizing Smart Ingest Sync Performance.
NOTE
Smart Ingest is co-developed by Iterable and Hightouch. Hightouch is a data processor for this feature. Smart Ingest data operations and schemas may contain the Hightouch name, but the feature is fully supported by Iterable.
Click Continue. Smart Ingest automatically tests the connection.
When the connection test is successful, click Continue. (If there are problems connecting, click Back and review the connection details for accuracy.)
Add a name for your data source. This is for display in Iterable.
Click Finish.
Next steps
You've now connected your data warehouse to Smart Ingest. The next thing you can do is create a sync.
Troubleshooting BigQuery
Hashed values are displayed as objects of bytes
When previewing your model or when syncing your data to the destination, you might see an object of bytes, such as:
{"0":123,"1":456,"2":789,"3":012,...}
This happens if you use certain hashing algorithms in BigQuery to hash your data, such as SHA256.
To resolve this, open the SQL editor and use TO_HEX
or TO_BASE64
functions to query this model column:
SELECT TO_HEX(model_column_name) as string_model_column_name, * FROM dataset.table
No matching signature for operator ... for argument types: ...
An example of the error message is:
No matching signature for operator != for argument types: DATE, TIMESTAMP.
This error means that your data model is using the named operator, for example,
!=
, to compare two incompatible types, for example, a date to a timestamp. To
resolve the issue, ensure your model columns have the correct data types,
or revise your model query.
Unable to process number due to big.js
An example of the error message is:
Unable to process number due to big.js: Imprecise conversion: big.js Imprecise conversion.
This error occurs when one or more values have a very high precision, for example 123.12345678910. Smart Ingest attempts to convert the value to a number (based on the data type selection of your model). An error occurs if the number can't be precisely represented by a JavaScript number.
To resolve the issue, ensure the values in your model columns have a lower precision. Additionally, if your value doesn't need to be a number datatype, casting the value to a string datatype can also resolve this issue.