IdentitySync (IDX) is Gigya's ETL (Extract, Transform, Load) solution, offering an easy way to transfer data in bulk between platforms. Use it to transfer user data from Gigya to a third-party platform or vice versa, or even from one Gigya database to another.
With IdentitySync, customers can:
- Take all the permission-based social and profile information that Gigya stores about the customer's users and channel it into another platform, such as an ESP, CRM, or marketing automation system.
- Get up-to-date data from the other platform and import it into Gigya, such as newsletter subscription status, survey responses or account balance.
IdentitySync jobs can run on a one-time basis, for example when migrating data, or they can be scheduled to run regularly to keep your platforms synchronized.
Each IdentitySync job runs a dataflow. The building blocks of the dataflow are dedicated scripts. Each script is responsible for performing a single task, such as:
- Extracting accounts from Gigya based on specific parameters
- Changing some field names
- Creating a CSV file
- Uploading a file to a given FTP
Scripts can be added to the dataflow, removed or changed as needed.
The following chart is a visualization of a dataflow in IdentitySync. This dataflow exports user accounts from Gigya to a partner platform (Krux).
Each step in the dataflow runs a separate script.
The above flow demonstrates a split dataflow. Dataflows are split using the next parameter. For more details, see Customize the Dataflow.
For full, up-to-date details of the service's capabilities, see the Script Repository.
| Main Supported Data Sources/Targets | Sample Transformations | Main Output Formats |
| --- | --- | --- |
The IDX API
A dataflow is a complete definition for a transfer of information between Gigya and a third-party platform in a specific direction.
The dataflow includes all the necessary information about where the data is extracted from, which data is extracted, how the data is processed, and where the data is transmitted.
For details, see Dataflow Object.
Steps are the building blocks of the dataflow. Each step is a call to a script that performs a specific task, such as extracting accounts from Gigya or compressing a file in GZIP format. The step calls the script with specific parameters, and the output is passed on to the next step in the dataflow for further processing.
For example, "datasource.read.gigya.account" is a script that searches the Gigya account database and returns all accounts that match specific parameters (an SQL-like search query). This script is typically called in the first step of a dataflow that exports accounts from Gigya to a partner platform.
Each step includes the following attributes:
- id: the unique identifier of the step within a given dataflow. Each step has to have an ID so it can be called by other steps in the "next" attribute.
- type: the ID of the IdentitySync script used in this step (see Script Repository).
- params: an object defining the parameters passed in this IdentitySync script.
- next: an array containing one or more IDs of the next step(s) to be carried out.
- A step to which no other step refers in the next attribute is automatically considered the entry point of the dataflow.
- Steps which do not have a next attribute are automatically considered end-points of the dataflow.
- Assign multiple values to a next attribute to split the dataflow. See example.
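The step attributes and entry/end-point rules above can be sketched in a few lines of Python. Only "datasource.read.gigya.account" is a script name taken from this page; the other type values and all parameters are illustrative assumptions, not actual Script Repository identifiers.

```python
# A dataflow sketched as plain data. Only "datasource.read.gigya.account" is a
# documented script name; the other "type" values and all params are
# illustrative assumptions -- check the Script Repository for real identifiers.
dataflow = {
    "name": "Export accounts to SFTP (illustrative)",
    "steps": [
        {"id": "read", "type": "datasource.read.gigya.account",
         "params": {"select": "SELECT * FROM accounts"}, "next": ["rename"]},
        {"id": "rename", "type": "field.rename",
         "params": {"fields": [{"sourceField": "profile.email",
                                "targetField": "EMAIL"}]},
         "next": ["dsv"]},
        {"id": "dsv", "type": "file.format.dsv",
         "params": {"columnSeparator": ","}, "next": ["gzip"]},
        {"id": "gzip", "type": "file.compress.gzip", "params": {},
         "next": ["sftp"]},
        {"id": "sftp", "type": "datasource.write.sftp",
         "params": {"host": "sftp.example.com", "username": "user"}},
    ],
}

# A step no other step references in "next" is the entry point;
# steps with no "next" attribute are the end points.
referenced = {sid for step in dataflow["steps"] for sid in step.get("next", [])}
entry_points = [s["id"] for s in dataflow["steps"] if s["id"] not in referenced]
end_points = [s["id"] for s in dataflow["steps"] if "next" not in s]
print(entry_points, end_points)  # ['read'] ['sftp']
```

To split the flow, list more than one step ID in a next array, e.g. "next": ["dsv", "auditLog"] (auditLog being a hypothetical second branch).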
To run a dataflow, you must create a scheduling object. The scheduling specifies whether the dataflow should run once or on a repeating basis, how often, and when it should begin.
For details, see Scheduling Object.
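As a rough sketch, a scheduling object expressed as plain data might carry fields like the following. Every field name here is an assumption made for illustration (see the Scheduling Object reference for the real schema); only the concepts (which dataflow, start time, run once vs. repeating, how often) come from this page.

```python
# Hypothetical scheduling record; all field names are assumptions made for
# illustration -- consult the Scheduling Object reference for the real schema.
schedule = {
    "dataflowId": "<your-dataflow-id>",   # which dataflow to run
    "name": "nightly-export",             # a label for this schedule
    "startTime": "2024-01-01T02:00:00Z",  # when the first run should begin
    "repeat": True,                       # run once (False) or repeatedly
    "frequencyMinutes": 24 * 60,          # how often, if repeating
}
print(schedule["frequencyMinutes"])  # 1440
```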
The job status object contains the status of a data transfer job that has been scheduled to run. Retrieve this object to check whether the job is running, has finished successfully, or has failed.
For details, see JobStatus object.
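To make the three states concrete, a polling check might look like the sketch below. The status strings and field name are assumptions for illustration, not the actual JobStatus schema.

```python
# Hypothetical status values; the real JobStatus field names and values may
# differ -- this only illustrates the running/succeeded/failed distinction.
TERMINAL_STATES = {"completed", "failed"}

def is_finished(job_status: dict) -> bool:
    """Return True once a scheduled job has either succeeded or failed."""
    return job_status.get("status") in TERMINAL_STATES

print(is_finished({"status": "running"}))    # False
print(is_finished({"status": "completed"}))  # True
```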
To create an integration based on IdentitySync, complete the following process:
1. Arrange Permissions
Partner ID Permissions
Before IdentitySync can work with any specific partner ID, it needs to be given permissions to the partner.
- If you don't have the necessary permissions (extended permissions) to the partner ID, request them.
If you have permissions to do so, call admin.updateGroup. Otherwise, open a Salesforce case requesting to add this partner to the IDX system. Include the following details in your request:
| Detail | Value |
| --- | --- |
| partnerID | The partner ID. |
| groupID | _idx_application_viewers for jobs that read data from Gigya; _idx_application_editors for jobs that write data to Gigya |
2. Create Dataflow
Open ETL Data Flows on Gigya's website. Make sure you are signed in and have selected the relevant site. The IdentitySync dashboard may also be accessed by clicking Settings in the upper menu and then ETL Dashboard in the left menu.
In the dashboard, click Create Data Flow.
In the Create Data Flow window, select the data flow integration from the dropdown. Currently, only SFTP integrations are available in this dropdown; you can change the dataflow to integrate with a different system later, when customizing the dataflow. For more information, see Dataflow Templates.
Select the dataflow template: the direction of the flow, whether from or into Gigya. Note that at the bottom of this window, you can see an outline of the flow that will be created (e.g., Account > rename > dsv > gzip > sftp).
Click Continue. The Create Data Flow screen opens in the dashboard.
Relevant IDX API Methods:
3. Customize the Dataflow
The dataflow already contains a basic flow for data transfer between Gigya and the selected vendor. Use the Script Repository to understand the structure and parameters required in each step, and to add or remove steps as needed. For example:
- Specify passwords, usernames, IDs, API keys etc. required for accessing each system and customer database.
- Add the fields your customer wishes to transfer between databases.
- Flatten fields, remove non-ASCII strings, specify the compression type, parse in JSON or DSV format, etc.
- Change the name and/or description of the data flow itself, at the very beginning of the flow.
Finally, click Create Data Flow.
Your dashboard should now look something like this:
The following actions are available:
| Action | Description |
| --- | --- |
| Edit | Opens the current data flow in the Edit Data Flow window, where you can change any of its attributes, steps, and parameters using the Script Repository. |
| Run Test | Runs the data flow once on 10 records for test purposes. If the test was successful, after refreshing the dashboard you will see the timestamp of the test run under Last Successful Run. For a detailed report of the test results, at this current stage of development, use idx.search and retrieve the JobStatus object. Later on, job status reports will be available within the dashboard. |
| Duplicate | Creates a new data flow based on a flow that has already been customized. Useful if you wish to create a similar flow with slight variations. |
To split a dataflow, in edit mode, reference two steps instead of the usual one in the next attribute. For a sample dataflow that employs this method, see the Epsilon Dataflow.
4. Schedule the Dataflow
- Under Actions, click (schedule) to open the Scheduler.
- Click Create Schedule.
- Configure the schedule:
- Enter a schedule name
- Change the start time as needed
- Choose whether to run once or at scheduled intervals
- (Optional) Enter the email address(es) to notify on success and failure of the dataflow run.
- (Optional) Limit to a specific number of records. This is usually used for test runs: when running a test from the dashboard, a one-time schedule is created which runs immediately for 10 records.
- Click Create, and, once you are back in the Schedule dashboard, click the Refresh button.
Relevant IDX API Methods:
5. Test and Monitor
- Run a test by clicking (run test) under Actions. This creates an immediate one-time run for 10 records. If the run was successful, after refreshing the dashboard (with the Refresh button) you will see its timestamp under Last Successful Run.
- For a full report of the job status (including errors, if there were any) use the idx.search method (SELECT * FROM idx_job_status WHERE...).
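For illustration, a fully spelled-out query might look like the line below; the WHERE field name and its value are hypothetical, so check the idx.search reference for the fields idx_job_status actually exposes.

```
SELECT * FROM idx_job_status WHERE dataflowId = "abc123"
```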
Depending on the customer's networking policies, they may need to add the IPs of the IdentitySync servers to a whitelist in order to allow Gigya to upload information to, or pull it from, the customer's systems.
The relevant IPs are:
EU1 Data Center:
US Data Center:
AU Data Center: