IdentitySync


 

Overview

IdentitySync is Gigya's ETL solution (Extract, Transform, Load) that offers an easy way to transfer data in bulk between platforms. Use it to transfer user data from Gigya to a third-party platform or vice versa, or even from one Gigya database to another.

With IdentitySync, you can:

  • Take all the permission-based social and profile identity information stored at Gigya and channel it into another platform, such as an ESP, CRM, or marketing automation system.
  • Get up-to-date data from a third-party platform, such as newsletter subscription status, survey responses or account balance, and import it into Gigya.

IdentitySync jobs can be carried out on a one-time basis, for example if migrating data, or they can be scheduled to run on a regular basis in order to keep your platforms synchronized.

IdentitySync APIs use the idx namespace. See Using the IDX API below for details.

IdentitySync is a premium platform that requires separate activation. If it is not part of your site package please contact your Gigya Account Manager or contact us by filling in a support form on our site. You can also access the support page by clicking "Support" on the upper menu of Gigya's site.

Building Blocks

Each IdentitySync job runs a dataflow. The building blocks of the dataflow are dedicated components. A component is a pre-configured unit that is used to perform a specific data integration operation. The components include readers, writers, transformers and lookups. Each component is responsible for performing a single task, such as:

  • Extracting accounts from Gigya based on specific parameters
  • Changing some field names
  • Creating a CSV file
  • Uploading a file to a given FTP

Components can be added to the dataflow, removed or changed as needed.

For detailed information, visit the Component Repository and see sample Dataflow Templates.

Dataflow Example

The following chart is a visualization of a dataflow in IdentitySync. This dataflow exports user accounts from Gigya to a partner platform (Krux).

Each step in the dataflow runs a separate component.

 

The above flow demonstrates a split dataflow. Dataflows are split using the next parameter. For more details, see Customize the Dataflow.


Use Cases

IdentitySync gives you the flexibility to use your data in any way you need. For example, with IdentitySync, the following scenarios are supported: 

Admin Activities

  • Retrieve all accounts that have remained unverified or unregistered for over a week, i.e. (isVerified==false or isRegistered==false) and created>'one week ago', then export the relevant email addresses to an ESP from which to send follow-up emails.
  • As a sports club, regularly import accounts from an external ticketing system, thus fortifying your fanbase. 
  • Query the audit log to retrieve deleted users, and use a batch job to pass them to other external systems so that they can be deleted from there, too.
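
The first scenario above combines three conditions, and the grouping matters: the account must be unverified or unregistered, and it must also be more than a week old. The following is a minimal Python sketch of that filter over in-memory records. It is not how IdentitySync evaluates queries. The record layout, the sample addresses and the assumption that created is a timezone-aware datetime are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def needs_follow_up(account, now):
    """True if the account is unverified or unregistered AND over a week old."""
    incomplete = (not account["isVerified"]) or (not account["isRegistered"])
    older_than_a_week = account["created"] < now - timedelta(weeks=1)
    return incomplete and older_than_a_week


# Hypothetical sample data; only the field names come from the scenario above.
now = datetime(2024, 1, 15, tzinfo=timezone.utc)
accounts = [
    {"email": "a@example.com", "isVerified": False, "isRegistered": True,
     "created": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"email": "b@example.com", "isVerified": True, "isRegistered": True,
     "created": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"email": "c@example.com", "isVerified": False, "isRegistered": False,
     "created": datetime(2024, 1, 14, tzinfo=timezone.utc)},
]

follow_up_emails = [a["email"] for a in accounts if needs_follow_up(a, now)]
print(follow_up_emails)  # ['a@example.com']
```

Only the first account qualifies: the second is fully verified and registered, and the third was created less than a week before the reference time.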

User Segmentation and Progressive Profiling

  • Set up data fields for segmenting users according to certain types of behavior in your site - such as Loyalty interactions, purchases or page visits (people who like and share content related to tabi socks, or have purchased said socks) then use an IdentitySync job to send only users that match these criteria to a marketing system for a targeted campaign (50% off in our summer sock sale). 
  • Use a Gigya-to-Gigya IdentitySync job to query users by Facebook likes stored in their profiles - for example, people who like vampire and zombie related content - and to plant a value (e.g. "horrorFic") in a Gigya data field. Then launch a gruesome Halloween marketing campaign targeting these users. 
  • Use an IdentitySync job to initialize default data to a Boolean field (e.g. set to null), and using a custom component that is activated according to this value, trigger a progressive profiling screen that requests more information from site visitors. 

Main Features 

For full, up-to-date details of the service's capabilities, see the Component Repository.

Main supported data sources/targets:

  • Gigya Accounts
  • Gigya Data Store
  • FTP
  • SFTP
  • Amazon S3 cloud

Sample transformations:

  • Reordering, renaming, and removing fields
  • Replacing strings within field values using regex
  • Using JEXL expressions to create new fields based on the values of existing fields
  • Flattening objects (with some limitations)
  • Flattening an array field into a string field
  • PGP encryption and decryption

Main file formats supported:

  • DSV
  • JSON
  • Salesforce, Mailchimp, Krux and other formats
  • GZIP, LZO
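
To make three of the transformations listed above concrete (renaming fields, regex replacement within a field value, and flattening an array field into a string), here is a small Python sketch of what each does to a single record. It only illustrates the effect. It is not IdentitySync code, and the function names, record fields and separator are assumptions made for this example.

```python
import re


def rename_fields(record, mapping):
    """Return a copy of record with keys renamed per mapping; other keys are kept."""
    return {mapping.get(key, key): value for key, value in record.items()}


def replace_in_field(record, field, pattern, replacement):
    """Return a copy of record with regex matches in one field's value replaced."""
    out = dict(record)
    out[field] = re.sub(pattern, replacement, out[field])
    return out


def flatten_array_field(record, field, separator=","):
    """Return a copy of record with a list-valued field joined into one string."""
    out = dict(record)
    out[field] = separator.join(str(item) for item in out[field])
    return out


# Hypothetical input record.
record = {"UID": "abc123", "profile.firstName": "John ", "likes": ["zombies", "vampires"]}

record = rename_fields(record, {"profile.firstName": "firstname"})
record = replace_in_field(record, "firstname", r"\s+$", "")  # strip trailing whitespace
record = flatten_array_field(record, "likes", separator="|")

print(record)  # {'UID': 'abc123', 'firstname': 'John', 'likes': 'zombies|vampires'}
```

In a real dataflow each of these operations would be a separate step that calls the matching component from the Component Repository.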

Using IdentitySync

Object Model

Dataflow

A dataflow is a complete definition for a transfer of information between Gigya and a third-party platform in a specific direction.

The dataflow includes all the necessary information about where the data is extracted from, which data is extracted, how the data is processed, and where the data is transmitted.

For details, see Dataflow Object. For samples, see Dataflow Templates

Step

Steps are the building blocks of the dataflow. Each step is a call to a component that performs a specific task, such as extracting accounts from Gigya, or compressing a file in GZIP format. The step calls the component with specific parameters, and the output is passed on to the next step in the dataflow for further processing. For example, "datasource.read.gigya.account" is a component that searches the Gigya account database and returns all accounts that match specific parameters (an SQL-like search query). This component will typically be called in the first step in a dataflow that exports accounts from Gigya to a partner platform. Each step includes the following attributes: 

  • id: the unique identifier of the step within a given dataflow. Each step has to have an ID so it can be called by other steps in the "next" attribute.
  • type: the ID of the IdentitySync component used in this step (see Component Repository).
  • params: an object defining the parameters passed in this IdentitySync component. 
  • next: an array containing one or more IDs of the next step(s) to be carried out. 
    • A step to which no other step refers in the next attribute is automatically considered the entry point of the dataflow. 
    • Steps which do not have a next attribute are automatically considered end-points of the dataflow. 
    • Assign multiple values to a next attribute to split the dataflow. See example

Step Structure

{
   "id":"dsv", // A name you assign to this step that serves as a unique identifier in this dataflow.
   "type":"file.parse.dsv", // The component run in this step.
   "params":{ // The parameters and values, expressions etc. used in this step.
      "columnSeparator":","
   },
   "next":[ // The next step(s) to be run once this step completes.
      "rename"
   ]
}
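
The rules for the next attribute described above (entry points, end-points and splits) can be checked mechanically. The following Python sketch applies those rules to a list of step objects. It is an illustration only and not part of the IDX API. The function name and the "hypothetical.*" component types are made up for this example; "datasource.read.gigya.account" is the only component name taken from this page.

```python
def analyze_dataflow(steps):
    """Classify steps by the rules for 'next': entry points, end-points, splits."""
    ids = [step["id"] for step in steps]
    if len(ids) != len(set(ids)):
        raise ValueError("step ids must be unique within a dataflow")

    # Every id that appears in some step's "next" array.
    referenced = {nxt for step in steps for nxt in step.get("next", [])}
    unknown = referenced - set(ids)
    if unknown:
        raise ValueError(f"next refers to unknown step ids: {sorted(unknown)}")

    return {
        # No other step refers to it, so it is an entry point.
        "entry_points": [i for i in ids if i not in referenced],
        # No next attribute, so it is an end-point.
        "end_points": [s["id"] for s in steps if not s.get("next")],
        # More than one value in next, so the dataflow splits here.
        "splits": [s["id"] for s in steps if len(s.get("next", [])) > 1],
    }


steps = [
    {"id": "account", "type": "datasource.read.gigya.account", "params": {}, "next": ["dsv"]},
    {"id": "dsv", "type": "hypothetical.format", "params": {}, "next": ["upload_a", "upload_b"]},
    {"id": "upload_a", "type": "hypothetical.writer", "params": {}},
    {"id": "upload_b", "type": "hypothetical.writer", "params": {}},
]

result = analyze_dataflow(steps)
print(result["entry_points"])  # ['account']
print(result["end_points"])    # ['upload_a', 'upload_b']
print(result["splits"])        # ['dsv']
```

A check like this can catch a misspelled step ID in a next array before the dataflow is saved.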

 

For examples of end-to-end dataflows, see  Export from Gigya to SFTP and Import from SFTP to Gigya.

Data File Example

The following is an example of a data file in DSV format. The quotes around each field can be removed.

"UID","email","firstname"
"_gid_XeCEe4oZYgvn83np9DPA+g==","sample_mail@something.com","John"
"_gid_XeCEe4oZYgvn83np9DPA+g==","","Jane"

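As a quick check of the statement that the quotes around each field can be removed, the sketch below parses the sample file above with Python's standard csv module, once as shown and once with the quotes stripped, and compares the results. This only demonstrates how a typical delimiter-separated parser treats the two forms. It makes no claim about how IdentitySync's own DSV component is implemented, and the helper name read_dsv is hypothetical.

```python
import csv
import io

QUOTED = '''"UID","email","firstname"
"_gid_XeCEe4oZYgvn83np9DPA+g==","sample_mail@something.com","John"
"_gid_XeCEe4oZYgvn83np9DPA+g==","","Jane"
'''

# The same file with every quote character removed.
UNQUOTED = QUOTED.replace('"', "")


def read_dsv(text, separator=","):
    """Parse delimiter-separated text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=separator))


rows = read_dsv(QUOTED)
print(rows[0]["firstname"])          # John
print(repr(rows[1]["email"]))        # '' (an empty field stays empty)
print(rows == read_dsv(UNQUOTED))    # True
```

Quotes do become necessary when a field value itself contains the column separator.
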
Implementation Flow

Note that IdentitySync jobs are scheduled in UTC time. Therefore, the platform participating in the flow should be set to the UTC timezone to ensure that file requests are handled properly.
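
If the partner platform cannot be switched to UTC, one alternative is to convert the intended local start time to UTC before entering it in the schedule. The following is a minimal Python sketch of that conversion. The UTC-05:00 offset and the date are hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def to_utc_start_time(local_start):
    """Convert a timezone-aware local start time to the equivalent UTC time."""
    if local_start.tzinfo is None:
        raise ValueError("local_start must be timezone-aware")
    return local_start.astimezone(timezone.utc)


# Hypothetical partner platform running at a fixed UTC-05:00 offset.
partner_zone = timezone(timedelta(hours=-5))
local_start = datetime(2024, 3, 1, 2, 0, tzinfo=partner_zone)

utc_start = to_utc_start_time(local_start)
print(utc_start.isoformat())  # 2024-03-01T07:00:00+00:00
```

A fixed offset is used here to keep the example self-contained. A platform that observes daylight saving time would need a real time zone definition instead.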

To create an integration based on IdentitySync, complete the following process:

1. Arrange Permissions

Console Permissions

Note: IdentitySync is a premium platform that requires separate activation. If it is not part of your site package please contact your Gigya Account Manager or contact us by filling in a support form on our site. You can also access the support page by clicking "Support" on the upper menu of Gigya's site.

Partner ID Permissions

Before IdentitySync can work with any specific partner ID, it needs to be given permissions to the partner.

  1. If you don't have the necessary permissions (extended permissions) to the partner ID, request them.
  2. If you have permissions to do so, call admin.updateGroup. If you do not, open a Salesforce case requesting that this partner be added to the IDX system. Include the following details in your request:

  • partnerID: The partner ID.
  • groupID: _idx_application_viewers for jobs that read data from Gigya, or _idx_application_editors for jobs that write data to Gigya.


2. Create Data Flow

  1. Open IdentitySync Data Flows in Gigya's Console. Make sure you are signed in and have selected the relevant site. The IdentitySync dashboard may also be accessed by clicking Settings in the upper menu and then IdentitySync Data Flows in the left menu.

  2. In the dashboard, click Create Data Flow

  3. In the Create Data Flow window, select the data flow integration from the dropdown. If the flow you wish to create is not available in the dropdown, select any available flow; you can customize it in the next steps. For more information, see Dataflow Templates.

  4. Select the data flow template: the direction of the flow, whether from or into Gigya. Note that at the bottom of this window, you can see an outline of the flow that will be created (e.g., Account > rename > dsv > gzip > sftp). 

  5. Click Continue. As a result, the Create Data Flow screen opens in the dashboard, in Graph display. 

Relevant IDX API Methods: 

3. Customize the Data Flow

The data flow you created is built of the required steps for data transfer between Gigya and the selected vendor. Use the Component Repository to understand the structure and parameters required in each step.

  1. Double-click any of the steps to add or edit its parameters. Click OK when finished. 
  2. Click Source to review the data flow as a series of code blocks, and edit the code as needed. 
  3. Click Create Data Flow

Using the editor:

  • Specify passwords, IDs, API keys etc. required for accessing each system and customer database.
  • Add the names of fields included in the data flow.
  • Flatten fields, remove non-ASCII strings, specify the compression type, parse in JSON or DSV format, etc. 
  • Change the name and/or description of the data flow itself, at the very beginning of the flow. 
  • Split a data flow, for example if you want to create two duplicate files and upload each file into a different destination. To do so, in the next attribute, reference the next two steps rather than just one. For a sample dataflow which employs this method, see the Epsilon Dataflow.
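
The split described in the last bullet above amounts to listing two step IDs in one step's next array. The following Python sketch performs that edit on a list of step objects, such as the one shown in the Source view. The helper name, the step IDs and the "hypothetical.*" component types are assumptions for this example and are not taken from the Component Repository.

```python
import copy
import json


def split_step(steps, step_id, next_ids):
    """Return a copy of steps in which step_id's next lists every id in next_ids."""
    known = {step["id"] for step in steps}
    missing = [i for i in [step_id, *next_ids] if i not in known]
    if missing:
        raise ValueError(f"unknown step ids: {missing}")

    updated = copy.deepcopy(steps)  # leave the caller's dataflow untouched
    for step in updated:
        if step["id"] == step_id:
            step["next"] = list(next_ids)
    return updated


steps = [
    {"id": "gzip", "type": "hypothetical.compress", "params": {}, "next": ["sftp_primary"]},
    {"id": "sftp_primary", "type": "hypothetical.upload", "params": {}},
    {"id": "sftp_backup", "type": "hypothetical.upload", "params": {}},
]

split = split_step(steps, "gzip", ["sftp_primary", "sftp_backup"])
print(json.dumps(split[0]["next"]))  # ["sftp_primary", "sftp_backup"]
```

After the change, the output of the gzip step is passed to both upload steps, so the same file reaches two destinations.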

Finally, click Create Data Flow.

Your dashboard should now look something like this: 

Actions

The following actions are available: 

  • Edit: Opens the current data flow in the Edit Data Flow window, where you can change any of its attributes, steps and parameters using the Component Repository.
  • Run Test: Runs the data flow once on 10 records for test purposes. If the test was successful, after refreshing the dashboard, you will see the timestamp of the test run under Last Successful Run. For a detailed report of the test results, currently use idx.search and retrieve the JobStatus object. Job status reports are planned to be available within the dashboard later.
  • Schedule: See Schedule the Dataflow.
  • Duplicate: Useful for creating a new data flow based on a flow which has already been customized, if you wish to create a similar flow with slight variations.
  • Status: Displays the status of the current jobs running in your IdentitySync configuration.

 

4. Schedule the Dataflow

  1. Under Actions, click (schedule) to open the Scheduler. 
  2. Click Create Schedule
  3. Configure the schedule: 
    • Enter a schedule name
    • Change the start time as needed
    • Choose whether to run once or at scheduled intervals
    • "Pull all records" should usually be selected only in the first run, when migrating records from one database to the other, and in any case should be used with caution. If the checkbox is not selected, and this is the first time this dataflow is run, records will be pulled according to the following logic: 
      • If the dataflow is set to run once, all records from the last 24 hours will be pulled. 
      • If the dataflow is recurring, records will be pulled according to the defined frequency. For example, if the dataflow is set to run once a week, the first time it is run, it will pull all records from the previous week. 
    • (Optional) Enter the email address(es) for success and failure of the dataflow run. 
    • (Optional) Limit to a specific number of records. This is usually used for test runs: when running a test from the dashboard, a one-time schedule is created which runs immediately for 10 records. 
  4. Click Create, and, once you are back in the Schedule dashboard, click the Refresh button. 
  • The dashboard creates a Scheduling Object.
  • Last Successful Run corresponds with the lastRuntime parameter in the Dataflow Object
  • The Scheduled Next Run displays the newest date defined in the scheduler. Therefore, it's possible to see a past date here, if no more recent dates were configured.
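
The first-run behavior described under "Pull all records" above can be summarized as a small decision function. The Python sketch below restates that logic as written on this page. It is not IdentitySync's scheduler code, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def first_run_window_start(run_time, pull_all_records, interval=None):
    """Return the earliest record time pulled on a first run, or None for all records.

    interval is None for a one-time run, or the run frequency for a recurring one.
    """
    if pull_all_records:
        return None  # no lower bound: every record is pulled
    if interval is None:
        return run_time - timedelta(hours=24)  # one-time run: last 24 hours
    return run_time - interval  # recurring run: one frequency period back


run_time = datetime(2024, 1, 8, tzinfo=timezone.utc)

print(first_run_window_start(run_time, pull_all_records=True))
# None
print(first_run_window_start(run_time, pull_all_records=False))
# 2024-01-07 00:00:00+00:00
print(first_run_window_start(run_time, pull_all_records=False, interval=timedelta(weeks=1)))
# 2024-01-01 00:00:00+00:00
```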

Relevant IDX API Methods: 

 

Test and Monitor

Test Run

Test the data flow by clicking Run Test under Actions. This creates an immediate one-time run for 10 records. If the run was successful, after refreshing the dashboard (with the Refresh button) you will see its timestamp under Last Successful Run.

 

Job History

You can monitor data flows by reviewing previous runs (jobs). The job history displays the status of each run, its start and end times, and the number of records for which the data flow was completed successfully (under Processed). 

Under Actions, click the Status button to open the Job History screen.


 

For advanced debugging, click the info icon for the relevant job under Details, and the Job Status Detail screen opens.

 

You can use the idx.search method to receive job history data.

 

IP Whitelisting

Depending on your networking policies, you may have to add the IPs of IdentitySync servers to a whitelist in order to allow IdentitySync to upload/pull information.

The relevant IPs are:

EU Data Center:

  • 46.51.204.12
  • 54.76.191.69

US Data Center:

  • 52.73.203.224
  • 184.72.109.195

AU Data Center:

  • 54.66.139.77
  • 54.66.141.200

CN Data Center:

  • 139.196.87.192

RU Data Center:

  • Unsupported
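
When the whitelist is maintained in code or reviewed from logs, the addresses above can be kept in a small lookup table. The following Python sketch uses the standard ipaddress module to test whether a source address belongs to the IdentitySync servers of a given data center. The table layout and function name are assumptions for this example, and the addresses are copied from the lists above.

```python
import ipaddress

# IdentitySync server IPs per data center, as listed above.
IDENTITYSYNC_IPS = {
    "EU": ["46.51.204.12", "54.76.191.69"],
    "US": ["52.73.203.224", "184.72.109.195"],
    "AU": ["54.66.139.77", "54.66.141.200"],
    "CN": ["139.196.87.192"],
    "RU": [],  # listed as unsupported
}


def is_identitysync_ip(address, data_center):
    """True if address is one of the listed IdentitySync IPs for data_center."""
    allowed = {ipaddress.ip_address(ip) for ip in IDENTITYSYNC_IPS[data_center]}
    return ipaddress.ip_address(address) in allowed


print(is_identitysync_ip("46.51.204.12", "EU"))  # True
print(is_identitysync_ip("46.51.204.12", "US"))  # False
print(is_identitysync_ip("10.0.0.1", "AU"))      # False
```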
