IdentitySync


Overview

IdentitySync is Gigya's ETL (Extract, Transform, Load) solution for transferring data in bulk between platforms.

With IdentitySync, you can:

  • Export: Take all the permission-based social and profile identity information stored at Gigya and channel it into another platform, such as an ESP, CRM, or marketing automation system.
  • Import: Get up-to-date data from a third-party platform, such as a user's newsletter subscription status, survey responses or account balance, and sync existing Gigya user profiles or create new ones ad hoc.
  • Transfer users from one Gigya site to another.

IdentitySync is the engine that runs Gigya integrations with:

IdentitySync jobs can be carried out on a one-time basis, for example if migrating data, or they can be scheduled to run on a regular basis in order to keep your platforms synchronized.

Integrations

These are some of the integrations supported via IdentitySync: 

Use Cases

IdentitySync gives you the flexibility to use your data in any way you need. For example, with IdentitySync, the following scenarios are supported: 

  • Query the audit log to retrieve deleted users, and use a batch job to pass them to other external systems so that they can be deleted there as well, for data compliance reasons.
  • Retrieve all accounts that have remained unverified or unregistered for over a week (isVerified==false or isRegistered==false and created>'one week ago'), and export the relevant email addresses to an ESP, from which follow-up emails can be sent.
  • As a sports club, regularly import accounts from an external ticketing system, thus fortifying your fanbase. 
  • Set up data fields for segmenting users according to certain types of behavior in your site, then use an IdentitySync job to send only users that match these criteria to a marketing system for a targeted campaign. 
  • Use a Gigya-to-Gigya IdentitySync job to query users by Facebook likes stored in their profiles (for example, people who like vampire and zombie related content) and to plant a value (e.g. "horrorFic") in a Gigya data field. Then launch a gruesome Halloween marketing campaign targeting these users.
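The follow-up scenario above (accounts left unverified or unregistered for over a week) can be pictured as a simple predicate. This is only a minimal sketch and not how IdentitySync evaluates queries: it assumes the condition is read as "(unverified or unregistered) and created more than a week ago", that `created` is already a Python `datetime`, and that the address sits under a hypothetical `profile.email` layout.

```python
from datetime import datetime, timedelta, timezone


def is_stale_unconfirmed(account, now):
    """True if the account is unverified or unregistered AND older than one week."""
    unconfirmed = (not account.get("isVerified", False)
                   or not account.get("isRegistered", False))
    older_than_week = account["created"] < now - timedelta(days=7)
    return unconfirmed and older_than_week


def emails_to_follow_up(accounts, now):
    """Collect the email addresses of the accounts that match the filter."""
    return [a["profile"]["email"] for a in accounts if is_stale_unconfirmed(a, now)]


# Illustrative data only; the field layout is an assumption.
now = datetime(2024, 1, 15, tzinfo=timezone.utc)
accounts = [
    {"isVerified": False, "isRegistered": True,
     "created": now - timedelta(days=10), "profile": {"email": "old@example.com"}},
    {"isVerified": False, "isRegistered": True,
     "created": now - timedelta(days=2), "profile": {"email": "new@example.com"}},
    {"isVerified": True, "isRegistered": True,
     "created": now - timedelta(days=30), "profile": {"email": "done@example.com"}},
]
follow_up = emails_to_follow_up(accounts, now)
```

Only the first account qualifies here: the second is too recent, and the third is already verified and registered.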
     

Main Features 

IdentitySync supports a range of technologies, source and target platforms, and data transformations.

For full, up-to-date details of the service's capabilities, see the Component Repository.

Data Sources / Targets:

  • Gigya accounts and email accounts
  • Gigya Audit Log events including the Consent Vault
  • Gigya Data Store
  • FTP
  • SFTP
  • Amazon S3 cloud
  • Azure
  • ESP, DMP, CRM Platforms
  • Other platforms, using the generic API writer

Sample Transformations:

  • Reordering, renaming, and removing fields
  • Replacing strings within field values using regex
  • Using JEXL expressions to create new fields based on the values of existing fields
  • Flattening objects and array fields
  • PGP encryption and decryption (note that GPG is not supported)
  • MD5 and SHA2 password hashing
  • Changing date formats
  • Other custom transformations, using IdentitySync Custom Scripts

File Formats (main formats supported):

  • DSV
  • JSON
  • Salesforce, Krux and other formats
  • ZIP, GZIP, LZO
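To make two of the sample transformations listed above more concrete (renaming fields, and flattening objects and arrays), here is a minimal Python sketch of what such operations do to a single record. It is an illustration only: the helper names `rename_fields` and `flatten` are hypothetical, and the dotted-key output format is an assumption rather than the documented behavior of the IdentitySync components.

```python
def rename_fields(record, mapping):
    """Return a copy of the record with top-level keys renamed (old name -> new name)."""
    return {mapping.get(key, key): value for key, value in record.items()}


def flatten(record, parent="", sep="."):
    """Flatten nested objects and arrays into one level, joining key paths with `sep`."""
    flat = {}
    for key, value in record.items():
        path = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, path, sep))
        elif isinstance(value, list):
            for index, item in enumerate(value):
                item_path = f"{path}{sep}{index}"
                if isinstance(item, dict):
                    flat.update(flatten(item, item_path, sep))
                else:
                    # Scalars (and lists nested directly inside lists) are stored as-is.
                    flat[item_path] = item
        else:
            flat[path] = value
    return flat


record = {"UID": "abc", "profile": {"email": "a@example.com", "likes": ["x", "y"]}}
flat_record = flatten(rename_fields(record, {"UID": "uid"}))
```

In this sketch, `flat_record` holds the keys `uid`, `profile.email`, `profile.likes.0` and `profile.likes.1`.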

Building Blocks

Each IdentitySync job runs a dataflow. The building blocks of the dataflow are dedicated components. A component is a pre-configured unit that is used to perform a specific data integration operation. The components include readers, writers, transformers and lookups. Each component is responsible for performing a single task, such as:

  • Extracting accounts from Gigya based on specific parameters
  • Changing some field names
  • Creating a CSV file
  • Uploading a file to FTP
  • Writing data directly to a target platform or sending it to a generic API endpoint

Components can be added to the dataflow, removed or changed as needed.

For detailed information, visit the Component Repository and see sample Dataflow Templates.

 


Using IdentitySync

Object Model

 

Step

Steps are the building blocks of the dataflow. Each step is a call to a component that performs a specific task, such as extracting accounts from Gigya, or compressing a file in GZIP format. The step output is passed on to the next step in the dataflow for further processing. Each step includes the following attributes: 

  • id: the unique identifier of the step within a given dataflow (e.g., "Read from Gigya").
  • type: the component used in this step (see Component Repository) (e.g., "datasource.read.gigya.account").
  • params: a set of parameters for this step. 
  • next: an array containing one or more IDs of the next step(s) to be carried out.   
  • error: the next step to perform, in case of an error in the current step. For example, writing errors to a log. 
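The step attributes above lend themselves to a quick consistency check before a dataflow is saved. The following is a minimal sketch, not part of IdentitySync: `validate_steps` is a hypothetical helper, it treats only `id` and `type` as mandatory, and it accepts `error` either as a single step ID or as a list of IDs, since the text above does not specify which.

```python
from collections import Counter


def as_list(value):
    """Accept either a single step id or a list of ids."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def validate_steps(steps):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    counts = Counter(step.get("id") for step in steps)
    for step_id, count in counts.items():
        if step_id is not None and count > 1:
            problems.append(f"duplicate step id '{step_id}'")
    for step in steps:
        label = step.get("id", "<missing id>")
        for attribute in ("id", "type"):
            if attribute not in step:
                problems.append(f"{label}: missing '{attribute}'")
        # Every id referenced by 'next' or 'error' must belong to a step in the dataflow.
        for attribute in ("next", "error"):
            for target in as_list(step.get(attribute)):
                if target not in counts:
                    problems.append(
                        f"{label}: '{attribute}' refers to unknown step '{target}'")
    return problems


steps = [
    {"id": "account", "type": "datasource.read.gigya.account", "params": {}, "next": ["rename"]},
    {"id": "rename", "type": "field.rename", "params": {}, "next": ["importAccount"]},
]
problems = validate_steps(steps)
```

Here the check reports that `rename` points at `importAccount`, which is not defined in the list.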

Dataflow

A dataflow is a series of steps that comprises the complete definition of a transfer of information between Gigya and a third-party platform. A dataflow is also assigned a schedule, and may be executed once or repeatedly.
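Because each step names its successors in `next`, the overall shape of a dataflow can be derived by walking those references. The sketch below is a hypothetical helper, not an IdentitySync function. It assumes the first step in the list is the entry point and follows only the first `next` reference at each step, so branches are ignored.

```python
def outline(steps):
    """Join step ids in execution order with ' > ', following the first 'next' of each step."""
    if not steps:
        return ""
    by_id = {step["id"]: step for step in steps}
    visited = []
    current = steps[0]
    # Stop at the end of the chain, at an unknown id, or if a cycle is detected.
    while current is not None and current["id"] not in visited:
        visited.append(current["id"])
        following = current.get("next") or []
        current = by_id.get(following[0]) if following else None
    return " > ".join(visited)


steps = [
    {"id": "account", "type": "datasource.read.gigya.account", "next": ["rename"]},
    {"id": "rename", "type": "field.rename", "next": ["importAccount"]},
    {"id": "importAccount", "type": "datasource.write.gigya.importaccount", "next": []},
]
summary = outline(steps)
```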

Handling Errors

IdentitySync includes a built-in capability for separating failed records and writing them to a file, so that they can be reviewed, handled, and fed back into the flow.

For detailed instructions, follow the implementation flow below (under Edit the Dataflow). For a code sample of a flow that writes failed records to SFTP, see the Component Repository.

Note that IdentitySync jobs are scheduled in UTC time. Therefore, the platform participating in the flow should be set to the UTC timezone to ensure that file requests are handled properly.
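When planning a start time in local terms, it helps to convert it to UTC before entering it in the scheduler. A minimal sketch using only the Python standard library follows. The helper `to_utc` is hypothetical, and it takes a fixed UTC offset, so daylight saving changes are not handled.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_time, utc_offset_hours):
    """Interpret a naive local time at a fixed UTC offset and return it in UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return local_time.replace(tzinfo=local_zone).astimezone(timezone.utc)


# A job meant to start at 02:00 local time in a UTC+2 location
# corresponds to 00:00 UTC on the same day.
start_utc = to_utc(datetime(2024, 1, 15, 2, 0), 2)
```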

Implementation

To create an integration based on IdentitySync, complete the following process:

 


1. Create Dataflow

  1. Open the Dataflows in Gigya's Console. Make sure you are signed in and have selected the relevant site. The IdentitySync dashboard may also be accessed by clicking Settings in the upper menu and then Dataflows in the left menu.

  2. In the dashboard, click Create Data Flow.

  3. In the Create Data Flow window, select the data flow integration from the dropdown. If the flow you wish to create is not available in the dropdown, select any available flow: you customize it in the next steps. For more information, see Dataflow Templates.

  4. Select the data flow template. Note that at the bottom of this window, you can see an outline of the flow that will be created (e.g., Account > rename > dsv > gzip > sftp).

  5. Click Continue. As a result, the IdentitySync Studio screen opens in the dashboard. 

2. Edit the Data Flow

The data flow you created is built of the required steps for data transfer between Gigya and the selected vendor. Use the Component Repository to understand the structure and parameters required in each step.

Using IdentitySync Studio, you can:

  • Specify passwords, IDs, API keys etc. required for accessing each system and customer database.
  • Add the names of fields included in the data flow.
  • Flatten fields, remove non-ASCII strings, specify the compression type, parse in JSON or DSV format, etc. 
  • Map fields and extract array data, for example using field.array.extract.
  • Change the name of the data flow. 
  • Split a data flow, for example if you want to create two duplicate files and upload each file into a different destination. To do so, simply drag and drop the relevant step into the flow, and add connecting arrows as needed. In the code for the flow, this will be expressed in the next attribute, where you will find reference to the next two steps rather than just one. For a sample dataflow which employs this method, see the Epsilon Dataflow.
  • Add Custom Scripts.
  • Write failed records (those that did not complete the flow successfully) to a separate file for review.

To do so: 

  1. If it's more convenient, you can work in full screen mode by clicking the full-screen toggle in the top right corner.

  2. Double-click any of the steps to add or edit its parameters. Click OK when finished. 
  3. To add a new step, start typing its name in the Search component box. Drag the step from the list of components into the canvas.
  4. Drag arrows from/to the new step and from/to existing steps, to include it in the correct place in the flow. Make sure the "Success path" arrow is selected under Connector Type.
  5. To add a custom step, locate the record.evaluate step in the list of components and drag it to the canvas.
  6. Double-click the custom step to open a JavaScript editor. Click Test script to validate the code. For a full explanation of custom steps, see IdentitySync Custom Scripts.
  7. To split the data flow (for example to write to two target platforms), add the relevant step (e.g. another "write" step) and draw arrows accordingly.

  8. Handling failed records: You can add additional steps after a "writer" step, for writing to a separate file the records that did not complete the flow successfully. To do so: 
    1. Add the relevant components to the flow (for example, a file.format step to write the records to a file, and a writer to write the file to the relevant destination). 
    2. Under Connector Type, select the "Error path" connector. 
    3. Draw a connection from the original writer, to which successful records will be written, to the next step that handles failed records (e.g., the file.format step). 

    4. Under Connector Type, select the "Success path". 

    5. Connect the next steps that handle the failed records (e.g., the writer) using the "Success path" connector.

  9. If necessary, click Source to review the data flow code, and edit the code as needed.
  10. Click Save.
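In the dataflow code, the connectors drawn in the steps above end up as step attributes: a success path adds the target step ID to `next`, and an error path adds it to `error`. The sketch below models that in Python with a hypothetical `connect` helper. The step IDs and the `<component-type>` placeholder are illustrative, and representing `error` as a list is an assumption.

```python
def connect(steps_by_id, source_id, target_id, on_error=False):
    """Draw a connector: success paths go to 'next', error paths go to 'error'."""
    attribute = "error" if on_error else "next"
    targets = steps_by_id[source_id].setdefault(attribute, [])
    if target_id not in targets:
        targets.append(target_id)


# Hypothetical step ids; replace the placeholder type with real component names.
step_ids = ["format", "writeA", "writeB", "formatFailed", "writeFailed"]
steps_by_id = {
    step_id: {"id": step_id, "type": "<component-type>", "params": {}, "next": []}
    for step_id in step_ids
}

# Split the flow: the formatted file is written to two destinations.
connect(steps_by_id, "format", "writeA")
connect(steps_by_id, "format", "writeB")

# Failed records from the first writer are formatted, then written elsewhere.
connect(steps_by_id, "writeA", "formatFailed", on_error=True)
connect(steps_by_id, "formatFailed", "writeFailed")
```

After these calls, the `format` step lists two IDs in `next`, which is how a split appears in the source view.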

Your dashboard should now look something like this: 



Actions

Click the ellipsis for the Actions menu. The following actions are available: 

  • Edit: Opens the current data flow in IdentitySync Studio, where you can change any of its attributes, steps and parameters.
  • Run Test: Runs the data flow once on 10 records for test purposes. If the test was successful, after refreshing the dashboard, you will see the timestamp of the test run under Last Successful Run. Use the Status button to view the details of the run. See the Job History section on this page.
  • Schedule: See Schedule the Dataflow.
  • Duplicate: Useful for creating a new data flow based on a flow which has already been customized, if you wish to create a similar flow with slight variations.
  • Status: Displays the status of the current jobs running in your IdentitySync configuration. See the Job History section on this page.
  • Delete: Deletes this data flow.

 

3. Schedule the Dataflow

  1. Under Actions, select Scheduler.
  2. Click Create Schedule.
  3. Configure the schedule: 
    • Enter a schedule name
    • Change the start time as needed
    • Choose the log level: 
      • Error: Only error logs will be displayed in the job trace
      • Info: Info and error logs will be displayed in the trace
      • Debug: Besides info and error logs, each record will be logged between every 2 steps. This should be used only if the dataflow is not working as expected, and is limited to a batch of 3 records. 

        - The log level does not affect the step metrics and errors (see Test and Monitor below).

        - The job trace is limited to 1000 entries per job.

    • Choose whether to run once or at scheduled intervals
    • "Pull all records" should usually be selected only in the first run, when migrating records from one database to the other, and in any case should be used with caution. If the checkbox is not selected, and this is the first time this dataflow is run, records will be pulled according to the following logic: 
      • If the dataflow is set to run once, all records from the last 24 hours will be pulled. 
      • If the dataflow is recurring, records will be pulled according to the defined frequency. For example, if the dataflow is set to run once a week, the first time it is run, it will pull all records from the previous week. 
    • (Optional) Enter the email address(es) to notify on success and on failure of the dataflow run. Use commas to separate multiple addresses.
    • (Optional) Limit to a specific number of records. This is usually used for test runs: when running a test from the dashboard, a one-time schedule is created which runs immediately for 10 records. 
  4. Click Create, and, once you are back in the Schedule dashboard, click the Refresh button. 
  5. The status of the scheduling is indicated in the Status column. 
  6. You can stop a job mid-run by clicking the Stop icon under Actions.


  • The dashboard creates a Scheduling Object.
  • Last Successful Run corresponds with the lastRuntime parameter in the Dataflow Object.
  • The Scheduled Next Run displays the newest date defined in the scheduler. Therefore, it's possible to see a past date here, if no more recent dates were configured.
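The first-run logic described above for when "Pull all records" is not selected can be summarized in a few lines. This is a minimal sketch of the stated rule only: `first_run_window_start` is a hypothetical helper, and expressing the frequency as a `timedelta` is an assumption made for illustration.

```python
from datetime import datetime, timedelta


def first_run_window_start(now, run_once, interval=None):
    """Return the earliest timestamp from which records are pulled on the first run."""
    if run_once:
        # A one-time dataflow pulls records from the last 24 hours.
        return now - timedelta(hours=24)
    if interval is None:
        raise ValueError("a recurring dataflow needs an interval")
    # A recurring dataflow looks back by one scheduling interval.
    return now - interval


now = datetime(2024, 1, 15, 12, 0)
once_start = first_run_window_start(now, run_once=True)
weekly_start = first_run_window_start(now, run_once=False, interval=timedelta(weeks=1))
```

With these inputs, the one-time run starts from noon on January 14 and the weekly run from noon on January 8.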


 

Test and Monitor

Test Run

Test the data flow by clicking Run Test under Actions. This creates an immediate one-time run for 10 records. If the run was successful, after refreshing the dashboard (with the Refresh button) you will see its timestamp under Last Successful Run.


Notification Email

When scheduling the dataflow, you can enter email addresses to which a success and/or failure notification will be sent. We recommend adding idx-failure-jobs@gigya-inc.com to the list of failure notification email addresses, so that Gigya will receive feedback of system health. 

 

Job History

You can monitor data flows by reviewing previous runs (jobs). The job history displays the status of each run, its start and end times, and the number of records for which the data flow was completed successfully (under Processed). 

Under Actions, click the Status button to open the Job History screen.


 

For advanced monitoring and debugging, click the info icon for the relevant job under Details to open the Job Status Details screen.

Note the tabs that display the following detailed information: 

  • Trace: Contains a detailed trace of the job execution, including the log level and timestamp of each log message. 

    The job trace is limited to 1000 records. The log level is defined when scheduling the job.

  • Step metrics: Displays the following metrics for each step: Duration, Input, Output and Errors. Using step metrics, you can identify the bottlenecks of a job that took a long time to run, review performance issues, and monitor the number of records that completed the flow.
  • Errors: Displays details of the errors that occurred during the job execution.
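If the step metrics are copied out of the Job Status Details screen, finding the bottleneck is a matter of picking the step with the longest duration. The sketch below assumes a hand-built list of dictionaries. The key names (`step`, `duration`, `input`, `output`, `errors`) and the sample values are hypothetical and do not reflect an IdentitySync export format.

```python
def slowest_step(metrics):
    """Return the id of the step with the longest duration."""
    return max(metrics, key=lambda entry: entry["duration"])["step"]


def steps_with_errors(metrics):
    """Return the ids of the steps that reported at least one error."""
    return [entry["step"] for entry in metrics if entry["errors"] > 0]


# Illustrative values only; durations are in seconds.
metrics = [
    {"step": "account", "duration": 42.0, "input": 0, "output": 1000, "errors": 0},
    {"step": "rename", "duration": 1.5, "input": 1000, "output": 1000, "errors": 0},
    {"step": "importAccount", "duration": 310.0, "input": 1000, "output": 988, "errors": 12},
]
bottleneck = slowest_step(metrics)
failing = steps_with_errors(metrics)
```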

Copying Accounts From One Site to Another

IdentitySync gives you the option of copying the account database from one Gigya site to another, using the read from Gigya and write to Gigya components. When doing so:

  • The source and target sites should belong to the same data center. 
  • Make sure all the fields being written to the target site exist on that site's schema. 
  • Policies should be identical on both sites. The email verification policy needs to be disabled on the target site for the import to work (this one policy may be configured differently on the source site).
  • Schema and site configuration should be completed prior to the user import. An easy way to do this is using the Configuration Copy Tool.
  • The job should be set up on the target site.
  • On the source site, you should create an application, and use its credentials in the dataflow. We recommend using a high-rate application key. For more information, see Signing Requests to SAP Customer Data Cloud.
  • When setting up the import job, update the "account" step to include the source API key, the user key and secret key.

Any operation involving importing accounts can be quite complex. If your use case does not fall within the parameters defined above, SAP strongly recommends contacting your Customer Engagement Executive to scope an engagement with SAP’s digital services consulting team.
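One of the checks listed above, that every field being written exists on the target site's schema, can be roughed out before running the job. The sketch below compares the comma-separated `select` value of the reader step against a set of field names known to exist on the target. The helper `missing_fields` is hypothetical, the schema set is assumed to be assembled by hand, and only top-level names are compared.

```python
def missing_fields(select, target_schema_fields):
    """Return the selected field names that are absent from the target schema set."""
    requested = [name.strip() for name in select.split(",") if name.strip()]
    return [name for name in requested if name not in target_schema_fields]


select = "UID,created,data,emails,profile,subscriptions,lang"
# Hypothetical list of top-level fields available on the target site.
target_fields = {"UID", "created", "data", "emails", "profile", "subscriptions"}
missing = missing_fields(select, target_fields)
```

In this example `lang` is reported as missing, so it would need to be added to the target schema or dropped from the selection.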

Sample dataflow for copying site accounts:
{
"name": "Copy Accounts",
"description": "account > rename > importAccount",
"steps": [
  {
   "id": "account",
   "type": "datasource.read.gigya.account",
   "params": {
    "select": "UID,created,data,emails,identities,isActive,isRegistered,isVerified,lastLogin,lastUpdated,loginIDs,password,preferences,profile,registered,regSource,subscriptions,verified,lang",
    "from": "accounts",
    "deltaField": "lastUpdatedTimestamp",
    "keepFieldNamesWithDotAsIs": false,
    "batchSize": 300,
    "maxConcurrency": 1,
    "apiKey": "<the-source-API-key>",
    "userKey": "<a-high-rate-application-key-that-has-read-permissions-on-the-source-API-key>",
    "secret": "<the-corresponding-application-secret-key>"
   },
   "next": [
    "rename"
   ]
  },
  {
   "id": "rename",
   "type": "field.rename",
   "params": {
    "fields": [
     {
      "sourceField": "UID",
      "targetField": "uid"
     },
     {
      "sourceField": "password.hash",
      "targetField": "password.hashedPassword"
     }
    ]
   },
   "next": [
    "importAccount"
   ]
  },
  {
   "id": "importAccount",
   "type": "datasource.write.gigya.importaccount",
   "params": {
    "importPolicy": "insert",
    "handleIdentityConflicts": "error",
    "maxConnections": 3,
    "addResponse": false
   },
   "next": []
  }
]
}

 


 

IP Whitelisting

Depending on your networking policies, you may have to add the IPs of IdentitySync servers to a whitelist in order to allow IdentitySync to upload/pull information.

The full list of Gigya IPs is listed here. In addition, the following addresses are related to IdentitySync:

EU Data Center:

  • 46.51.204.12
  • 54.76.191.69

US Data Center:

  • 52.204.240.189
  • 18.204.248.129

AU Data Center:

  • 54.66.139.77
  • 54.66.141.200

CN Data Center:

  • 101.132.236.215

RU Data Center:

  • 95.213.253.43
  • 95.213.238.43
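When maintaining an allow-list in a script or firewall configuration, the addresses above can be kept in one structure and checked with Python's standard `ipaddress` module. This is a minimal sketch: the dictionary simply restates the IdentitySync addresses listed above, and `is_identitysync_ip` is a hypothetical helper name.

```python
import ipaddress

# IdentitySync addresses per data center, as listed above.
IDENTITYSYNC_IPS = {
    "EU": ["46.51.204.12", "54.76.191.69"],
    "US": ["52.204.240.189", "18.204.248.129"],
    "AU": ["54.66.139.77", "54.66.141.200"],
    "CN": ["101.132.236.215"],
    "RU": ["95.213.253.43", "95.213.238.43"],
}


def is_identitysync_ip(address, data_center):
    """True if the address is one of the IdentitySync IPs of the given data center."""
    candidate = ipaddress.ip_address(address)
    return any(candidate == ipaddress.ip_address(ip)
               for ip in IDENTITYSYNC_IPS[data_center])


eu_match = is_identitysync_ip("54.76.191.69", "EU")
us_match = is_identitysync_ip("54.76.191.69", "US")
```

The same address matches for the EU data center but not for the US one, so the check should use the data center that hosts your site.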
