Component Repository


Overview

In IdentitySync, a component is a pre-configured unit used to perform a specific data integration operation. Components include readers, writers, transformers, and lookups. This document lists the components available in IdentitySync; use them when defining Dataflows.

Connectors

The components of a data flow are connected by adding a 'next' indicator or, when creating a flow in IdentitySync Studio, by adding a "Success path" between them. You can also add an "Error path" connector for separate handling of records that failed to complete the flow successfully. In the source code of the data flow, this is done by adding an 'error' reference after a given component.

An 'error' indicator can only be added after a datasource writer.

Error handling code sample:
{
    "name": "Import Accounts",
    "description": "Import accounts from SFTP to Gigya",
    "steps": [{
            "id": "sftp - reader",
            "type": "datasource.read.sftp",
            "params": {
                "host": "...",
                "username": "...",
                "password": "...",
                "fileNameRegex": "accounts.json",
                "port": 22,
                "sortBy": "time",
                "sortOrder": "ASC"
            },
            "next": ["parse"]
        }, {
            "id": "parse",
            "type": "file.parse.json",
            "next": ["rename"]
        }, {
            "id": "rename",
            "type": "field.rename",
            "params": {
                "fields": [{
                        "sourceField": "password.compoundHash",
                        "targetField": "compoundHashedPassword"
                    }, {
                        "sourceField": "profile.email",
                        "targetField": "email"
                    }
                ]
            },
            "next": ["writer"]
        }, {
            "id": "writer",
            "type": "datasource.write.gigya.generic",
            "params": {
                "apiMethod": "accounts.importAccount",
                "apiParams": [{
                        "sourceField": "UID",
                        "paramName": "UID"
                    }, {
                        "sourceField": "email",
                        "paramName": "email"
                    }, {
                        "sourceField": "compoundHashedPassword",
                        "paramName": "compoundHashedPassword"
                    }
                ],
                "maxConnections": 10
            },
            "error": ["file.format.json"]
        }, {
            "id": "file.format.json",
            "type": "file.format.json",
            "params": {
                "fileName": "invalid_accounts.json",
                "createEmptyFile": false
            },
            "next": ["sftp - writer"]
        }, {
            "id": "sftp - writer",
            "type": "datasource.write.sftp",
            "params": {
                "host": "...",
                "username": "...",
                "port": 22,
                "password": "...",
                "temporaryUploadExtension": false
            }
        }
    ]
}

Value Placeholders

In some components, you can use value placeholders to dynamically insert values:

  • apiKey: The API key of the site where the data transfer is performed. Supported in file names and field evaluation.

  • now (+date format): A timestamp of the time at which the record was processed. Supported in file names only.

  • unix: The timestamp of the time at which the record was processed, in seconds. Example: 1543243596. Supported in file names and field evaluation.

  • jobId: The ID of the current job. Supported in file names and field evaluation.

For example, using the "now" placeholder to generate a file name: 

VendorName_${now:yyyyMMddssSSS}.csv
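
For instance, the following sketch shows a file.format.dsv step that uses the "now" placeholder when generating the output file name (the step ID and the other parameter values are illustrative):

{
    "id" : "dsv",
    "type" : "file.format.dsv",
    "params" : {
        "fileName" : "VendorName_${now:yyyyMMddssSSS}.csv",
        "columnSeparator" : ",",
        "quoteFields" : true
    },
    "next" : ["..."]
}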

 

Component Guide


Datasource

datasource components perform actions on a data platform – either extracting data from it or uploading data to it.

  • datasource.read components extract data from a platform. Every dataflow begins with one of these components. Note that for components that read a file (from FTP, Azure, Amazon etc.), the maximum size of this file is 2 GB.
  • datasource.write components upload data to a target platform. These would typically be the last step in a dataflow. Writers can be followed by an "error" connector, which captures failed records.

  • datasource.lookup components extract data from a platform based on a field from another data source.
  • datasource.delete components delete information from the target platform (e.g., a user record).

Note that IdentitySync jobs are scheduled in UTC time. Therefore, the platform participating in the flow should be set to the UTC timezone to ensure that file requests are handled properly.

 datasource.delete.hybrismarketing

Deletes a contact (end-user) from the SAP Hybris Marketing system, following a deletion from the Gigya database.

Parameters:

  • contactIdField (string): The field that holds the unique identifier (e.g., the Gigya UID).
  • endpoint (string): The SAP Hybris Marketing authentication endpoint.
  • username (string): The username of the SAP Hybris Marketing admin.
  • password (string): The password of the SAP Hybris Marketing admin.
  • maxConnections (integer, default: 1): The maximum number of concurrent connections to open. Accepts values between 1 and 100.
Sample deletion flow
 {
    "id" : "...",
    "name" : "yMarketing - Accounts Delete",
    "description" : "audit > delete",
    "steps" : [{
            "id" : "audit",
            "type" : "datasource.read.gigya.audit",
            "params" : {
                "select" : "uid",
                "where" : "action='RightToBeForgotten'",
                "deltaField" : "@timestamp",
                "from" : "auditLogEvent" 
            },
            "next" : [
                "delete"
            ]
        }, {
            "id" : "delete",
            "type" : "datasource.delete.hybrismarketing",
            "params" : {
                "username" : "...",
                "password" : "...",
                "endpoint" : "https://my300011-api.s4hana.ondemand.com/sap/opu/odata/sap/API_MKT_CONTACT_SRV",
                "contactIdField" : "uid"
                "maxConnections" : 1
            }
        }
    ]
}

 

 

 datasource.lookup.gigya.account

In an import scenario, use this component to check if certain parameter values of a given user record already exist in Gigya's database (for example, an email address), and to decide what to do with the record. A lookup query is built using a list of source fields, and the Gigya fields to which they are compared. If a record does not exist on Gigya's platform ("mismatch"), a decision is made regarding that record - whether to reject it or push it forward in the pipeline. If the record does exist and some fields conflict, the fields are handled according to the specified behavior.
This component calls the accounts.search method.

See code sample below.

Parameters:

  • select (string): A comma-separated list of fields to retrieve from Gigya's account database. When a record exists both in the source and in Gigya's platform, the fields included in this parameter are the ones merged. You can use an asterisk ( * ) to retrieve all fields, or retrieve an entire namespace (e.g., 'profile'). For example:

     "select" : "UID, profile",

  • lookupFields (array of objects): An array of objects, where each object contains a 'sourceField' and a 'gigyaField' property. Together with lookupFieldsOperator, this is used to build a lookup query. For example:

     "lookupFields" : [{
                         "sourceField" : "campaign_email",
                         "gigyaField" : "profile.email"
                     }
                 ],

  • lookupFieldsOperator (string, default: OR): The operator used between the lookup field objects. Acceptable values: AND, OR. Note that if lookupFields contains only one object, the operator has no effect (but is still mandatory).

  • mismatchBehavior (string, default: process): What to do when there is no match between the source field and the Gigya field. Acceptable values:

      • process: Continue processing the record in the IdentitySync flow
      • error: Do not process this record, and fire an error
      • remove: Remove this record from the flow, with no errors

    For example, if you are using only the "email" fields to perform a lookup and the source email does not match any email in Gigya's database, a "process" value means the record will continue down the pipeline (e.g., will be written to Gigya's database).

  • handleFieldConflicts (string): What to do when the same field has a value in both the source and Gigya's platform. Note that a conflict may occur only if the source fields and Gigya's fields have the same names (e.g., when reading a file in Gigya's format, or after a rename step). Acceptable values:

      • error: Do not process this record, and fire an error
      • take_source: Use the source value for this field, and discard Gigya's
      • take_lookup: Use Gigya's value for this field, and discard the source's

  • batchSize (integer, default: 200): The maximum number of records to aggregate before retrieving records from Gigya's database. Accepts values between 50 and 200.

  • maxConcurrency (integer, default: 1): The maximum number of threads to allocate when executing a search in parallel. Accepts values between 1 and 16.
{
    "id" : "...",
    "name" : "Importing accounts from SFTP to Gigya",
    "steps" : [{
            "id" : "sftp",
            "type" : "datasource.read.sftp",
            "params" : {
                "host" : "",
                "username" : "",
                "password" : "",
                "remotePath" : ""
            },
            "next" : [
                "json-parse"
            ]
        }, {
            "id" : "json-parse",
            "type" : "file.parse.json",
            "next" : [
                "account-lookup"
            ]
        }, {
            "id" : "account-lookup",
            "type" : "datasource.lookup.gigya.account",
            "params" : {
                "select" : "UID, profile",
                "handleFieldConflicts" : "take_source",
                "mismatchBehavior" : "process",
                "lookupFields" : [{
                        "sourceField" : "campaign_email",
                        "gigyaField" : "profile.email"
                    }
                ],
                "lookupFieldsOperator" : "AND",
                "batchSize" : 100,
                "maxConcurrency" : 4
            },
            "next" : [
                "import-account"
            ]
        }, {
            "id": "import-account",
            "type": "datasource.write.gigya.importaccount",
            "params": {
                "importPolicy": "insert",
                "handleIdentityConflicts": "connect",
                "maxConnections": 10
            }
        }
    ]
}

 datasource.lookup.gigya.gm

If you have implemented the Loyalty platform, use this component to retrieve the current status of a user in each of the specified challenges since the last successful execution of the dataflow. The component makes use of the gm.getChallengeStatus REST API. For a sample dataflow which extracts data both from Gigya's accounts and from Loyalty and loads into SFTP, see Game Mechanics Dataflow.

Parameters:

  • includeChallenges (array): A comma-separated list of challenges to include in the response.
  • excludeChallenges (array): A comma-separated list of challenges to exclude from the response.
  • maxConnections (integer, default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
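
As an illustrative sketch (assuming the parameter accepts a JSON array of challenge names; the step ID and challenge name are placeholders), a lookup step might look like:

{
    "id" : "gm-lookup",
    "type" : "datasource.lookup.gigya.gm",
    "params" : {
        "includeChallenges" : ["myChallenge"],
        "maxConnections" : 10
    },
    "next" : ["..."]
}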

 datasource.read.amazon.s3

Extract objects from Amazon's Simple Storage Service. Note that the maximum file size to read is 2 GB.

RequiredParameter NameTypeDefaultDescription

bucketName

string

 

The S3 bucket to access.

accessKey

string

 

The AWS Access Key, which identifies the user who owns the account. No two accounts can have the same AWS Access Key.

secretKey

string

 

The Secret Key that accompanies the Access Key.

objectKeyPrefix

string

The bucket root directory

The remote directory to retrieve files from.

fileNameRegexstring A regular expression (regex) used for filtering files by name.
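
For illustration, a reader step for this component might be sketched as follows (all values are placeholders):

{
    "id" : "s3-read",
    "type" : "datasource.read.amazon.s3",
    "params" : {
        "bucketName" : "...",
        "accessKey" : "...",
        "secretKey" : "...",
        "objectKeyPrefix" : "exports/",
        "fileNameRegex" : "accounts.*\\.json"
    },
    "next" : ["..."]
}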

 datasource.read.azure.blob

Extract data "blobs" from the Azure Blob cloud storage. Blobs are extracted by container and (optionally) filtered by blob name prefix. Only blobs that have changed since the last run of the dataflow will be extracted. Note that the maximum file size to read is 2 GB.

RequiredParameter NameTypeDefaultDescription

accountName

string 

The name of the Azure account.

accountKey

string 

The Azure account key.

container

string 

The name of the container in the account from which to extract the data.

 

blobPrefix

string 

If this parameter is specified, only blobs whose names begin with this prefix will be extracted.
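
A minimal reader step sketch for this component (placeholder values; the prefix is illustrative):

{
    "id" : "azure-read",
    "type" : "datasource.read.azure.blob",
    "params" : {
        "accountName" : "...",
        "accountKey" : "...",
        "container" : "...",
        "blobPrefix" : "gigya/"
    },
    "next" : ["..."]
}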

 datasource.read.azure.blob_token

Extract data "blobs" from the Azure Blob cloud storage using an access token.

Parameters:

  • baseUri (string): A StorageUri object that represents the Blob service endpoint used to create the client.
  • accessToken (string): The access token to the Azure API.
  • container (string): The name of the container from which to read the data.
  • blobPrefix (string): If this parameter is specified, only blobs whose names begin with this prefix will be extracted.

 datasource.read.azure.sas

Read files from the Azure Blob cloud storage using shared access signature. The process involves receiving a login token and an authentication token from Azure. Note that the maximum file size to read is 2 GB.

Parameters:

  • authenticationUrl (string): The authentication URL from which to receive a shared access token.
  • blobUri (string): The blob URI.
  • clientId (string): The client ID.
  • clientSecret (string): The client secret.
  • loginUrl (string): The login URL for processing Azure Active Directory authentication.
  • blobPrefix (string): If specified, only blobs whose names begin with this prefix will be extracted.

 datasource.read.campaignmonitor

Read subscription data from Campaign Monitor.

Parameters:

  • apiKey (string): The Campaign Monitor API key.
  • listId (string): The ID of the Campaign Monitor list from which to retrieve subscriber records.
  • batchSize (integer, default: 1000): The maximum number of records to retrieve. Accepts values between 1 and 1000.
  • maxConnections (integer, default: 10): The maximum number of connections that can be opened concurrently. Accepts values from 1 to 100.
  • status (string): A comma-separated list of the Campaign Monitor subscription statuses to include.
  • targetField (string): The destination field with which to populate the subscription status.
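
As a sketch, a reader step for this component could look like the following (the list ID, status values and target field name are illustrative assumptions):

{
    "id" : "cm-read",
    "type" : "datasource.read.campaignmonitor",
    "params" : {
        "apiKey" : "...",
        "listId" : "...",
        "status" : "Active,Unsubscribed",
        "targetField" : "newsletterField"
    },
    "next" : ["..."]
}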

 datasource.read.constantcontact

 Read subscription data from Constant Contact. For a sample dataflow, see Constant Contact Dataflow - Inbound.

Parameters:

  • apiKey (string): The Constant Contact API key. See the Constant Contact documentation.
  • accessToken (string): The Constant Contact access token. See the Constant Contact documentation.
  • listId (string): The ID of the list from which subscribers are read.
  • baseUrl (string, default: https://api.constantcontact.com/v2): The Constant Contact base URL.
  • rateLimit (integer, default: 20): The number of API calls made to Constant Contact per second. Accepts values between 1 and 20.
  • batchSize (integer, default: 50): The maximum number of records to return. Accepts values between 1 and 500.
  • timeout (integer, default: 60): The number of seconds to wait for a response from Constant Contact.

 datasource.read.gigya.account

Extract Account data from Gigya.

  • By default, the component extracts accounts that have been updated (modified) since the last time the dataflow was executed successfully. This behavior may be changed using the deltaField parameter.
  • Additional conditions may be added using the where parameter.

This component calls the accounts.search method.

Parameters:

  • select (string): A comma-separated list of fields to extract. For example, enter "UID, profile.firstName" to extract the 'UID' and 'profile.firstName' fields from the Account object.

  • from (string, default: accounts): The name of the data source.

  • where (string): An SQL-like WHERE clause to filter Account objects.

    You can add an indication of time to the query in a highly flexible manner, to retrieve only records that were updated in a specific time period. For example, to retrieve users that started but did not complete the registration process in the last hour, the following query applies (see also a full dataflow below):

    "where" : "isRegistered = false AND createdTimestamp < ${now-1h}"

    The supported time units are:

      • y (years)
      • M (months)
      • w (weeks)
      • d (days)
      • h (hours)
      • m (minutes)
      • s (seconds)

    The time indicator can be used only when the schedule is set to retrieve all records, i.e. fullExtract is set to 'true'.

  • deltaField (string, default: lastUpdatedTimestamp): The timestamp field to filter accounts by. The component will extract accounts where this timestamp is newer than the timestamp of the last successful run of the dataflow. Accepted values:

      • lastUpdatedTimestamp
      • lastLoginTimestamp
      • oldestDataUpdatedTimestamp
      • createdTimestamp
      • verifiedTimestamp
      • registeredTimestamp

  • consent (array): For customers who are using Customer Consent, an array of consent statements and their statuses, by which to filter the queried users. For example, you can retrieve only users who have a valid consent statement assigned to their account. Each statement requires both of the following parameters:

      • id (string): the identifier of the consent statement.
      • status (string): the status of this statement in the user's account. Possible values are:
          • valid: the user has given consent to this statement and that consent is still valid.
          • expired: the user has given consent to this statement in the past, but due to a version change the consent has expired.
          • notGranted: the user has never given consent to this statement, or has withdrawn that consent.

  • maxConcurrency (integer, default: 1): The maximum number of threads to allocate when executing a search in parallel. Accepts values between 1 and 16. The default value of 1 is the best practice for most scenarios; unless performance is a critical consideration for the job you are running, this parameter should not be changed.

  • apiKey (string): When copying accounts from one site to another, the API key of the site from which to read the data. All three parameters are required when copying site accounts: apiKey, userKey and secret.

  • userKey (string): When copying accounts from one site to another, the user or application key used for authentication. All three parameters are required when copying site accounts: apiKey, userKey and secret.

  • secret (string): When copying accounts from one site to another, the user or application secret used for authentication. All three parameters are required when copying site accounts: apiKey, userKey and secret.


Following is a sample dataflow that demonstrates using datasource.read.gigya.account to load subscription status from Gigya into Mailchimp:

{
    "id" : "",
    "name" : "Subscription management",
    "description" : "account > rename > mailchimp",
    "steps" : [{
            "id" : "account",
            "type" : "datasource.read.gigya.account",
            "params" : {
                "select" : "subscriptions.mySub.email.isSubscribed",
                "from" : "emailAccounts"
            },
            "next" : ["rename"]
        }, {
            "id" : "rename",
            "type" : "field.rename",
            "params" : {
                "fields" : [{
                        "sourceField" : "subscriptions.mySub.email.isSubscribed",
                        "targetField" : "newsletterField"
                    }
                ]
            },
            "next" : ["mailchimp"]
        }, {
            "id" : "mailchimp",
            "type" : "datasource.write.mailchimp",
            "params" : {
                "apiUrl" : "https://<dc>.api.mailchimp.com/3.0/",
                "apiKey" : "",
                "listId" : "",
                "newsletterField" : "newsletterField"
            }
        }
    ]
}

 

The following dataflow demonstrates using flexible time notation in the WHERE clause:

{
    "name" : "Fetch unregistered accounts",
    "description" : "account > delete",
    "steps" : [{
            "id" : "account",
            "type" : "datasource.read.gigya.account",
            "params" : {
                "select" : "UID,isRegistered,createdTimestamp"
                "where" : "isRegistered = false AND createdTimestamp < ${now-1h}"
            },
            "next" : ["remove"]
        }, {
            "id" : "remove",
            "type" : "field.remove",
            "params" : {
                "fields" : ["isRegistered","createdTimestamp"]
            },
            "next" : ["dsv"]
        }, {
            "id" : "dsv",
            "type" : "file.format.dsv",
            "params" : {
                "fileName" : "GIGYA_TO_SFTP_${now}.csv",
                "columnSeparator" : ",",
                "quoteFields" : true
            },
            "next" : ["gzip"]
        }, {
            "id" : "gzip",
            "type" : "file.compress.gzip",
            "next" : ["sftp"]
        }, {
            "id" : "sftp",
            "type" : "datasource.write.sftp",
            "params" : {
                "host" : "idx-etl",
                "username" : "idx",
                "password" : "",
                "remotePath" : "test"
            }
        }
    ]
}

 

The following dataflow demonstrates retrieving accounts based on 3 different statuses of consent statements:

{
 "name": "Export from Gigya to SFTP",
 "description": "account > rename > dsv > sftp",
 "steps": [
  {
   "id": "account",
   "type": "datasource.read.gigya.account",
   "params": {
    "select": "profile.email,profile.lastName,profile.firstName,profile.gender",
    "batchSize": 300,
    "from": "accounts",
    "deltaField": "lastUpdatedTimestamp",
    "maxConcurrency": 1,
    "consent":[
      {
        "id":"terms.tos",
        "status":"valid"
      },
      {
        "id":"terms.allowAds",
        "status":"expired"
      },
      {         
        "id":"terms.optionalConsent",
         "status":"notGranted"       
      }
    ]
   },
   "next": [
    "rename"
   ]
  },
  {
   "id": "rename",
   "type": "field.rename",
   "params": {
    "fields": [
     {
      "sourceField": "profile.email",
      "targetField": "MAIL"
     },
     {
      "sourceField": "profile.lastName",
      "targetField": "NAME"
     },
     {
      "sourceField": "profile.firstName",
      "targetField": "FIRSTNAME"
     },
     {
      "sourceField": "profile.gender",
      "targetField": "GENDER"
     }
    ]
   },
   "next": [  
    "dsv"
   ]
  },
  {
   "id": "dsv",
   "type": "file.format.dsv",
   "params": {
    "fileName": "GIGYA_TO_SFTP_${now}.csv",
    "columnSeparator": ",",
    "quoteFields": true,
    "writeHeader": true,
    "lineEnd": "\n",
    "createEmptyFile": false
   },
   "next": [
    "sftp"
   ]
  },
  {
   "id": "sftp",
   "type": "datasource.write.sftp",
   "params": {
    "host": "...",
    "username": "...",
    "password": "...",
    "remotePath": "GigyaDaily",
    "port": 22,
    }
  }
 ]
}

 datasource.read.gigya.audit

 Search Gigya's audit log and retrieve items using an SQL-like query. This component calls the audit.search method.

Parameters:

  • select (string): A comma-separated list of fields to extract.
  • from (string, default: audit): The audit log from which to fetch the data.
  • deltaField (string, default: @timestamp): The timestamp field by which the audit log query is filtered. The component will extract audit log entries whose timestamp is newer than that of the last successful dataflow run.
  • where (string): An SQL-like WHERE clause by which to filter the objects in the audit log.

 datasource.read.gigya.ds

 Search for and retrieve data from Gigya's Data Store (DS) using an SQL-like query. This component calls the ds.search method.

Parameters:

  • select (string): A comma-separated list of fields to extract.

  • from (string): The name of the data source.

  • where (string): An SQL-like WHERE clause to filter DS objects.

  • deltaField (string, default: lastUpdatedTimestamp): The timestamp field by which to filter records. The component will extract records where this timestamp is newer than the timestamp of the last successful run of the dataflow. Accepted values:

      • lastUpdatedTimestamp
      • lastLoginTimestamp
      • oldestDataUpdatedTimestamp
      • createdTimestamp
      • verifiedTimestamp
      • registeredTimestamp

  • consent (array): For customers who are using Customer Consent, an array of consent statements and their statuses, by which to filter the queried users. For example, you can retrieve only users who have a valid consent statement assigned to their account. Each statement requires both of the following parameters:

      • id (string): the identifier of the consent statement.
      • status (string): the status of this statement in the user's account. Possible values are:
          • valid: the user has given consent to this statement and that consent is still valid.
          • expired: the user has given consent to this statement in the past, but due to a version change the consent has expired.
          • notGranted: the user has never given consent to this statement, or has withdrawn that consent.

  • maxConcurrency (integer, default: 1): The maximum number of threads to allocate when executing a search in parallel. Accepts values between 1 and 16. The default value of 1 is the best practice for most scenarios; unless performance is a critical consideration for the job you are running, this parameter should not be changed.
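
As an illustrative sketch, a Data Store reader step might look like the following (the data source name and selected fields are assumptions):

{
    "id" : "ds-read",
    "type" : "datasource.read.gigya.ds",
    "params" : {
        "select" : "oid, data.subscribed",
        "from" : "myObjectType",
        "deltaField" : "lastUpdatedTimestamp"
    },
    "next" : ["..."]
}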

 datasource.read.gigya.comment

Retrieves comments from Gigya's database. This component calls the comments.search method.

Parameters:

  • select (string): A comma-separated list of fields to extract.
  • from (string, default: comments): The data source from which to retrieve comments. Acceptable values are:
      • comments
      • streams
  • where (string): An SQL-like WHERE clause by which to filter the objects in the selected data source.
  • maxConcurrency (integer, default: 1): The maximum number of threads to allocate when executing a search in parallel. Accepts values between 1 and 16.
  • batchSize (integer, default: 100): The maximum number of records to return. Accepts values between 50 and 300.

 datasource.read.ftp

Extract files from an FTP server. Note that the maximum file size to read is 2 GB.

Parameters:

  • host (string): The remote FTP host.
  • username (string): The remote FTP username.
  • password (string): The remote FTP password.
  • encryption (string, default: "none"): The encryption type to use when transferring the data. Accepted values:
      • "none"
      • "ssl"
      • "tls"
  • fileNameRegex (string): A regular expression to apply for file filtering.
  • port (integer, default: 21): The remote FTP port.
  • remotePath (string, default: the FTP user root directory): The remote FTP directory to retrieve files from.
  • sortBy (string, default: time): The field by which to sort the data.
  • sortOrder (string, default: ASC): The sort order:
      • ASC: ascending (this is the default)
      • DESC: descending
  • timeout (integer, default: 60): The timeout (in seconds) to wait for a response from FTP. The acceptable range is 10-120.

 datasource.read.mailchimp

Extracts from Mailchimp the subscription status of all the users registered under a specified mailing list. See Mailchimp Dataflow - Inbound for a full dataflow example.

Parameters:

  • apiUrl (string, default: https://<dc>.api.mailchimp.com/3.0/): Mailchimp's API URL, where <dc> indicates the data center used by your account.
  • apiKey (string): Your web service secret key.
  • listId (string): The ID of the Mailchimp list for which the subscription data will be updated.
  • maxConnections (integer, default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
  • retryIntervalSeconds (integer, default: 1): The frequency of retry attempts (in seconds).
  • status (string): The Mailchimp status of the subscription for the user for this list. Accepted values are subscribed and unsubscribed. If no status is provided, all statuses will be fetched.

 datasource.read.marketo

Reads user data from Marketo.

Parameters:

  • baseUrl (string): The Marketo base URL.
  • clientId (string): The client ID.
  • clientSecret (string): The client secret.
  • fields (string): A comma-separated list of the fields to fetch.
  • batchSize (integer, default: 300): The size of the batch to return. Accepts values between 1 and 300.
  • leadChangeField (string, default: unsubscribed): A comma-separated list of fields whose values will be inspected for changes since the last successful run.
  • timeout (integer, default: 60): The number of seconds to wait for a response from Marketo. Accepts values between 1 and 600.
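
A reader step for this component might be sketched as follows (the credential values and the field list are illustrative):

{
    "id" : "marketo-read",
    "type" : "datasource.read.marketo",
    "params" : {
        "baseUrl" : "...",
        "clientId" : "...",
        "clientSecret" : "...",
        "fields" : "email,firstName,lastName,unsubscribed"
    },
    "next" : ["..."]
}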

 datasource.read.sftp

Extract files from an SFTP server. Note that the maximum file size to read is 2 GB.

Parameters:

  • host (string): The remote SFTP host.
  • username (string): The remote SFTP username.
  • password (string): The remote SFTP password.
  • fileNameRegex (string): A regular expression to apply for file filtering.
  • passphrase (string): The private key passphrase.
  • port (integer, default: 22): The remote SFTP port.
  • privateKey (string): The base64 private key to use for SSH key authentication.
  • remotePath (string, default: the SFTP user home directory): The remote SFTP directory to fetch files from.
  • sortBy (string, default: time): The field by which to sort the data.
  • sortOrder (string, default: ASC): The sort order:
      • ASC: ascending (this is the default)
      • DESC: descending
  • timeout (integer, default: 60): The timeout (in seconds) to wait for a response from SFTP. The acceptable range is 10-120.

 datasource.read.silverpop
Parameters:

  • clientId (string): The client ID.
  • clientSecret (string): The client secret.
  • host (string): The remote SFTP host.
  • listId (string): The ID of the Silverpop list to which subscription data is written.
  • password (string): The remote SFTP password.
  • podNumber (integer): The Silverpop Engage pod number used. For example, if you access engage5.silverpop.com, 5 is the pod number.
  • refreshToken (string): The token generated by Silverpop to allow Gigya permission to access Silverpop Engage. See Silverpop.
  • username (string): The remote SFTP username.
  • columns (string): A comma-separated list of Silverpop columns (fields) to read. If none are specified, data from all columns will be read.
  • port (integer, default: 22): The remote SFTP port.
  • retryIntervalSeconds (integer, default: 5): The frequency of retry attempts (in seconds).

 datasource.write.amazon.s3

Load files to an Amazon S3 Bucket. Each file is loaded as an encrypted bucket object, and assigned a key matching the file name.

Parameters:

  • bucketName (string): The name of the Amazon S3 bucket to write to.
  • accessKey (string): The S3 access key.
  • secretKey (string): The S3 secret key.
  • objectKeyPrefix (string): The name of the folder in S3 to which the file is written, followed by a forward slash. For example, objectKeyPrefix = myFolder/ will write the file to the myFolder directory. If no such folder exists, it will be created.
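
For illustration, a writer step for this component might look like the following (all values are placeholders):

{
    "id" : "s3-write",
    "type" : "datasource.write.amazon.s3",
    "params" : {
        "bucketName" : "...",
        "accessKey" : "...",
        "secretKey" : "...",
        "objectKeyPrefix" : "myFolder/"
    }
}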

 datasource.write.azure.blob

Upload data to the Azure Blob cloud storage.

RequiredParameter NameTypeDefaultDescription

accountName

string 

The name of the Azure account.

accountKey

string 

The Azure account key.

container

string 

The name of the container in the account to which the data will be uploaded.

blobPrefix

string 

The prefix to use for the blob names. Specify a prefix such as "gigya/" to create a virtual folder hierarchy.

 datasource.write.azure.blob_token

 

Upload files to the Azure Blob cloud storage using an access token.

 

Parameters:

  • baseUri (string): A StorageUri object that represents the Blob service endpoint used to create the client.
  • accessToken (string): The access token to the Azure API.
  • container (string): The name of the container to which the data will be uploaded.

 datasource.write.azure.sas
Upload files to the Azure Blob cloud storage using shared access signature. The process involves receiving a login token and an authentication token from Azure.

 

Parameters:

  • authenticationUrl (string): The authentication URL from which to get a shared access token.
  • blobUri (string): The blob URI.
  • clientId (string): The client ID.
  • clientSecret (string): The client secret.
  • loginUrl (string): The login URL for processing Azure Active Directory authentication.

 datasource.write.bluekai

Write data to the BlueKai platform.

RequiredParameter NameTypeDefaultDescription

siteIdstring The site ID provided by BlueKai.
bkuidstring The customer's Web Service User Key.

secretKeystring The customer's Web Service Secret Key.
uniqueIdFieldstring The field used as the index for synchronization with BlueKai.

domainstring"http://api.tags.bluekai.com"The BlueKai domain for API calls. 

versionstring"v1.2"The BlueKai API version.

maxConnectionsinteger10How many connections can be opened concurrently (1-100). 

retryIntervalSecondsinteger1The interval between retries, in seconds.
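
A writer step sketch for this component (credential values are placeholders; the unique ID field is an assumption):

{
    "id" : "bluekai",
    "type" : "datasource.write.bluekai",
    "params" : {
        "siteId" : "...",
        "bkuid" : "...",
        "secretKey" : "...",
        "uniqueIdField" : "UID"
    }
}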

 datasource.write.campaignmonitor

Write subscription data to Campaign Monitor.

Parameters:

  • apiKey (string): The Campaign Monitor API key.
  • listId (string): The ID of the Campaign Monitor list from which to retrieve subscriber records.
  • newsletterField (string): The field used to indicate subscription status to this mailing list.
  • maxConnections (integer, default: 10): The maximum number of connections that can be opened concurrently. Accepts values from 1 to 100.

 datasource.write.constantcontact

Write data to Constant Contact.  For a sample dataflow, see Constant Contact Dataflow - Outbound.

Parameters:

  • apiKey (string): The Constant Contact API key. See the Constant Contact documentation.
  • accessToken (string): The Constant Contact access token. See the Constant Contact documentation.
  • listId (string): The ID of the list to which subscribers are written.
  • baseUrl (string, default: https://api.constantcontact.com/v2): The Constant Contact base URL.
  • rateLimit (integer, default: 20): The number of API calls made to Constant Contact per second. Accepts values between 1 and 20.
  • maxConnections (integer, default: 1): The maximum number of connections opened concurrently. Accepts values between 1 and 100.
  • newsletterField (string): The field used to indicate the status of the subscription.

 datasource.write.exacttarget

 Write subscriber data to Salesforce Marketing Cloud, also known as Exacttarget.

Parameters:

  • clientId (string): The Exacttarget client ID.
  • clientSecret (string): The Exacttarget client secret.
  • dataExtension (string): The name of the Exacttarget Data Extension to which subscribers are written.
  • primaryKeys (string): A comma-separated list of the primary keys (mandatory fields) in the Data Extension to which subscribers are written.
  • host (string, default: https://www.exacttargetapis.com/): The Exacttarget host.
  • maxConnections (integer, default: 1): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
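
As an illustrative sketch, a writer step might look like this (the Data Extension name and primary key are assumptions):

{
    "id" : "exacttarget",
    "type" : "datasource.write.exacttarget",
    "params" : {
        "clientId" : "...",
        "clientSecret" : "...",
        "dataExtension" : "GigyaSubscribers",
        "primaryKeys" : "EMAIL"
    }
}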

 datasource.write.ftp

Load files to an FTP server.

Parameters:

  • host (string): The remote FTP host.
  • username (string): The remote FTP username.
  • password (string): The remote FTP password.
  • port (integer, default: 21): The remote FTP port.
  • remotePath (string, default: the FTP user root directory): The remote FTP directory in which to store files.
  • encryption (string): To encrypt the data when it is transferred, set this parameter to a type of encryption. Accepted values:
      • "ssl"
      • "tls"
  • temporaryUploadExtension (Boolean, default: false): Set to "true" to append the extension ".uploading" to the file name while it is being uploaded. This extension is removed when the uploading process is finished.
  • timeout (integer, default: 60): The timeout (in seconds) to wait for a response from FTP. The acceptable range is 10-120.

 datasource.write.gigya.account

Calls the accounts.setAccountInfo method to update user accounts. This step is the final step when importing into Gigya, so any transformation to the data, including transforming a string into a Boolean or handling empty fields, should be handled in a previous step. For a full sample of importing into Gigya, see Import from SFTP to Gigya.

Parameters:

  • maxConnections (integer, default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
  • updatePolicy (string, default: append): Defines how to handle updates to existing values in Gigya's database. Valid values:
      • append
      • override
    If the policy is set to "append" and the record contains an empty field (the value is null), the field will be removed from the record before writing data to Gigya, so that any existing value is not overridden.

 datasource.write.gigya.ds

This component calls the ds.store method to store object data in Gigya's Data Store.

Parameters:

  • type (string): A string indicating the type of the object, where objects of the same type receive the same name.
  • oid (string, default: auto): The Data Store unique object identifier (OID). Possible values:
      • auto: Gigya will generate a unique OID for the object.
      • record: Get the OID from the OID field in the user record.
      • A hard-coded value to use as the OID.
  • updateBehavior (string, default: arrayPush): Defines how to handle conflicting updates. Accepted values:
      • arrayPush
      • arraySet
      • replace
  • maxConnections (integer, default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
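
A minimal writer step sketch (the object type name is illustrative):

{
    "id" : "ds-write",
    "type" : "datasource.write.gigya.ds",
    "params" : {
        "type" : "myObjectType",
        "oid" : "auto",
        "updateBehavior" : "arrayPush",
        "maxConnections" : 10
    }
}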

 datasource.write.gigya.generic

A generic component that can call any of Gigya's APIs, including those that write data to a Gigya database.

Parameters:

  • apiMethod (string): The Gigya API method to call, including the namespace. For example: ds.store.

  • addResponse (Boolean, default: false): When set to true, the response received from the API is included in the output of this step. The response is sent as a JSON object in a _response parameter. For example:

    {
        "UID" : "***",
        "profile" : {
            "nickname" : "Torres",
            "firstName" : "Tammy",
            "lastName" : "Torres",
            "gender" : "f",
            "email" : "example@email.com"
        },
        "_response" : {
              "regToken": "***..",
              "statusCode": 200,
              "errorCode": 0,
              "statusReason": "OK",
              "callId": "***",
              "time": "2015-03-22T11:42:25.943Z"
        }
    }

    This response can then be used as input for the next step.

  • apiParams (array): An array of objects that contain a 'sourceField' or a 'value' property, and a 'paramName' property:

      • paramName (string): The name of the parameter in the API being used.
      • sourceField (string): The source field from which to take the value being written. You are required to pass only one of the parameters: sourceField or value.
      • value (string): A constant value that will be written to the database. For example, if you specify "helloWorld", this is the value that will be written to this parameter. You are required to pass only one of the parameters: sourceField or value.

  • apiKey (string): The API key of the site from which to read the data. This is used in a site-to-site data transfer scenario.

  • userKey (string): The user key credentials for reading data from a Gigya site. This is used in a site-to-site data transfer scenario.

  • secret (string): The secret key credentials for reading data from a Gigya site. This is used in a site-to-site data transfer scenario.

  • maxConnections (integer, default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.

Example

 Following is an example of using the generic writer in a flow that reads data from FTP, and writes to Gigya's Data Store:

{
	"id" : "",
	"name" : "Generic Writer",
	"description" : "FTP > uncompress > jsonParse > gigyaDSWrite",
	"steps" : [{
			"id" : "ftpRead",
			"type" : "datasource.read.ftp",
			"params" : {
				"host" : "...",
				"username" : "...",
				"password" : "...",
				"remotePath" : "inbound",
				"fileNameRegex" : "import_500objects.*.gz"
			},
			"next" : ["uncompress"]
		}, {
			"id" : "uncompress",
			"type" : "file.uncompress.gzip",
			"next" : ["jsonParse"]
		}, {
			"id" : "jsonParse",
			"type" : "file.parse.json",
			"next" : ["gigyaGenericWriter"]
		}, {
			"id" : "gigyaGenericWriter",
			"type" : "datasource.write.gigya.generic",
			"params" : {
				"apiMethod" : "ds.store",
				"apiParams" : [{
						"sourceField" : "data",
						"paramName" : "data"
					}, {
						"value" : "cool_type",
						"paramName" : "type"
					}, {
						"value" : "auto",
						"paramName" : "oid"
					}, {
						"value" : "arrayPush",
						"paramName" : "updateBehavior"
					}
				]
			}
		}
	]
}

 datasource.write.hybrismarketing

Write data to the SAP Marketing Cloud platform (previously Hybris Marketing). For the full guide to setting up the implementation, see SAP Marketing Cloud.

Parameters:

  • contactIdField (string): The field that contains the unique identifier (e.g., Gigya's UID).
  • endpoint (string): The SAP Marketing Cloud authentication endpoint.
  • password (string): The SAP Marketing Cloud password.
  • username (string): The SAP Marketing Cloud user name.
  • emailField (string): The name of the Gigya field that contains the contact's email. The value is mapped to the "ContactPermissionID" field in SAP Marketing Cloud.
  • marketingAreaField (string): The name of the Gigya field that contains the marketing area information. The value is mapped to the "MarketingArea" field in SAP Marketing Cloud.
  • mobileField (string): The name of the Gigya field that contains the contact's mobile phone number. The value is mapped to the "MobileNumber" field in SAP Marketing Cloud.
  • timeout (integer): The timeout (in seconds) to wait for a response from SAP Marketing Cloud.

  • subscriptions (array of objects): An array of JSON objects that contain the subscription properties:

      • newsletterField (string): The Gigya field to use for subscription status
      • communicationCategory (string): The Marketing Cloud Communication Category ID

    To include a period in the name of the target field, surround the field name with single apostrophes. For example:

    "subscriptions":[
          {
            "newsletterField":"data.myNewsletterField",
            "communicationCategory":"41"
          }
        ]

  • consent (array of objects): An array of JSON objects that contain the consent properties:

      • consentField (string): The Gigya field to use for the consent status
      • communicationMedium (string): The Marketing Cloud communication medium. The default value is "EMAIL". Acceptable values: "EMAIL", "FACEBOOK", "GOOGLE_ADS", "YOUTUBE", "INSTAGRAM", "SMS", "PHONE".
      • communicationType (string): The Marketing Cloud communication medium type. The default value is "EMAIL". Acceptable values are "EMAIL" and "MOBILE".

    To include a period in the name of the target field, surround the field name with single apostrophes. For example:

    "consent":[
          {
            "consentField":"preferences.terms.isConsentGranted",
            "communicationMedium":"EMAIL",
            "communicationType" : "EMAIL"
          }
        ]

  • maxConnections (integer, default: 1): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.

Following is a sample dataflow that writes the user's UID, email, first name and last name from Gigya to Hybris:

{
    "name": "Hybris Marketing outbound",
    "description": "account > rename > hybris",
    "steps": [{
            "id": "account",
            "type": "datasource.read.gigya.account",
            "params": {
                "select": "UID,profile.email,profile.firstName,profile.lastName" // read these fields from the Gigya database
            },
            "next": ["rename"]
        }, {
            "id": "rename",
            "type": "field.rename", // rename the fields to match the Hybris field format
            "params": {
                "fields": [{
                        "sourceField": "profile.email",
                        "targetField": "EmailAddress"
                    }, {
                        "sourceField": "profile.firstName",
                        "targetField": "FirstName"
                    }, {
                        "sourceField": "profile.lastName",
                        "targetField": "LastName"
                    }
                ]
            },
            "next": ["hybris"]
        }, {
            "id": "hybris",
            "type": "datasource.write.hybrismarketing", // write to Hybris Marketing
            "params": {
                "username": "...",
                "password": "...",
                "endpoint": "...",
                "contactIdField": "UID" // match between Gigya's UID field and Hybris' contact ID. 
            }
        }
    ]
}

 datasource.write.mailchimp

Write data to Mailchimp.

Parameters:

  • apiUrl (string, default: https://<dc>.api.mailchimp.com/3.0/): Mailchimp's API URL, where <dc> indicates the data center used by your account.
  • apiKey (string): Your web service secret key.
  • listId (string): The list ID provided by Mailchimp.
  • newsletterField (string): The field used to indicate newsletter subscription status.
  • interestsMapping (array): An array of the Mailchimp interests associated with this user record. Includes the following properties (both are required):
      • interestId (string): the ID of the Mailchimp interest.
      • sourceField (string): the source field in Gigya whose value will be copied into this Mailchimp interest.
  • maxConnections (integer, default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
  • retryIntervalSeconds (integer, default: 1): The frequency of retry attempts (in seconds).
  • maxRetry (integer, default: 30): The maximum number of retries before the job fails.

 datasource.write.marketo

Writes user data to Marketo.

Parameters:

  • baseUrl (string): The Marketo base URL.
  • clientId (string): The client ID.
  • clientSecret (string): The client secret.
  • maxConnections (integer, default: 100): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
  • retryIntervalSeconds (integer, default: 3): The frequency of retry attempts (in seconds).
  • timeout (integer, default: 60): The number of seconds to wait for a response from Marketo. Accepts values between 1 and 600.
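
A writer step for this component might be sketched as follows (all values are placeholders):

{
    "id" : "marketo-write",
    "type" : "datasource.write.marketo",
    "params" : {
        "baseUrl" : "...",
        "clientId" : "...",
        "clientSecret" : "...",
        "maxConnections" : 100
    }
}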

 datasource.write.salesforce

Write data to Salesforce.

RequiredParameter NameTypeDefaultDescription

username

string

 

Salesforce username.

password

string

 

Salesforce password.

token

string

 

Salesforce API token.

authEndpoint

string

 

Salesforce authentication endpoint (e.g. https://login.salesforce.com/services/Soap/u/41.0).

objectType

string

 

The Salesforce object type. Possible values: account (for Salesforce's "Person Account"), contact, lead

primaryId

string

 

The field to be used as the primary ID for matching Gigya users to Salesforce person accounts/contacts/leads.

lookupId

string

 

The field to be used as an index for the “upsert” process.
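
For illustration, a writer step might look like the following sketch (the object type and the ID field choices are assumptions; credentials are placeholders):

{
    "id" : "salesforce",
    "type" : "datasource.write.salesforce",
    "params" : {
        "username" : "...",
        "password" : "...",
        "token" : "...",
        "authEndpoint" : "https://login.salesforce.com/services/Soap/u/41.0",
        "objectType" : "contact",
        "primaryId" : "UID",
        "lookupId" : "UID"
    }
}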

 datasource.write.sftp

Load files to an SFTP server.

Parameters:

  • host (string): The remote SFTP host.
  • username (string): The remote SFTP username.
  • password (string): The remote SFTP password. Provide either password or privateKey.
  • privateKey (string): The base64 private key to use for SSH key authentication. Provide either password or privateKey.
  • port (integer, default: 22): The remote SFTP port.
  • remotePath (string, default: the SFTP user home directory): The remote SFTP directory in which to store files.
  • passphrase (string): The private key passphrase.
  • temporaryUploadExtension (Boolean, default: false): Set to "true" to append the extension ".uploading" to the file name while it is being uploaded. This extension is removed when the uploading process is finished.
  • timeout (integer, default: 60): The timeout (in seconds) to wait for a response from SFTP. The acceptable range is 10-120.

 datasource.write.silverpop

 Write subscription data to Silverpop.

When an opt-out status is written to Silverpop, it may not take effect immediately. Silverpop Engage places opt-out events in a queue and processes them in the order they are received.

Parameters:

  • action (string, default: ADD_AND_UPDATE): The method of handling a user's data when the user already exists in the database. Supported values:
      • ADD_ONLY
      • UPDATE_ONLY
      • ADD_AND_UPDATE
  • clientId (string): The client ID.
  • clientSecret (string): The client secret.
  • listId (string): The ID of the Silverpop list to which subscription data is written.
  • password (string): The remote SFTP password.
  • podNumber (integer): The Silverpop Engage pod number used. For example, if you access engage5.silverpop.com, 5 is the pod number.
  • refreshToken (string): The token generated by Silverpop to allow Gigya permission to access Silverpop Engage. See Silverpop.
  • username (string): The remote SFTP username.
  • newsletterField (string): The name of the Gigya field from which to read the users' subscription status. If this field was renamed, use the name after the rename. When using this field, data is written to Silverpop's built-in status field. If newsletterField is not specified, you should specify in a previous component (e.g. field.rename) a custom field to which subscription status is written.
  • syncFields (array): In implementations where a unique identifier is not defined for the list, this parameter is used to indicate the unique ID of the rows to sync.

Sample outbound Silverpop flow:

{
 "name": "Silverpop_Dataflow_Outbound",
 "description": "account > rename > silverpop",
 "steps": [
  {
   "id": "account",
   "type": "datasource.read.gigya.account",
   "params": {
    "select": "UID,profile.email,data.subscribed"
   },
   "next": [
    "rename"
   ]
  },
  {
   "id": "rename",
   "type": "field.rename",
   "params": {
    "fields": [
     {
      "sourceField": "UID",
      "targetField": "GigyaUid"
     },
     {
      "sourceField": "profile.email",
      "targetField": "EMAIL"
     },
     {
      "sourceField": "data.subscribed",
      "targetField": "newsletterField"
     }
    ]
   },
   "next": [
    "silverpop"
   ]
  },
  {
   "id": "silverpop",
   "type": "datasource.write.silverpop",
   "params": {
    "clientId": "...",
    "clientSecret": "...",
    "refreshToken": "...",
    "podNumber": 1,
    "username": "...",
    "password": "...",
    "listId": "...",
    "newsletterField": "newsletterField"
   }
  }
 ]
}

 

Field

field components modify the dataset in various ways after it is extracted from the source platform.

 field.add

Adds fields to the record, and inserts a value to each field.

The field type is inferred automatically by the value. For example, a value of 4 infers an integer, and "4" infers a string.

Parameters:

  • fields (array): An array of the fields to add. Properties are:
      • field (string): the name of the field to add.
      • value: the value of the field.
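
For example, a step that adds a constant field to every record might be sketched as follows (the field name and value are illustrative):

{
    "id" : "add",
    "type" : "field.add",
    "params" : {
        "fields" : [{
                "field" : "data.source",
                "value" : "import"
            }
        ]
    },
    "next" : ["..."]
}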

 field.array.join

Joins the elements of an array into a string. The elements will be joined by a specified separator. The default separator is a comma (,).

Parameters:

  • field (string): The name of the field to transform.
  • separator (string, default: ,): The separator used for delimiting the array elements.

Example

Data BEFORE:

{
"categories":[
"Musician/band",
"Company"
]
}

Dataflow step:

{
"id":"...",
"type":"field.array.join",
"params":{
"field":"categories",
"separator":", "
}
}

Data AFTER:

{
"categories":"Musician/band, Company"
}

 field.array.extract

Produce multiple records from an array field - a record for each element of the array.

Parameters:

  • field (string): The field to normalize.
  • generateKey (Boolean, default: true): Whether to generate a unique identifier for each row in the output table.
  • propagationFields (array): An array of strings containing one or more fields to add to each array entry.

Example

Data BEFORE:

{
"profile":{
"email":"tammy@torres.com",
"phones":[
{
"type":"mobile",
"number":"0521111111"
},
{
"type":"work",
"number":"0542221111"
}
]
}
}

Dataflow step:

{
  "id": "...",
  "type": "field.array.extract",
  "params": {
    "field": "profile.phones",
    "propagationFields": [
      "profile.email"
    ]
  },
  "next": [ "..." ]
}

Data AFTER (one record per array element):

  • key: 0c36479f81f351afd321bab4700ca0, email: tammy@torres.com, type: mobile, number: 0521111111
  • key: 1b2e0455067c63c5c54024777661ea, email: tammy@torres.com, type: work, number: 0542221111

 field.copy

Duplicate one or more fields. For every source field, this component creates a target field with a new name, and populates it with values from the source field.

Parameters:

  • fields (array): An array containing one or more pairs of fields in the format shown below.

    [
        {
            "sourceField":"<source field name>",
            "targetField":"<target field name>"
        },
        ...
    ]

Example

Data BEFORE:

{
"profile":{
"firstName":"David"
}
}

Dataflow step:

{
  "id": "...",
  "type": "field.copy",
  "params":{
    "fields":[
      {
        "sourceField":"profile.firstName",
        "targetField":"First Name"
      }
    ]
  },
  "next": [
    "..."
  ]
}

Data AFTER:

{
"profile":{
"firstName":"David"
},
  "First Name": "David"
}

 field.date.format

Convert a field value from a given date/time format to another date/time format.

Parameters:

  • fields (array): An array containing the field(s) to change, and the source format and desired target format for each field, as follows:

    [
        {
            "field":"<field name>",
            "sourceFormat":"<source format>",
            "targetFormat":"<desired format>"
        },
        ...
    ]

    For example:

    [
        {
            "field": "LAST_CONNECTION_DATE",
            "sourceFormat": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",
            "targetFormat": "dd/MM/yyyy HH:mm"
        },
        {
            "field": "SUBSCRIPTION_DATE",
            "sourceFormat": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",
            "targetFormat": "dd/MM/yyyy"
        }
    ]

To specify the source and target formats, use DateTimeFormatter patterns.
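
As a sketch, a complete step using this component might look like the following (the step ID and field name are illustrative):

{
    "id" : "dateFormat",
    "type" : "field.date.format",
    "params" : {
        "fields" : [{
                "field" : "LAST_CONNECTION_DATE",
                "sourceFormat" : "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",
                "targetFormat" : "dd/MM/yyyy HH:mm"
            }
        ]
    },
    "next" : ["..."]
}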

 field.evaluate

Evaluate fields using JEXL (see the JEXL documentation and JEXL Syntax Overview) or jsonpath (see documentation), and assign the result to other fields.

This component is especially useful for handling various "null" values. See second example below.

Parameters:

  • fields (object array): An object array where each object includes:

      • expression (string): the JEXL or jsonpath expression.

        Note that in the expression, field names should be spelled with an underscore _ replacing the period. For example, profile.firstName becomes profile_firstName. See also the example below.

      • field (string): the name of the field in which to place the result of the expression.

    For example, the following parameter value will set the value of the profile.gender field to Male, Female or Unspecified when receiving m, f or null, accordingly.

    [
        {
           "field":"profile.gender",
           "expression":"profile_gender eq 'm' ? 'Male' : (profile_gender eq 'f' ? 'Female' : 'Unspecified')"
        }
    ]

    This is an example of adding a new hierarchical field, using JEXL:

    [
        {
            "field": "data.newsletters.textbreakingnews",
            "expression": "this"
        }
    ]

  • language (string): The language used for evaluation. Supported values: jsonpath, jexl.

Example

To extract a specific phone type, you can use the following jsonpath:

{
    "id": "...",
    "type": "field.evaluate",
    "params": {
        "fields": [
            {
                "field": "PersonHomePhone",
                "expression": "$.profile[?(@.phones[?(@.type=='LandlineNumber')])].number"
            },
            {
                "field": "PersonMobilePhone",
                "expression": "$.profile[?(@.phones[?(@.type=='MobileNumber')])].number"
            }
        ],
        "language": "jsonpath"
    },
    "next": [
        "..."
    ]
},

Example - Passing "Null"

 The following sample code demonstrates transforming null values from Gigya into "N/A" strings, which is the acceptable format in Salesforce for nulls.

 {
            "id": "clean", // Any recognizable name you choose for this step. 
            "type": "field.evaluate",
            "params": {
                "fields": [
                    {
                        "field": "Golf_Handicap_Number__pc",
                        "expression": "Golf_Handicap_Number__pc eq '' || Golf_Handicap_Number__pc eq null ? '#N/A' : Golf_Handicap_Number__pc"
                    }
                ],
                "language": "jexl"
            },
            "next": [
                "..." // Specify the next step in the dataflow
            ]
        },

You can use the field.evaluate component to unite the three fields that indicate a birth date, into one "birthday" field:

 {
            "id": "eval",
            "type": "field.evaluate",
            "params": {
                "fields": [
                    {
                        "field": "birthdate",
                        "expression": "profile.birthYear && profile.birthMonth && profile.birthDay ? profile.birthYear + '-' + (size('_' + profile.birthMonth) == 2 ? '0' : '') + profile.birthMonth + '-' + (size('_' + profile.birthDay) == 2 ? '0' : '') + profile.birthDay : ''"
                    }
                ],
                "language": "jexl"
            },
            "next": [
                "rename"
            ]
        }

 field.flatten

Move the children of a specified field or fields one level up in the hierarchy.

Parameters:

fields (string array)

A string array in which each entry is a field name.

Example

Data BEFORE

Dataflow Step 

Data AFTER

{
"profile":{
"firstName":"Tammy",
"lastName":"Torres"
}
}
{
"id":"...",
"type":"field.flatten",
"params":{
fields: ["profile"]
}

}

{
"firstName":"Tammy",
"lastName":"Torres"
}

If a selected field's children have their own children, these will not be flattened.

 field.move

Move a field or fields to another location, i.e. change the order of columns in the table.

Parameters:

fields (array): An array of strings, containing one or more fields to be moved.

after (string): The name of the field after which the moved fields will be placed (in the order in which they are listed in the fields parameter).
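
A minimal sketch of a field.move step (the step ID, field names and target position are illustrative):

{
    "id": "move", // illustrative step ID
    "type": "field.move",
    "params": {
        "fields": ["email", "UID"], // example fields to move
        "after": "lastName"         // example field to place them after
    },
    "next": ["..."]
},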

 field.remove

Remove a field or several fields from the dataset. For example, if you have retrieved a set of profile fields and renamed three of them so that they no longer follow the profile.<fieldname> format, you can remove any remaining profile fields like so:

 {
   "id": "remove",
   "type": "field.remove",
   "params": {
    "fields": [
     "profile"
    ]
   },
   "next": [
    "dsv"
   ]
  },
Parameters:

fields (array): An array of strings, containing one or more fields to be removed.

 field.rename

Rename one or more fields. The fields can be simple field types or objects.

Parameters:

fields (array)

An array of objects where each object contains a "sourceField" property and a "targetField" property.

See examples below.

To include a period in the name of the target field, surround the field name with single apostrophes. For example:

"fields":[
      {
        "sourceField":"profile.firstName",
        "targetField":"'f.Name'"
      }
    ]

Examples

Example #1: Renaming an object

Data BEFORE

Dataflow Step 

Data AFTER

{
"profile":{
"firstName":"Tammy",
"lastName":"Torres"
}
}
{
"id":"...",
"type":"field.rename",
"params":{
"fields":[
{
"sourceField":"profile",
"targetField":"My Profile"
 }
]
}
}
{
"My Profile":{
"firstName":"Tammy",
"lastName":"Torres"
}
}

Example #2: Renaming fields inside an object in a way that flattens the hierarchy

Data BEFORE

Dataflow Step 

Data AFTER

{
"profile":{
"firstName":"Tammy",
"lastName":"Torres"
}
}
{
"id":"...",
"type":"field.rename",
"params":{
"fields":[ { "sourceField":"profile.firstName", "targetField":"First Name" }, { "sourceField":"profile.lastName", "targetField":"Last Name" }
]
} }
{
"First Name":"Tammy",
"Last Name":"Torres"
}

Example #3: Renaming a nested array entry attribute such as "profile.phones.number". In this case the transformer will extract the number attribute from each phone entry inside the phones array, resulting in an array of number attributes.

Data BEFORE

Dataflow Step 

Data AFTER

{
"profile":{
"firstName":"Tammy",
"lastName":"Torres",
"gender":"f",
"phones":[
{
"type":"mobile",
"number":"0521111111"
},
{
"type":"work",
"number":"0542221111"
}
]
}
}
{
"id":"...",
"type":"field.rename",
"params":{
"fields":[ { "sourceField":"profile.phones.number", "targetField":"Phone Numbers" }
]
}
}
{
"profile":{
"firstName":"Tammy",
"lastName":"Torres",
"gender":"f",
"phones":[
{
"type":"mobile"
},
{
"type":"work"
}
]
},
"Phone Numbers":[
"0521111111",
"0542221111"
]
}

 field.replace

Find and replace strings within field values using regex.

Parameters:

fields (array)

An array of field objects in which to replace strings. Contains the following properties:

  • "field": A field name, or a wildcard "*" to find and replace strings in all the fields.
  • "regex": the regex pattern to be replaced.
  • "replacement": the new string that replaces the original string.

Example

Use the following step to replace pipe characters and "\n" line breaks with a space, and to remove all non-ASCII characters, in case the target platform requires such a format:

{  
       "id":"...",
       "type":"field.replace",
       "params":{  
          "fields":[  
             {  
                "field":"*",
                "regex":"(\\|)|(\n)",
                "replacement":" "
             },
             {  
                "field":"*",
                "regex":"[^\\x00-\\x7F]",
                "replacement":""
             }
          ]
       },
       "next":[  
          "..."
       ]
    },

 field.select

 Select specific fields to pass on to the next step of the data flow. All other fields will be removed.

Parameters:

fields (array)

An array of the fields to pass on to the next step. For example:

{
"id": "select",
"type": "field.select",
"params": {
	"fields": [
		"UID",
		"status"
		]
	},
"next": [
"next_step"
]
},
 

 

File

file components determine the format, file compression, and other properties of the output of the dataflow, before it is uploaded to the target platform.

 file.ack

Create an empty file with the same name as the input file.

Parameters:

fileExtension (string, default: ack): The extension of the created file.
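
A minimal sketch of a file.ack step, using the default "ack" extension (the step ID is illustrative):

{
    "id": "ack", // illustrative step ID
    "type": "file.ack",
    "params": {
        "fileExtension": "ack"
    },
    "next": ["..."]
},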

 file.compress.gzip

Apply GZIP compression to files.

This component has no parameters.
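
A compression step is typically chained between a file.format step and a datasource writer. A minimal sketch (the step IDs are illustrative):

{
    "id": "gzip", // illustrative step ID
    "type": "file.compress.gzip",
    "next": ["sftp"] // e.g., a datasource.write.sftp step
},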

 file.compress.lzo

Apply LZO compression to files.

Parameters:

createIndexFile (Boolean, default: false)

Whether to create an additional index file.
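
A minimal sketch with the index file enabled (the step ID is illustrative):

{
    "id": "lzo", // illustrative step ID
    "type": "file.compress.lzo",
    "params": {
        "createIndexFile": true
    },
    "next": ["..."]
},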

 file.compress.zip

Apply ZIP compression to files. 

This component has no parameters.

 file.empty

Create an empty file with a configurable name.

Parameters:

fileName (string)

The format of the name of the file created. The name can include a fixed string or placeholders, denoted by a dollar sign "$" followed by curly brackets: ${variableName}. A range of different time stamp formats are supported. Time stamp conventions (e.g., YYYY) are also supported. The possible placeholders are:

  • apiKey
  • jobID
  • now (the current time when the job was started)
  • now-xD (the current time when the job was started, minus "x" number of days)
  • now+xD (the current time when the job was started, plus "x" number of days)
  • now+xD:yyMMdd
  • unix

Examples:

  • String fileNameTemplate = "gigya_export_${jobId}_${now:yyyy}.csv"; yielding: gigya_export_de46b1bf476a42c19497725f8a0d6a5f_2016.csv
  • String fileNameTemplate = "gigya_export_${now:yyyy}_${now-5D:yyMMdd}.csv"; yielding: gigya_export_2016_160813.csv


Example

 Using the fileName parameter to set the name of the file:

{  
         "id":"empty",
         "type":"file.empty",
         "params":{  
            "fileName":"name_${now}.csv"
         }
      },

 file.decrypt.pgp

Decrypt PGP-encrypted data.

This component has no parameters. When a new dataflow is created that uses file.decrypt.pgp, the ETL service automatically generates a 2048-bit PGP public key. This public key can be viewed by calling the idx.getDataflow API method on the new dataflow.

Only PGP is officially supported; PGP-compatible formats are not supported.

 

Usage Instructions

Sample Dataflow

The following is an example of a dataflow that uses file.decrypt.pgp with an automatically generated PGP public key. The value of pgpKey.publicKey should be given to the customer so that they can encrypt their data with it.

 

{  
   "name":"Job001",
   "description":"General Inbound flow include PGP decryption",
   "steps":[  
      {  
         "id":"sftp",
         "type":"datasource.read.sftp",
         "params":{  
            "host":"localhost",
            "username":"abcd",
            "password":"1234",
            "remotePath":"inbound/general",
            "fileNameRegex":"JOB001_.*.pgp"
         },
         "next":[  
            "pgp"
         ]
      },
      {  
         "id":"pgp",
         "type":"file.decrypt.pgp",
         "params":{  
            "pgpKey.publicKey":"73lkkjjk34j4j4k3h5h5435l2542lbj452g3kj5g43h25g2k35g4h5j4g5435345kgh53j3g454gj5h43jh5g43534f2",
         },
         "next":[  
            "csv"
         ]
      },
      {  
         "id":"csv",
         "type":"file.parse.dsv",
         "params":{  
            "columnSeparator":","
         },
         "next":[  
            "rename"
         ]
      },
      {  
         "id":"rename",
         "type":"field.rename",
         "params":{  
            "fields":[  
               {  
                  "sourceField":"EVENING_ABC",
                  "targetField":"data.newslettersEvening.abc"
               },
               {  
                  "sourceField":"EVENING_CORREO",
                  "targetField":"data.newslettersEvening.correo"
               }
            ]
         },
         "next":[  
            "account"
         ]
      },
      {  
         "id":"account",
         "type":"datasource.write.gigya.account"
      }
   ]
}

 file.encrypt.pgp

Encrypt data using PGP encryption.

Only PGP is officially supported; PGP-compatible formats are not supported.

Parameters:

publicKey (string)

The Base64 encoding of the 2048-bit public key for PGP encryption.

asciiArmor (boolean, default: true)

Whether to use ASCII armor, a binary-to-text encoding that wraps the encrypted output in ASCII characters so that it can be sent in standard text-based messaging formats.

Usage Instructions
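
A minimal sketch of a file.encrypt.pgp step, assuming the customer's public key is supplied in the publicKey parameter (the key value, step IDs and next step are placeholders):

{
    "id": "pgp-encrypt", // illustrative step ID
    "type": "file.encrypt.pgp",
    "params": {
        "publicKey": "<Base64-encoded 2048-bit public key>", // placeholder
        "asciiArmor": true
    },
    "next": ["sftp"] // e.g., a datasource.write.sftp step
},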

 

 file.format.aam

Format a dataset as a text file in the format required by the AAM platform.

Parameters:

fileName (string)

The name of the file to be uploaded to AAM.

precedingKey (Boolean, default: true): Whether to include '-' before each key.

The output of this step is a text file with records in the following format: <UID><TAB>-KEY=VALUE,-KEY=VALUE,...-KEY=VALUE, for example:

 

423g711f89b147a581471ce459dc2899 -"age"="24",-"gender"="m"

 

30A3XVJciH95WE3485ygf49967ee3MY -"age"="30",-"gender"="f"
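
A minimal sketch of a file.format.aam step (the step ID and file name are illustrative):

{
    "id": "aam", // illustrative step ID
    "type": "file.format.aam",
    "params": {
        "fileName": "aam_export_${now}.txt", // illustrative file name
        "precedingKey": true
    },
    "next": ["..."]
},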


 file.format.count

Add a file containing a record count (row count) to each output file.

Parameters:

fileName (string)

The format of the name of the file created. The name can include a fixed string or placeholders, denoted by a dollar sign "$" followed by curly brackets: ${variableName}. A range of different time stamp formats are supported. Time stamp conventions (e.g., YYYY) are also supported. The possible placeholders are:

  • apiKey
  • jobID
  • now (the current time when the job was started)
  • now-xD (the current time when the job was started, minus "x" number of days)
  • now+xD (the current time when the job was started, plus "x" number of days)
  • now+xD:yyMMdd
  • unix

Examples:

  • String fileNameTemplate = "gigya_export_${jobId}_${now:yyyy}.csv"; yielding: gigya_export_de46b1bf476a42c19497725f8a0d6a5f_2016.csv
  • String fileNameTemplate = "gigya_export_${now:yyyy}_${now-5D:yyMMdd}.csv"; yielding: gigya_export_2016_160813.csv

createEmptyFile (Boolean, default: false)

Whether to create a file if the row count is 0. If set to "true", a count file will be created containing the number 0. Otherwise, no count file will be created.

Example

Using the fileName parameter to set the name of the file:

{  
         "id":"count",
         "type":"file.format.count",
         "params":{  
            "fileName":"linda_${now}.csv"
         }
      },

 file.format.dsv

Write the dataset to a plain-text file in DSV (delimiter-separated values) format.

Parameters:

columnSeparator (char, default: ","): The delimiter character, such as "," or "|". This character will be used to separate field values in the DSV file.

escapeCharacter (char, default: '"'): The character to place before a special character (such as a forward slash /) to prevent it from being interpreted.

fileName (string)

The format of the name of the file created. The name can include a fixed string or placeholders, denoted by a dollar sign "$" followed by curly brackets: ${variableName}. A range of different time stamp formats are supported. Time stamp conventions (e.g., YYYY) are also supported. The possible placeholders are:

  • apiKey
  • jobID
  • now (the current time when the job was started)
  • now-xD (the current time when the job was started, minus "x" number of days)
  • now+xD (the current time when the job was started, plus "x" number of days)
  • now+xD:yyMMdd
  • unix

Examples:

  • String fileNameTemplate = "gigya_export_${jobId}_${now:yyyy}.csv"; yielding: gigya_export_de46b1bf476a42c19497725f8a0d6a5f_2016.csv

  • String fileNameTemplate = "gigya_export_${now:yyyy}_${now-5D:yyMMdd}.csv"; yielding: gigya_export_2016_160813.csv

If you also specify the maxFileSize, any file created after the first will also include a period and serial number. For example, if the first file is called myfile_20170101.csv, the second file of the same job will be called myfile_20170101.1.csv .

maxFileSize (integer, default: 5000)

The maximum size of the file (in MB) when writing records to the target destination. This parameter enables dividing the output file into smaller parcels, for load distribution. After the first file is created with the first batch of records, the job will continue to process the next records and create a new file of roughly the same size. This cannot exceed 5000 MB.

  • The cutoff point of each file is always after a record was added to the file, so that the actual size of the output files is always slightly larger. For example, if you specified a maxFileSize of 10 MB, the output may be 10.1 MB, so you should take a buffer into account.
  • The last file in each job will usually be smaller.
  • Specifying a file size will cause a serial numbering to be added to the output files. For example, if the first file is called myfile_20170101.csv, the second file of the same job will be called myfile_20170101.1.csv .


createEmptyFile (Boolean, default: false)

Whether to create an empty file if no data was extracted or transferred.

columns (string array)

Column names are inferred by assuming all rows in the source data share the same column names, and the column names are found in the first row. If you wish to set different column names, use this parameter to pass those names as an array of strings.

quoteFields (Boolean, default: false)

Set to true to quote all the DSV entries. When set to false, fields are quoted only when needed.

quoteHeader (Boolean): Set to true to quote the header.

writeHeader (Boolean, default: true): Whether to include the DSV header row in the output file. Set to "false" to create a file with no header row.

lineEnd (string, default: \n): The line break used in the DSV file.

Example

Using the fileName parameter to set the name of the file:

   {  
         "id":"dsv",
         "type":"file.format.dsv",
         "params":{  
            "fileName":"linda_${now}.csv",
            "columnSeparator":",",
            "quoteFields":true
         }
      },

 file.format.json

 Write the dataset to a JSON file.

Parameters:

fileName (String)

The format of the name of the file created. The name can include a fixed string or placeholders, denoted by a dollar sign "$" followed by curly brackets: ${variableName}. A range of different time stamp formats are supported. Time stamp conventions (e.g., YYYY) are also supported. The possible placeholders are:

  • apiKey
  • jobID
  • now (the current time when the job was started)
  • now-xD (the current time when the job was started, minus "x" number of days)
  • now+xD (the current time when the job was started, plus "x" number of days)
  • now+xD:yyMMdd
  • unix

Examples:

  • String fileNameTemplate = "gigya_export_${jobId}_${now:yyyy}.json"; yielding: gigya_export_de46b1bf476a42c19497725f8a0d6a5f_2016.json
  • String fileNameTemplate = "gigya_export_${now:yyyy}_${now-5D:yyMMdd}.csv"; yielding: gigya_export_2016_160813.json

If you also specify the maxFileSize, any file created after the first will also include a period and serial number. For example, if the first file is called myfile_20170101.json, the second file of the same job will be called myfile_20170101.1.json .

createEmptyFile (Boolean, default: false)

Whether to create an empty file if no data was extracted or transferred.

maxFileSize (integer)

The maximum size of the file (in MB) when writing records to the target destination. This parameter enables dividing the output file into smaller parcels, for load distribution. After the first file is created with the first batch of records, the job will continue to process the next records and create a new file of roughly the same size. This cannot exceed 5000 MB.

  • The cutoff point of each file is always after a record was added to the file, so that the actual size of the output files is always slightly larger. For example, if you specified a maxFileSize of 10 MB, the output may be 10.1 MB, so you should take a buffer into account.
  • The last file in each job will usually be smaller.
  • Specifying a file size will cause a serial numbering to be added to the output files. For example, if the first file is called myfile_20170101.json, the second file of the same job will be called myfile_20170101.1.json .


wrapField (string)

Wraps the content with this value. For example, if without a wrapField parameter your output looks like this:

[
{"UID":"123"},
{"UID":"456"},
]

with the value "results" in this parameter, the output would look like this:

{"results":
[
{"UID":"123"},
{"UID":"456"},
]
}

Example

Using the fileName parameter to set the name of the file: 

{  
         "id":"json",
         "type":"file.format.json",
         "params":{  
            "fileName":"gigya_${now}.json",           
         },         
      },

 file.format.krux

Format the data as a textual file in the format required for the Krux integration.

Output Format:

  • Each row in the output file contains 2 columns, delimited by the caret (^) symbol.
    • The first column represents the GUID and must match the GUID used in the user matching process. 
    • The second column contains all the data associated with the user and should be in the following format: "<Attribute_Name_1>":"<Attribute_Value_1>","<Attribute_Name_2>":"<Attribute_Value_2>", ...
  • Each row can contain an arbitrary number of attribute value combinations.
  • It is not mandatory for values to be specified for all attributes for a given user.

For example, if there are two columns called Age (representing age group) and Gender in the client's registration database that need to be imported into the Krux platform, the following represents a valid data file that can be ingested by Krux:

User1234^"gender":"male","age":"24"
User2345^"gender":"female"
User3456^"age":"35-44"

Parameters:

fileName (string)

The format of the name of the file created. The name can include a fixed string or placeholders, denoted by a dollar sign "$" followed by curly brackets: ${variableName}. A range of different time stamp formats are supported. Time stamp conventions (e.g., YYYY) are also supported. The possible placeholders are:

  • apiKey
  • jobID
  • now (the current time when the job was started)
  • now-xD (the current time when the job was started, minus "x" number of days)
  • now+xD (the current time when the job was started, plus "x" number of days)
  • now+xD:yyMMdd
  • unix

Examples:

  • String fileNameTemplate = "gigya_export_${jobId}_${now:yyyy}.csv"; yielding: gigya_export_de46b1bf476a42c19497725f8a0d6a5f_2016.csv
  • String fileNameTemplate = "gigya_export_${now:yyyy}_${now-5D:yyMMdd}.csv"; yielding: gigya_export_2016_160813.csv

createEmptyFile (Boolean, default: false)

Whether to create an empty file if no data was extracted or transferred.

quoteFields (Boolean, default: false): When set to true, surrounds values with quotation marks.

separator (string, default: ";"): The delimiter character that separates field values in the output file.

Example

 Using the fileName parameter to set the name of the file:

{  
         "id":"krux",
         "type":"file.format.krux",
         "params":{  
            "fileName":"linda_${now}.csv",
            "separator":",",
            "quoteFields":true
         }
      },

 file.parse.dsv

Parse files in DSV format. 

Parameters:

columnSeparator (string)

The separator between each column.

inferTypes (Boolean, default: true): When set to true, an attempt will be made to parse the value in string fields to Boolean, long or double (in that order).
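
A minimal sketch of a file.parse.dsv step that parses a comma-separated file (the step ID and next step are illustrative):

{
    "id": "csv", // illustrative step ID
    "type": "file.parse.dsv",
    "params": {
        "columnSeparator": ",",
        "inferTypes": true
    },
    "next": ["..."]
},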

 file.parse.json

Parse files in JSON format.

This component has no parameters.

 file.uncompress.gzip

Extract files that were compressed using GZIP compression.

This component has no parameters.

 file.uncompress.lzo

Extract files that were compressed using LZO compression.

This component has no parameters.

 file.uncompress.zip

Extract files that were compressed using ZIP compression.

This component has no parameters.

Record

record components run on all records. These are used when creating custom components that use JavaScript. For more information, see IdentitySync Custom Scripts.

 record.evaluate

 Use this component to insert custom dataflow steps. In the "script" parameter, insert a Base64 encoding of your custom script.

Parameters:

script (string): A Base64 encoding of your custom script.
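
A minimal sketch of a record.evaluate step; the script value is a placeholder for your own Base64-encoded script:

{
    "id": "custom", // illustrative step ID
    "type": "record.evaluate",
    "params": {
        "script": "<Base64-encoded custom script>" // placeholder
    },
    "next": ["..."]
},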

   

 

 
