Script Repository


Overview

This document lists the available scripts in the IdentitySync Script Repository. Use these scripts when defining Dataflows.

For an introduction to IdentitySync, read the IdentitySync guide.

Script Guide


Datasource

datasource scripts perform actions on a data platform – either extracting data from it or uploading data to it.

  • datasource.read scripts extract data from a platform. Every dataflow begins with one of these scripts.
  • datasource.write scripts upload data to a target platform. A write script is typically the last step in a dataflow.
  • datasource.lookup scripts extract data from a platform based on a field from another data source.


 datasource.lookup.gigya.gm

If you have implemented the Loyalty platform, use this script to retrieve the current status of a user in each of the specified challenges since the last successful execution of the dataflow. The script uses the gm.getChallengeStatus REST API. For a sample dataflow that extracts data from both Gigya accounts and Loyalty and loads it into SFTP, see Game Mechanics Dataflow.

Parameters:

  • includeChallenges (array): Comma-separated list of challenges to include in the response.
  • excludeChallenges (array): Comma-separated list of challenges to exclude from the response.
  • maxConnections (integer; default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
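For illustration, a lookup step using this script might look like the following sketch (the step id and challenge name are placeholders; use the challenges configured in your Loyalty setup):

{
   "id":"gm",
   "type":"datasource.lookup.gigya.gm",
   "params":{
      "includeChallenges":["summerChallenge"],
      "maxConnections":10
   },
   "next":["..."]
}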

 datasource.read.amazon.s3

Extract objects from Amazon's Simple Storage Service.

Parameters:

  • bucketName (string): The S3 bucket to access.
  • accessKey (string): The AWS Access Key, which identifies the user who owns the account. No two accounts can have the same AWS Access Key.
  • secretKey (string): The Secret Key that accompanies the Access Key.
  • objectKeyPrefix (string; default: the bucket root directory): The remote directory to retrieve files from.
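For illustration, a read step using this script might look like the following sketch (bucket name, credentials and prefix are placeholder values):

{
   "id":"s3",
   "type":"datasource.read.amazon.s3",
   "params":{
      "bucketName":"my-bucket",
      "accessKey":"<access key>",
      "secretKey":"<secret key>",
      "objectKeyPrefix":"exports/"
   },
   "next":["..."]
}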

 datasource.read.azure.blob

Extract data "blobs" from the Azure Blob cloud storage. Blobs are extracted by container and (optionally) filtered by blob name prefix. Only blobs that have changed since the last run of the dataflow will be extracted.

Parameters:

  • accountName (string): The name of the Azure account.
  • accountKey (string): The Azure account key.
  • container (string): The name of the container in the account from which to extract the data.
  • blobPrefix (string): If this parameter is specified, only blobs whose names begin with this prefix will be extracted.
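For illustration, a read step using this script might look like the following sketch (account, container and prefix are placeholder values):

{
   "id":"azure",
   "type":"datasource.read.azure.blob",
   "params":{
      "accountName":"<account name>",
      "accountKey":"<account key>",
      "container":"gigya-import",
      "blobPrefix":"daily/"
   },
   "next":["..."]
}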

 datasource.read.azure.sas

Read files from the Azure Blob cloud storage using a shared access signature (SAS). The process involves receiving a login token and an authentication token from Azure.

Parameters:

  • authenticationUrl (string): The authentication URL from which to receive a shared access token.
  • blobUri (string): The blob URI.
  • clientId (string): The client ID.
  • clientSecret (string): The client secret.
  • loginUrl (string): The login URL for processing Azure Active Directory authentication.
  • blobPrefix (string): If specified, only blobs whose names begin with this prefix will be extracted.

 datasource.read.gigya.account

Extract Account data from Gigya.

  • By default, the script extracts accounts that have been updated (modified) since the last time the dataflow was executed successfully. This behavior may be changed using the deltaField parameter.
  • Additional conditions may be added using the where parameter.
Parameters:

  • select (string): Comma-separated list of fields to extract. For example, enter "UID, profile.firstName" to extract the 'UID' and 'profile.firstName' fields from the Account object.
  • from (string; default: accounts): The name of the data source.
  • where (string): An SQL-like WHERE clause to filter Account objects.
  • deltaField (string; default: lastUpdatedTimestamp): The timestamp field to filter accounts by. The script will extract accounts where this timestamp is newer than the timestamp of the last successful run of the dataflow. Accepted values: lastUpdatedTimestamp, lastLoginTimestamp, oldestDataUpdatedTimestamp, createdTimestamp, verifiedTimestamp, registeredTimestamp.
  • maxConcurrency (integer; default: 1): The maximum number of threads to allocate when executing a search in parallel. Accepts values between 1 and 16. The default value of 1 is the best practice for most scenarios; unless performance is a critical consideration for the job you are running, this parameter should not be changed.

Following is a sample dataflow that demonstrates using datasource.read.gigya.account to load subscription status from Gigya into Mailchimp:

{
    "id" : "",
    "name" : "Subscription management",
    "description" : "account > rename > mailchimp",
    "steps" : [{
            "id" : "account",
            "type" : "datasource.read.gigya.account",
            "params" : {
                "select" : "subscriptions.searchTest.email.isSubscribed",
                "from" : "emailaccount"
            },
            "next" : ["rename"]
        }, {
            "id" : "rename",
            "type" : "field.rename",
            "params" : {
                "fields" : [{
                        "sourceField" : "subscriptions.searchTest.email.isSubscribed",
                        "targetField" : "newsletterField"
                    }
                ]
            },
            "next" : ["mailchimp"]
        }, {
            "id" : "mailchimp",
            "type" : "datasource.write.mailchimp",
            "params" : {
                "apiUrl" : "https://<dc>.api.mailchimp.com/3.0/",
                "apiKey" : "",
                "listId" : "",
                "newsletterField" : "newsletterField"
            }
        }
    ]
}

 datasource.read.gigya.audit

 Search Gigya's audit log and retrieve items using an SQL-like query.

Parameters:

  • select (string): A comma-separated list of fields to extract.
  • deltaField (string; default: @timestamp): The timestamp field by which the audit log query is filtered. The script will extract audit log entries for which the timestamp is newer than that of the last successful dataflow run.
  • where (string): An SQL-like WHERE clause by which to filter the objects in the audit log.
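For illustration, an audit read step might look like the following sketch (the selected field names and WHERE clause are placeholders; choose fields that exist in your audit log):

{
   "id":"audit",
   "type":"datasource.read.gigya.audit",
   "params":{
      "select":"@timestamp, endpoint, errCode",
      "where":"endpoint = 'accounts.setAccountInfo'"
   },
   "next":["..."]
}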

 datasource.read.gigya.ds

 Search for and retrieve data from Gigya's Data Store (DS) using an SQL-like query. 

Parameters:

  • select (string): Comma-separated list of fields to extract.
  • from (string): The name of the data source.
  • where (string): An SQL-like WHERE clause to filter DS objects.
  • deltaField (string; default: lastUpdatedTimestamp): The timestamp field by which to filter objects. The script will extract objects where this timestamp is newer than the timestamp of the last successful run of the dataflow. Accepted values: lastUpdatedTimestamp, lastLoginTimestamp, oldestDataUpdatedTimestamp, createdTimestamp, verifiedTimestamp, registeredTimestamp.
  • maxConcurrency (integer; default: 1): The maximum number of threads to allocate when executing a search in parallel. Accepts values between 1 and 16. The default value of 1 is the best practice for most scenarios; unless performance is a critical consideration for the job you are running, this parameter should not be changed.

 datasource.read.ftp

Extract files from an FTP server.

Parameters:

  • host (string): The remote FTP host.
  • username (string): The remote FTP username.
  • password (string): The remote FTP password.
  • encryption (string; default: "none"): Which encryption type to use when transferring the data. Accepted values: "none", "ssl", "tls".
  • fileNameRegex (string): A regular expression to apply for file filtering.
  • port (integer; default: 21): The remote FTP port.
  • remotePath (string; default: the FTP user root directory): The remote FTP directory to retrieve files from.
  • sortBy (string; default: time): The field by which to sort the data.
  • sortOrder (string; default: ASC): The sort order: ASC (ascending, the default) or DESC (descending).
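For illustration, an FTP read step might look like the following sketch (host, credentials, path and file pattern are placeholder values):

{
   "id":"ftp",
   "type":"datasource.read.ftp",
   "params":{
      "host":"ftp.example.com",
      "username":"<username>",
      "password":"<password>",
      "remotePath":"exports",
      "fileNameRegex":".*\\.csv",
      "encryption":"tls"
   },
   "next":["..."]
}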

 datasource.read.mailchimp

Extracts from Mailchimp the subscription status of all the users registered under a specified mailing list. See Mailchimp Dataflow - Inbound for a full dataflow example.

Parameters:

  • apiUrl (string; default: https://<dc>.api.mailchimp.com/3.0/): Mailchimp's API URL, where <dc> indicates the data center used by your account.
  • apiKey (string): Your web service secret key.
  • listId (string): The ID of the Mailchimp list for which the subscription data will be updated.
  • maxConnections (integer; default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
  • retryIntervalSeconds (integer; default: 1): The frequency of retry attempts (in seconds).
  • status (string): The Mailchimp subscription status of the user for this list. Accepted values are subscribed and unsubscribed. If no status is provided, all statuses will be fetched.
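For illustration, a Mailchimp read step might look like the following sketch (the API key and list ID are placeholders, and the status filter is optional):

{
   "id":"mailchimp",
   "type":"datasource.read.mailchimp",
   "params":{
      "apiUrl":"https://<dc>.api.mailchimp.com/3.0/",
      "apiKey":"<api key>",
      "listId":"<list id>",
      "status":"unsubscribed"
   },
   "next":["..."]
}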

 datasource.read.marketo

Reads user data from Marketo.

Parameters:

  • baseUrl (string): The Marketo base URL.
  • clientId (string): The client ID.
  • clientSecret (string): The client secret.
  • fields (string): A comma-separated list of the fields to fetch.
  • batchSize (integer; default: 300): The size of the batch to return. Accepts values between 1 and 300.
  • leadChangeField (string; default: unsubscribed): A comma-separated list of fields whose values will be inspected for changes since the last successful run.
  • timeout (integer; default: 60): The number of seconds to wait for a response from Marketo. Accepts values between 1 and 600.
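For illustration, a Marketo read step might look like the following sketch (the URL, credentials and field names are placeholder values):

{
   "id":"marketo",
   "type":"datasource.read.marketo",
   "params":{
      "baseUrl":"<Marketo base URL>",
      "clientId":"<client id>",
      "clientSecret":"<client secret>",
      "fields":"email,unsubscribed",
      "leadChangeField":"unsubscribed"
   },
   "next":["..."]
}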

 datasource.read.sftp

Extract files from an SFTP server.

Parameters:

  • host (string): The remote SFTP host.
  • username (string): The remote SFTP username.
  • password (string): The remote SFTP password.
  • fileNameRegex (string): A regular expression to apply for file filtering.
  • passphrase (string): The private key passphrase.
  • port (integer; default: 22): The remote SFTP port.
  • privateKey (string): The base64 private key to use for SSH key authentication.
  • remotePath (string; default: the SFTP user home directory): The remote SFTP directory to fetch files from.
  • sortBy (string; default: time): The field by which to sort the data.
  • sortOrder (string; default: ASC): The sort order: ASC (ascending, the default) or DESC (descending).

 datasource.write.amazon.s3

Load files to an Amazon S3 Bucket. Each file is loaded as an encrypted bucket object, and assigned a key matching the file name.

Parameters:

  • bucketName (string): The name of the Amazon S3 bucket to load files to.
  • accessKey (string): The S3 access key.
  • secretKey (string): The S3 secret key.
  • objectKeyPrefix (string): An optional parameter restricting the response to keys beginning with the specified prefix. Use prefixes to separate a bucket into different sets of keys, similar to how a file system organizes files into directories, using the following format: <prefix><file name>.
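For illustration, a write step to S3 might look like the following sketch (bucket name, credentials and prefix are placeholder values); as with other write scripts, it is typically the last step and has no "next" element:

{
   "id":"s3",
   "type":"datasource.write.amazon.s3",
   "params":{
      "bucketName":"my-bucket",
      "accessKey":"<access key>",
      "secretKey":"<secret key>",
      "objectKeyPrefix":"gigya/"
   }
}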

 datasource.write.azure.blob

Upload data to the Azure Blob cloud storage.

Parameters:

  • accountName (string): The name of the Azure account.
  • accountKey (string): The Azure account key.
  • container (string): The name of the container in the account to which the data will be uploaded.
  • blobPrefix (string): The prefix to use for the blob names. Specify a prefix such as "gigya/" to create a virtual folder hierarchy.

 datasource.write.azure.sas
Upload files to the Azure Blob cloud storage using a shared access signature (SAS). The process involves receiving a login token and an authentication token from Azure.

 

Parameters:

  • authenticationUrl (string): The authentication URL from which to get a shared access token.
  • blobUri (string): The blob URI.
  • clientId (string): The client ID.
  • clientSecret (string): The client secret.
  • loginUrl (string): The login URL for processing Azure Active Directory authentication.

 datasource.write.bluekai

Write data to the BlueKai platform (see BlueKai integration documentation).

Parameters:

  • siteId (string): The site ID provided by BlueKai.
  • bkuid (string): The customer's Web Service User Key.
  • secretKey (string): The customer's Web Service Secret Key.
  • uniqueIdField (string): The field used as the index for synchronization with BlueKai.
  • domain (string; default: "http://api.tags.bluekai.com"): The BlueKai domain for API calls.
  • version (string; default: "v1.2"): The BlueKai API version.
  • maxConnections (integer; default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
  • retryIntervalSeconds (integer; default: 1): The interval between retries, in seconds.
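For illustration, a BlueKai write step might look like the following sketch (the keys are placeholders, and the uniqueIdField value is only an example of a field you might use as the index):

{
   "id":"bluekai",
   "type":"datasource.write.bluekai",
   "params":{
      "siteId":"<site id>",
      "bkuid":"<web service user key>",
      "secretKey":"<web service secret key>",
      "uniqueIdField":"UID"
   }
}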

 datasource.write.exacttarget

 Write subscriber data to Salesforce Marketing Cloud, also known as Exacttarget.

Parameters:

  • clientId (string): The Exacttarget client ID.
  • clientSecret (string): The Exacttarget client secret.
  • dataExtension (string): The name of the Exacttarget Data Extension to which subscribers are written.
  • primaryKeys (string): Comma-separated list of the primary keys (mandatory fields) in the Data Extension to which subscribers are written.
  • host (string; default: https://www.exacttargetapis.com/): The Exacttarget host.
  • maxConnections (integer; default: 1): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
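For illustration, an Exacttarget write step might look like the following sketch (the credentials, Data Extension name and key field are placeholder values specific to your account):

{
   "id":"exacttarget",
   "type":"datasource.write.exacttarget",
   "params":{
      "clientId":"<client id>",
      "clientSecret":"<client secret>",
      "dataExtension":"GigyaSubscribers",
      "primaryKeys":"EmailAddress"
   }
}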

 datasource.write.ftp

Load files to an FTP server.

Parameters:

  • host (string): The remote FTP host.
  • username (string): The remote FTP username.
  • password (string): The remote FTP password.
  • port (integer; default: 21): The remote FTP port.
  • remotePath (string; default: the FTP user root directory): The remote FTP directory in which to store files.
  • encryption (string): To encrypt the data when it is transferred, set this parameter to a type of encryption. Accepted values: "ssl", "tls".
  • temporaryUploadExtension (Boolean; default: false): Set to "true" to append the extension ".uploading" to the file name while it is being uploaded. This extension will be removed when the uploading process is finished.

 datasource.write.gigya.account

Call the accounts.setAccountInfo method to update user accounts. This is typically the final step when importing into Gigya, so any transformations to the data, such as converting a string into a Boolean or handling empty fields, should be performed in a previous step. For a full sample of importing into Gigya, see Import from SFTP to Gigya.

Parameters:

  • maxConnections (integer; default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.

 datasource.write.gigya.ds

Store object data in Gigya's Data Store (DS).

Parameters:

  • type (string): A string indicating the type of the object, where objects of the same type receive the same name.
  • oid (string; default: auto): A unique object identifier. If this parameter is set to 'auto', Gigya will generate a unique ID for the object.
  • updateBehavior (string; default: arrayPush): Defines how to handle conflicting updates. Accepted values: arrayPush, arraySet, replace.
  • maxConnections (integer; default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
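For illustration, a DS write step might look like the following sketch (the object type shown is a placeholder; use a type from your own Data Store schema):

{
   "id":"ds",
   "type":"datasource.write.gigya.ds",
   "params":{
      "type":"preferences",
      "oid":"auto",
      "updateBehavior":"replace"
   }
}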

 datasource.write.mailchimp

Write data to Mailchimp.

Parameters:

  • apiUrl (string; default: https://<dc>.api.mailchimp.com/3.0/): Mailchimp's API URL, where <dc> indicates the data center used by your account.
  • apiKey (string): Your web service secret key.
  • listId (string): The list ID provided by Mailchimp.
  • newsletterField (string): The field used to indicate newsletter subscription status.
  • maxConnections (integer; default: 10): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
  • retryIntervalSeconds (integer; default: 1): The frequency of retry attempts (in seconds).
  • maxRetry (integer; default: 30): The maximum number of retries before the job fails.

 datasource.write.marketo

Writes user data to Marketo.

Parameters:

  • baseUrl (string): The Marketo base URL.
  • clientId (string): The client ID.
  • clientSecret (string): The client secret.
  • maxConnections (integer; default: 100): The maximum number of connections that can be opened concurrently. Accepts values between 1 and 100.
  • retryIntervalSeconds (integer; default: 3): The frequency of retry attempts (in seconds).
  • timeout (integer; default: 60): The number of seconds to wait for a response from Marketo. Accepts values between 1 and 600.

 datasource.write.salesforce

Write data to Salesforce.

Parameters:

  • username (string): Salesforce username.
  • password (string): Salesforce password.
  • token (string): Salesforce API token.
  • authEndpoint (string): Salesforce authentication endpoint (e.g. https://login.salesforce.com/services/Soap/u/33.0).
  • objectType (string): The Salesforce object type. Possible values: account (for Salesforce's "Person Account"), contact, lead.
  • primaryId (string): The field to be used as the primary ID for matching Gigya users to Salesforce person accounts/contacts/leads.
  • lookupId (string): The field to be used as an index for the "upsert" process.
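For illustration, a Salesforce write step might look like the following sketch (credentials are placeholders, and the primaryId/lookupId values are only examples of fields you might match on):

{
   "id":"salesforce",
   "type":"datasource.write.salesforce",
   "params":{
      "username":"<username>",
      "password":"<password>",
      "token":"<API token>",
      "authEndpoint":"https://login.salesforce.com/services/Soap/u/33.0",
      "objectType":"contact",
      "primaryId":"Email",
      "lookupId":"Email"
   }
}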

 datasource.write.sftp

Load files to an SFTP server.

Parameters:

  • host (string): The remote SFTP host.
  • username (string): The remote SFTP username.
  • password (string): The remote SFTP password. (Provide either password or privateKey.)
  • privateKey (string): The base64 private key to use for SSH key authentication. (Provide either password or privateKey.)
  • passphrase (string): The private key passphrase.
  • port (integer; default: 22): The remote SFTP port.
  • remotePath (string; default: the SFTP user home directory): The remote SFTP directory to store files in.
  • temporaryUploadExtension (Boolean; default: false): Set to "true" to append the extension ".uploading" to the file name while it is being uploaded. This extension will be removed when the uploading process is finished.

Field

field scripts modify the dataset in various ways after it is extracted from the source platform.

 field.array.concat

Turn an array field into a string by concatenating all the array elements.

Parameters:

  • field (string): The name of the field to transform.
  • separator (string; default: ","): The separator used for delimiting the array elements.

Example

Data BEFORE:

{
   "categories":[
      "Musician/band",
      "Company"
   ]
}

Dataflow step:

{
   "id":"...",
   "type":"field.array.concat",
   "params":{
      "field":"categories",
      "separator":", "
   }
}

Data AFTER:

{
   "categories":"Musician/band, Company"
}

 field.array.extract

Produce multiple records from an array field - a record for each element of the array.

Parameters:

  • field (string): The field to normalize.
  • generateKey (Boolean; default: true): Whether to generate a unique identifier for each element of the array.
  • propagationFields (array): An array of strings containing one or more fields to add to each array entry.

Example

Data BEFORE:

{
   "profile":{
      "email":"tammy@torres.com",
      "phones":[
         {
            "type":"mobile",
            "number":"0521111111"
         },
         {
            "type":"work",
            "number":"0542221111"
         }
      ]
   }
}

Dataflow step:

{
   "id":"...",
   "type":"field.array.extract",
   "params":{
      "field":"profile.phones",
      "propagationFields":[
         "profile.email"
      ]
   },
   "next":["..."]
}

Data AFTER (one record per array element):

email              | type   | number
tammy@torres.com   | mobile | 0521111111
tammy@torres.com   | work   | 0542221111

 field.copy

Duplicate one or more fields. For every source field, this script creates a target field with a new name, and populates it with values from the source field.

Parameters:

  • fields (array): An array containing one or more pairs of fields in the format shown below.

[
    {
        "sourceField":"<source field name>",
        "targetField":"<target field name>"
    },
    ...
]
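For illustration, a copy step might look like the following sketch (the source and target field names are placeholders):

{
   "id":"copy",
   "type":"field.copy",
   "params":{
      "fields":[
         {
            "sourceField":"profile.email",
            "targetField":"primaryEmail"
         }
      ]
   },
   "next":["..."]
}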

 field.date.format

Convert a field value from a given date/time format to another date/time format.

Parameters:

  • fields (array): An array containing the field(s) to change and the source format and desired target format for each field, as follows:

[
    {
        "field":"<field name>",
        "sourceFormat":"<source format>",
        "targetFormat":"<desired format>"
    },
    ...
]

For example:

[
    {
        "field": "LAST_CONNECTION_DATE",
        "sourceFormat": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",
        "targetFormat": "dd/MM/yyyy HH:mm"
    },
    {
        "field": "SUBSCRIPTION_DATE",
        "sourceFormat": "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",
        "targetFormat": "dd/MM/yyyy"
    }
]

To specify the source and target formats, use DateTimeFormatter patterns.
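For illustration, the first entry above wrapped in a complete dataflow step might look like the following sketch (the step id is a placeholder):

{
   "id":"dates",
   "type":"field.date.format",
   "params":{
      "fields":[
         {
            "field":"LAST_CONNECTION_DATE",
            "sourceFormat":"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'",
            "targetFormat":"dd/MM/yyyy HH:mm"
         }
      ]
   },
   "next":["..."]
}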

 field.evaluate

Evaluate fields using JEXL (see the JEXL documentation and JEXL Syntax Overview) or jsonpath (see documentation), and assign the result to other fields.

This script is especially useful for handling various "null" values. See second example below.

Parameters:

  • fields (object array): An object array where each object includes:

    • expression (string) - the JEXL or jsonpath expression.

      Note that in the expression, field names should be spelled with an underscore _ replacing the period. For example, profile.firstName becomes profile_firstName. See also the example below.

    • field (string) - the name of the field in which to place the result of the expression.

    For example, the following parameter value will set the value of the profile.gender field to Male, Female or Unspecified when receiving m, f or null, respectively:

[
    {
       "field":"profile.gender",
       "expression":"profile_gender eq 'm' ? 'Male' : (profile_gender eq 'f' ? 'Female' : 'Unspecified')"
    }
]

    This is an example of adding a new hierarchical field, using JEXL:

[
    {
        "field": "data.newsletters.textbreakingnews",
        "expression": "this"
    }
]

  • language (string): The language used for evaluation. Supported values: jsonpath, jexl.

Example

To extract a specific phone type, you can use the following jsonpath:

{
    "id": "...",
    "type": "field.evaluate",
    "params": {
        "fields": [
            {
                "field": "PersonHomePhone",
                "expression": "$.profile[?(@.phones[?(@.type=='LandlineNumber')])].number"
            },
            {
                "field": "PersonMobilePhone",
                "expression": "$.profile[?(@.phones[?(@.type=='MobileNumber')])].number"
            }
        ],
        "language": "jsonpath"
    },
    "next": [
        "..."
    ]
},

Example - Passing "Null"

The following sample code demonstrates transforming null values from Gigya into "#N/A" strings, which is the format Salesforce accepts for nulls.

 {
            "id": "clean", // Any recognizable name you choose for this step. 
            "type": "field.evaluate",
            "params": {
                "fields": [
                    {
                        "field": "Golf_Handicap_Number__pc",
                        "expression": "Golf_Handicap_Number__pc eq '' || Golf_Handicap_Number__pc eq null ? '#N/A' : Golf_Handicap_Number__pc"
                    }
                ],
                "language": "jexl"
            },
            "next": [
                "..." // Specify the next step in the dataflow
            ]
        },

 field.flatten

Move the children of a specified field or fields one level up in the hierarchy.

Parameters:

  • fields (string array): A string array in which each entry is a field name.

Example

Data BEFORE:

{
   "profile":{
      "firstName":"Tammy",
      "lastName":"Torres"
   }
}

Dataflow step:

{
   "id":"...",
   "type":"field.flatten",
   "params":{
      "fields":["profile"]
   }
}

Data AFTER:

{
   "firstName":"Tammy",
   "lastName":"Torres"
}

If a selected field's children have their own children, these will not be flattened.

 field.move

Move a field or fields to another location, i.e. change the order of columns in the table.

Parameters:

  • fields (array): An array of strings, containing one or more fields to be moved.
  • after (string): The name of the field after which the original fields will be placed (the fields will be placed in the order in which they are listed in the fields parameter).
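For illustration, a move step might look like the following sketch (the field names are placeholders):

{
   "id":"move",
   "type":"field.move",
   "params":{
      "fields":["profile.firstName", "profile.lastName"],
      "after":"UID"
   },
   "next":["..."]
}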

 field.remove

 Remove a field from the dataset.

Parameters:

  • fields (array): An array of strings, containing one or more fields to be removed.
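For illustration, a remove step might look like the following sketch (the field names are placeholders):

{
   "id":"remove",
   "type":"field.remove",
   "params":{
      "fields":["profile.photoURL", "lastLoginTimestamp"]
   },
   "next":["..."]
}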

 field.rename

Rename one or more fields. The fields can be simple field types or objects.

Parameters:

  • fields (array): An array of objects where each object contains a "sourceField" property and a "targetField" property. See the examples below.

Examples

Example #1: Renaming an object

Data BEFORE:

{
   "profile":{
      "firstName":"Tammy",
      "lastName":"Torres"
   }
}

Dataflow step:

{
   "id":"...",
   "type":"field.rename",
   "params":{
      "fields":[
         {
            "sourceField":"profile",
            "targetField":"My Profile"
         }
      ]
   }
}

Data AFTER:

{
   "My Profile":{
      "firstName":"Tammy",
      "lastName":"Torres"
   }
}

Example #2: Renaming fields inside an object in a way that flattens the hierarchy

Data BEFORE:

{
   "profile":{
      "firstName":"Tammy",
      "lastName":"Torres"
   }
}

Dataflow step:

{
   "id":"...",
   "type":"field.rename",
   "params":{
      "fields":[
         {
            "sourceField":"profile.firstName",
            "targetField":"First Name"
         },
         {
            "sourceField":"profile.lastName",
            "targetField":"Last Name"
         }
      ]
   }
}

Data AFTER:

{
   "First Name":"Tammy",
   "Last Name":"Torres"
}

Example #3: Renaming a nested array entry attribute such as "profile.phones.number". In this case the transformer extracts the number attribute from each phone entry inside the phones array, resulting in an array of number attributes.

Data BEFORE:

{
   "profile":{
      "firstName":"Tammy",
      "lastName":"Torres",
      "gender":"f",
      "phones":[
         {
            "type":"mobile",
            "number":"0521111111"
         },
         {
            "type":"work",
            "number":"0542221111"
         }
      ]
   }
}

Dataflow step:

{
   "id":"...",
   "type":"field.rename",
   "params":{
      "fields":[
         {
            "sourceField":"profile.phones.number",
            "targetField":"Phone Numbers"
         }
      ]
   }
}

Data AFTER:

{
   "profile":{
      "firstName":"Tammy",
      "lastName":"Torres",
      "gender":"f",
      "phones":[
         {
            "type":"mobile"
         },
         {
            "type":"work"
         }
      ]
   },
   "Phone Numbers":[
      "0521111111",
      "0542221111"
   ]
}

 field.replace

Find and replace strings within field values using regex.

Parameters:

  • fields (array): An array of field objects in which to replace strings. Each object contains the following properties:
    • "field": A field name, or a wildcard "*" to find and replace strings in all the fields.
    • "regex": The regex pattern to be replaced.
    • "replacement": The new string that replaces the original string.

Example

Execute the following step to replace the "\n" line break with a space, and to remove all non-ASCII characters, in case the target platform requires such a format:

{  
       "id":"...",
       "type":"field.replace",
       "params":{  
          "fields":[  
             {  
                "field":"*",
                "regex":"(\\|)|(\n)",
                "replacement":" "
             },
             {  
                "field":"*",
                "regex":"[^\\x00-\\x7F]",
                "replacement":""
             }
          ]
       },
       "next":[  
          "..."
       ]
    },

File

file scripts determine the format, file compression, and other properties of the output of the dataflow, before it is uploaded to the target platform.

 file.compress.gzip

Apply GZIP compression to files.

This script has no parameters.

 file.compress.lzo

Apply LZO compression to files.

Parameters:

  • createIndexFile (Boolean; default: false): Whether to create an additional index file.

 file.compress.zip

Apply ZIP compression to files. 

This script has no parameters.

 file.empty

Create an empty file with a configurable name.

Parameters:

  • fileName (string): The format of the name of the file created. The name can include a fixed string or placeholders, denoted by a dollar sign "$" followed by curly brackets: ${variableName}. A range of different time stamp formats are supported. Time stamp conventions (e.g., YYYY) are also supported. The possible placeholders are:

    • apiKey
    • jobID
    • now (the current time when the job was started)
    • now-xD (the current time when the job was started, minus "x" number of days)
    • now+xD (the current time when the job was started, plus "x" number of days)
    • now+xD:yyMMdd
    • unix

    Examples:

    • String fileNameTemplate = "gigya_export_${jobId}_${now:yyyy}.csv"; yielding: gigya_export_de46b1bf476a42c19497725f8a0d6a5f_2016.csv
    • String fileNameTemplate = "gigya_export_${now:yyyy}_${now-5D:yyMMdd}.csv"; yielding: gigya_export_2016_160813.csv

Example

 Using the fileName parameter to set the name of the file:

{  
         "id":"dsv",
         "type":"file.format.dsv",
         "params":{  
            "fileName":"linda_${now}.csv",
            "columnSeparator":",",
            "quoteFields":true
         },         
      },

 file.decrypt.pgp

Decrypt PGP-encrypted data.

This script has no parameters. When a new dataflow is created that uses file.decrypt.pgp, the ETL service automatically generates a PGP public key. This public key can be viewed by calling the idx.getDataflow API method to view the new dataflow.

Usage Instructions

Sample Dataflow

The following is an example of a dataflow that uses file.decrypt.pgp with an automatically generated PGP public key. The value of pgpKey.publicKey should be given to the customer so that they can encrypt their data with it.

 

{  
   "name":"Nestle",
   "description":"Nestle Inbound flow include PGP decryption",
   "steps":[  
      {  
         "id":"sftp",
         "type":"datasource.read.sftp",
         "params":{  
            "host":"localhost",
            "username":"abcd",
            "password":"1234",
            "remotePath":"inbound/nestle",
            "fileNameRegex":"NESTLE_.*.pgp"
         },
         "next":[  
            "pgp"
         ]
      },
      {  
         "id":"pgp",
         "type":"file.decrypt.pgp",
         "params":{  
            "pgpKey.publicKey":"73lkkjjk34j4j4k3h5h5435l2542lbj452g3kj5g43h25g2k35g4h5j4g5435345kgh53j3g454gj5h43jh5g43534f2",
         },
         "next":[  
            "csv"
         ]
      },
      {  
         "id":"csv",
         "type":"file.parse.dsv",
         "params":{  
            "columnSeparator":","
         },
         "next":[  
            "rename"
         ]
      },
      {  
         "id":"rename",
         "type":"field.rename",
         "params":{  
            "fields":[  
               {  
                  "sourceField":"EVENING_ABC",
                  "targetField":"data.newslettersEvening.abc"
               },
               {  
                  "sourceField":"EVENING_CORREO",
                  "targetField":"data.newslettersEvening.correo"
               }
            ]
         },
         "next":[  
            "account"
         ]
      },
      {  
         "id":"account",
         "type":"datasource.write.gigya.account"
      }
   ]
}

 file.encrypt.pgp

Encrypt data using PGP encryption.

Parameters:

  • publicKey (string): The base64 encoding of the public key for PGP encryption.
  • asciiArmor (Boolean; default: true): Whether to use ASCII armor, a binary-to-text encoding that encases the encrypted message in ASCII so that it can be sent in a standard messaging format.
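For illustration, an encryption step might look like the following sketch (the key value is a placeholder, and the next step id is only an example of a subsequent write step):

{
   "id":"pgp",
   "type":"file.encrypt.pgp",
   "params":{
      "publicKey":"<base64-encoded PGP public key>",
      "asciiArmor":true
   },
   "next":["sftp"]
}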


 file.format.aam

Format a dataset as a text file in the format required by the AAM platform.

The file name will be ftp_dpm_<dpid>_<currentJobStartTime>.sync

For example: ftp_dpm_31642_756325456.sync

Parameters:

  • dpid (string): The Adobe Data Provider ID.
  • precedingKey (Boolean; default: true): Whether to include '-' before each key.
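For illustration, an AAM formatting step might look like the following sketch (the dpid value repeats the example above and should be replaced with your own Data Provider ID):

{
   "id":"aam",
   "type":"file.format.aam",
   "params":{
      "dpid":"31642",
      "precedingKey":true
   },
   "next":["..."]
}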

 file.format.count

Add a file containing a record count (row count) to each output file.

Parameters:

  • fileName (string): The format of the name of the file created. The name can include a fixed string or placeholders, denoted by a dollar sign "$" followed by curly brackets: ${variableName}. A range of different time stamp formats are supported. Time stamp conventions (e.g., YYYY) are also supported. The possible placeholders are:

    • apiKey
    • jobID
    • now (the current time when the job was started)
    • now-xD (the current time when the job was started, minus "x" number of days)
    • now+xD (the current time when the job was started, plus "x" number of days)
    • now+xD:yyMMdd
    • unix

    Examples:

    • String fileNameTemplate = "gigya_export_${jobId}_${now:yyyy}.csv"; yielding: gigya_export_de46b1bf476a42c19497725f8a0d6a5f_2016.csv
    • String fileNameTemplate = "gigya_export_${now:yyyy}_${now-5D:yyMMdd}.csv"; yielding: gigya_export_2016_160813.csv

  • createEmptyFile (Boolean; default: false): Whether to create a file if the row count is 0. If set to "true", a count file will be created containing the number 0. Otherwise, no count file will be created.

Example

Using the fileName parameter to set the name of the file:

{  
         "id":"dsv",
         "type":"file.format.dsv",
         "params":{  
            "fileName":"linda_${now}.csv",
            "columnSeparator":",",
            "quoteFields":true
         },         
      },

 file.format.dsv

Format a dataset as a plain-text file in DSV (delimiter-separated values) format.

Parameters:

  • columnSeparator (char): The delimiter character, such as "," or "|". This character will be used to separate field values in the DSV file.
  • fileName (string): The format of the name of the file created. The name can include a fixed string or placeholders, denoted by a dollar sign "$" followed by curly brackets: ${variableName}. A range of different time stamp formats are supported. Time stamp conventions (e.g., YYYY) are also supported. The possible placeholders are:

    • apiKey
    • jobID
    • now (the current time when the job was started)
    • now-xD (the current time when the job was started, minus "x" number of days)
    • now+xD (the current time when the job was started, plus "x" number of days)
    • now+xD:yyMMdd
    • unix

    Examples:

    • String fileNameTemplate = "gigya_export_${jobId}_${now:yyyy}.csv"; yielding: gigya_export_de46b1bf476a42c19497725f8a0d6a5f_2016.csv
    • String fileNameTemplate = "gigya_export_${now:yyyy}_${now-5D:yyMMdd}.csv"; yielding: gigya_export_2016_160813.csv

  • createEmptyFile (Boolean; default: false): Whether to create an empty file if no data was extracted or transferred.
  • columns (string array): By default, the system assumes all rows in the source data share the same column names, and the column names are found in the first row. If not all the rows have the same column names, supply one of these parameters:

    • Option #1: Set columns to a column name array to specify the column names to look up (column order will match the supplied array).
    • Option #2: Set inferColumns to true to tell the loader to auto-infer the column names and add new columns to the right.

  • inferColumns (Boolean): See the columns parameter above.
  • quoteFields (Boolean; default: false): Set to true to quote all the DSV entries. The default is false, in which case fields are quoted only when needed.
  • writeHeader (Boolean; default: true): Whether to include the DSV header row in the output file. Set to "false" to create a file with no header row.
  • lineEnd (string; default: \n): The line break used in the DSV file.

Example

Using the fileName parameter to set the name of the file:

   {  
         "id":"dsv",
         "type":"file.format.dsv",
         "params":{  
            "fileName":"linda_${now}.csv",
            "columnSeparator":",",
            "quoteFields":true
         },         
      },

 file.format.json

 Format a dataset as a JSON object.

Parameters:

  • fileName (string): The format of the name of the file created. The name can include a fixed string or placeholders, denoted by a dollar sign "$" followed by curly brackets: ${variableName}. A range of different time stamp formats are supported. Time stamp conventions (e.g., YYYY) are also supported. The possible placeholders are:

    • apiKey
    • jobID
    • now (the current time when the job was started)
    • now-xD (the current time when the job was started, minus "x" number of days)
    • now+xD (the current time when the job was started, plus "x" number of days)
    • now+xD:yyMMdd
    • unix

    Examples:

    • String fileNameTemplate = "gigya_export_${jobId}_${now:yyyy}.csv"; yielding: gigya_export_de46b1bf476a42c19497725f8a0d6a5f_2016.csv
    • String fileNameTemplate = "gigya_export_${now:yyyy}_${now-5D:yyMMdd}.csv"; yielding: gigya_export_2016_160813.csv

  • createEmptyFile (Boolean; default: false): Whether to create an empty file if no data was extracted or transferred.

Example

Using the fileName parameter to set the name of the file: 

{  
         "id":"json",
         "type":"file.format.json",
         "params":{  
            "fileName":"gigya_${now}.json",           
         },         
      },

 file.format.krux

Format the data as a textual file in the format required for the Krux integration.

Output Format:

  • Each row in the output file contains 2 columns, delimited by the caret (^) symbol.
    • The first column represents the GUID and must match the GUID used in the user matching process. 
    • The second column contains all the data associated with the user and should be in the following format: "<Attribute_Name_1>":"<Attribute_Value_1>","<Attribute_Name_2>":"<Attribute_Value_2>", ...
  • Each row can contain an arbitrary number of attribute value combinations.
  • It is not mandatory for values to be specified for all attributes for a given user.

For example, if there are two columns called Age (representing age group) and Gender in the client's registration database, that need to be imported into the Krux platform, then the following represents a valid data file that can be ingested by Krux:

User1234^"gender":"male","age":"24"
User2345^"gender":"female"
User3456^"age":"35-44"

Parameters:

  • fileName (string): The format of the name of the file created. The name can include a fixed string or placeholders, denoted by a dollar sign "$" followed by curly brackets: ${variableName}. A range of different time stamp formats are supported. Time stamp conventions (e.g., YYYY) are also supported. The possible placeholders are:

    • apiKey
    • jobID
    • now (the current time when the job was started)
    • now-xD (the current time when the job was started, minus "x" number of days)
    • now+xD (the current time when the job was started, plus "x" number of days)
    • now+xD:yyMMdd
    • unix

    Examples:

    • String fileNameTemplate = "gigya_export_${jobId}_${now:yyyy}.csv"; yielding: gigya_export_de46b1bf476a42c19497725f8a0d6a5f_2016.csv
    • String fileNameTemplate = "gigya_export_${now:yyyy}_${now-5D:yyMMdd}.csv"; yielding: gigya_export_2016_160813.csv

  • createEmptyFile (Boolean; default: false): Whether to create an empty file if no data was extracted or transferred.

Example

 Using the fileName parameter to set the name of the file:

{  
         "id":"dsv",
         "type":"file.format.dsv",
         "params":{  
            "fileName":"linda_${now}.csv",
            "columnSeparator":",",
            "quoteFields":true
         },         
      },

 file.parse.dsv

Parse files in DSV format. 

Parameters:

  • columnSeparator (string): The separator between each column.
  • inferTypes (Boolean; default: true): When set to true, an attempt will be made to parse the value in string fields to Boolean, long or double (in that order).

 file.parse.json

Parse files in JSON format.

This script has no parameters.

 file.uncompress.gzip

Extracts files that were compressed using GZIP compression.

This script has no parameters.

 file.uncompress.lzo

Extracts files that were compressed using LZO compression.

This script has no parameters.

 file.uncompress.zip

Extracts files that were compressed using ZIP compression. 

This script has no parameters.

 
