The integration with Google Cloud Storage using IdentitySync supports both directions of data transfer (from Google Cloud into SAP Customer Data Cloud, and from SAP Customer Data Cloud into Google Cloud). If you are not familiar with IdentitySync, the SAP Customer Data Cloud tool for data transfers, we recommend you read the documentation and familiarize yourself with the system before setting up this configuration.
Download the Service Account File
- In the Google Cloud console menu, select APIs and Services > Credentials.
- Select Create credentials > Service account key:
- From the Service accounts list, select the relevant account, or choose New service account.
- Enter the new account name.
- From the Role list, select Project > Owner (or other relevant permission).
- Click Create.
A JSON file that contains your key is downloaded to your computer. Use these credentials in the next step.
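The downloaded file is a standard Google Cloud service account key in JSON format. It looks roughly like this (all values redacted or shortened here):

```json
{
  "type": "service_account",
  "project_id": "my-project",
  "private_key_id": "abc123",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
  "client_email": "my-service-account@my-project.iam.gserviceaccount.com",
  "client_id": "1234567890",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token"
}
```

Keep this file secure; anyone holding it can authenticate as the service account.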
Set Up the Dataflow
IdentitySync is based on scheduled data flows, which consist of a series of preconfigured steps for extracting data from the source database, transforming it as needed, and loading it into the target database.
The following guide outlines the general steps for configuring an IdentitySync dataflow for an integration with Google Cloud.
To create an integration based on IdentitySync, complete the following process:
1. Create Data Flow
Open IdentitySync Data Flows in Gigya's Console. Make sure you are signed in and have selected the relevant site. The IdentitySync dashboard may also be accessed by clicking Settings in the upper menu and then IdentitySync Data Flows in the left menu.
In the dashboard, click Create Data Flow.
In the Create Data Flow window, select the data flow integration from the dropdown. If the flow you wish to create is not available in the dropdown, select any available flow: you will customize it in the next steps.
Select the data flow template: the direction of the flow, whether from or into Gigya. Note that at the bottom of this window, you can see an outline of the flow that will be created (e.g., Account > rename > dsv > gzip > sftp).
Click Continue. The IdentitySync Studio screen opens in the dashboard.
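Behind the Studio canvas, a flow such as Account > rename > dsv > gzip > sftp is represented as JSON: an ordered set of steps, each pointing to the next via its next attribute. The sketch below is a simplified illustration; the step parameters shown are examples only, and the exact parameters for each step type are documented in the Component Repository:

```json
{
  "name": "Export accounts to SFTP",
  "steps": [
    { "id": "account", "type": "datasource.read.gigya.account",
      "params": { "select": "profile.email, profile.firstName" },
      "next": ["rename"] },
    { "id": "rename", "type": "field.rename",
      "params": { "fields": [ { "sourceField": "profile.email", "targetField": "email" } ] },
      "next": ["dsv"] },
    { "id": "dsv", "type": "file.format.dsv",
      "params": { "fileName": "accounts.csv", "columnSeparator": "," },
      "next": ["gzip"] },
    { "id": "gzip", "type": "file.compress.gzip", "next": ["sftp"] },
    { "id": "sftp", "type": "datasource.write.sftp",
      "params": { "host": "sftp.example.com", "username": "user", "remotePath": "/exports" } }
  ]
}
```

Each step's id is referenced by the next attribute of the step preceding it; the final step has no next, which ends the flow.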
2. Edit the Data Flow
The data flow you created consists of the steps required for data transfer between Gigya and the selected vendor. Use the Component Repository to understand the structure and parameters required in each step.
Using IdentitySync Studio, you can:
- Specify passwords, IDs, API keys etc. required for accessing each system and customer database.
- Add the names of fields included in the data flow.
- Flatten fields, remove non-ASCII strings, specify the compression type, parse in JSON or DSV format, etc.
- Map fields and extract array data, for example using field.array.extract.
- Change the name of the data flow.
- Split a data flow, for example if you want to create two duplicate files and upload each file into a different destination. To do so, simply drag and drop the relevant step into the flow, and add connecting arrows as needed. In the code for the flow, this will be expressed in the next attribute, where you will find reference to the next two steps rather than just one. For a sample dataflow which employs this method, see the Epsilon Dataflow.
- Add Custom Scripts.
- Write failed records (records that did not complete the flow successfully) to a separate file for review.
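In the flow's source code, a split is expressed in the next attribute of the step where the flow divides: it lists two step IDs instead of one. A sketch (the step IDs here are illustrative):

```json
{
  "id": "gzip",
  "type": "file.compress.gzip",
  "next": ["write.sftp", "write.gcs"]
}
```

Each of the two listed steps then continues its own branch of the flow.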
The following are example screenshots for implementing IdentitySync flows. Your actual implementation may require different components from the ones shown.
To do so:
- If it's more convenient, you can work in full screen mode by clicking the full-screen toggle in the top right corner.
- Double-click any of the steps to add or edit its parameters. Click OK when finished.
- To add a new step, start typing its name in the Search component box. Drag the step from the list of components into the canvas.
- Drag arrows from/to the new step and from/to existing steps, to include it in the correct place in the flow. Make sure the "Success path" arrow is selected, under Connector Type.
- To add a custom step, locate the record.evaluate step in the list of components and drag it to the canvas.
- To split the data flow (for example to write to two target platforms), add the relevant step (e.g. another "write" step) and draw arrows accordingly:
- Handling failed records: You can add additional steps after a "writer" step, for writing to a separate file the records that did not complete the flow successfully. To do so:
- Add the relevant components to the flow (for example, a file.format step to write the records to a file, and a writer to write the file to the relevant destination).
- Under Connector Type, select the "Error path" connector.
- Draw a connection from the original writer (the step to which successful records are written) to the first step that handles failed records (e.g., the file.format step).
- Under Connector Type, select the "Success path" connector.
- Connect the subsequent steps that handle the failed records (e.g., the writer) using the "Success path" connector.
- Delete a step by selecting it and hitting the Delete button on your keyboard.
- If necessary, click Source to review the data flow code, and edit the code as needed.
- Click Save.
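The record.evaluate step mentioned above lets you apply a custom script to each record. The exact script contract is described in the Component Repository; the sketch below assumes the script can read and modify the current record, and is purely illustrative:

```json
{
  "id": "evaluate",
  "type": "record.evaluate",
  "params": {
    "script": "if (record.profile && record.profile.email) { record.profile.email = record.profile.email.toLowerCase(); } record;"
  },
  "next": ["dsv"]
}
```

Keep custom scripts small and side-effect free; anything heavier is better handled by dedicated transformation steps.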
Your dashboard should now look something like this:
The following actions are available:
|Action|Description|
|---|---|
|Edit|Opens the current data flow in IdentitySync Studio, where you can change any of its attributes, steps, and parameters.|
|Run Test|Runs the data flow once on 10 records for test purposes. If the test was successful, after refreshing the dashboard you will see the timestamp of the test run under Last Successful Run. Use the Status button to view the details of the run. See the Job History section on this page.|
|Duplicate|Creates a new data flow based on a flow that has already been customized; useful if you wish to create a similar flow with slight variations.|
|Status|Displays the status of the current jobs running in your IdentitySync configuration. See the Job History section on this page.|
|Delete|Deletes this data flow.|
3. Schedule the Dataflow
- Under Actions, click the Schedule icon to open the Scheduler.
- Click Create Schedule.
- Configure the schedule:
- Enter a schedule name
- Change the start time as needed
- Choose whether to run once or at scheduled intervals
- "Pull all records" should usually be selected only in the first run, when migrating records from one database to the other, and in any case should be used with caution. If the checkbox is not selected, and this is the first time this dataflow is run, records will be pulled according to the following logic:
- If the dataflow is set to run once, all records from the last 24 hours will be pulled.
- If the dataflow is recurring, records will be pulled according to the defined frequency. For example, if the dataflow is set to run once a week, the first time it is run, it will pull all records from the previous week.
- (Optional) Enter the email address(es) to be notified on success and failure of the dataflow run.
- (Optional) Limit to a specific number of records. This is usually used for test runs: when running a test from the dashboard, a one-time schedule is created which runs immediately for 10 records.
- Click Create, and, once you are back in the Schedule dashboard, click the Refresh button.
- The status of the scheduling is indicated in the Status column.
- You can stop a job mid-run by clicking the Stop icon under Actions:
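Conceptually, a schedule combines the fields described above. The following is a hypothetical sketch of such a configuration, not the actual schema used by the Scheduler (all field names here are illustrative):

```json
{
  "name": "nightly-gcs-export",
  "startTime": "2024-01-01T02:00:00Z",
  "frequency": "daily",
  "fullExtract": false,
  "successEmail": "ops@example.com",
  "failedEmail": "ops@example.com",
  "limit": 0
}
```

A full extract ("Pull all records") is typically enabled only for the initial migration run; recurring runs then pull only the records changed since the previous run.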