As organizations increasingly adopt cloud-based infrastructure, the need to seamlessly connect various cloud services becomes paramount. If you are working with IBM Planning Analytics as a Service (IBM PA SaaS) V12, you have likely encountered scenarios where accessing cloud storage providers would streamline your data workflows. Today, I will walk you through integrating Google Cloud Storage (GCS) with IBM PA SaaS using OAuth 2.0 authentication.
Cloud storage integration enables IBM PA SaaS to directly access files stored in GCS buckets, eliminating manual file transfers and enabling automated data pipelines. This integration is particularly valuable for organizations that store source data in GCS and need to feed it into their IBM Planning Analytics models.
This integration authenticates with a Google Cloud service account rather than individual user credentials. Service accounts are designed for server-to-server interactions and provide a more secure, scalable approach than personal credentials: your PA SaaS instance authenticates independently, without requiring human intervention.
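Google Cloud Storage uses OAuth 2.0 for service account authentication, which involves a two-step handshake: first, create a JWT signed with the service account's private key; second, send that JWT to Google's token endpoint in exchange for an access token.
Once you have the access token, you can use it to retrieve file lists and download data directly into IBM PA SaaS.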
For this integration, we will leverage four key IBM PA SaaS functions:
A JWT token for Google requires three components: a header, a payload, and a signing key.
The header specifies the algorithm and token type. Using IBM PA SaaS's JsonAdd function, we can construct this programmatically:
sHeader = JsonAdd('{}', 'alg', 'RS256');
sHeader = JsonAdd(sHeader, 'typ', 'JWT');
This generates:
{
  "alg": "RS256",
  "typ": "JWT"
}
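The payload contains the claims that Google needs to verify your service account: sub and iss (the service account email), aud (Google's token endpoint), iat and exp (the issue and expiry times), and scope (the access being requested).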
Note that both sub and iss use the service account email, not a user's email address. This ensures the authentication is tied to the service account rather than any individual user.
Here is how to build it:
# sub and iss both carry the service account email
sPayload = JsonAdd('{}', 'sub', StringToJson('<service account email>'));
sPayload = JsonAdd(sPayload, 'aud', StringToJson('https://oauth2.googleapis.com/token'));
sPayload = JsonAdd(sPayload, 'iss', StringToJson('<service account email>'));
# iat = 0 and exp = 3600 are resolved by JwtCreate when the token is created
sPayload = JsonAdd(sPayload, 'iat', 0);
sPayload = JsonAdd(sPayload, 'exp', 3600);
sPayload = JsonAdd(sPayload, 'scope', StringToJson('https://www.googleapis.com/auth/devstorage.read_write'));
Notice the iat and exp values? When iat is set to zero, IBM PA SaaS's JwtCreate function automatically substitutes the current timestamp. The exp value of 3600 is then added to that timestamp, giving the token a one-hour lifespan.
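To see what the header and payload turn into before signing, here is a minimal Python sketch, run outside IBM PA SaaS, of the unsigned part of the token. It assumes, as described above, that an iat of zero becomes the current time and that exp is an offset in seconds. The helper names (b64url, resolve_claims, signing_input) and the sample service account email are hypothetical, and the RS256 signature itself is deliberately left out.

```python
import base64
import json
import time


def b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, the encoding JWT segments use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def resolve_claims(email: str, lifetime_seconds: int, now: int) -> dict:
    """Build the payload with iat set to `now` and exp `lifetime_seconds` later.

    Assumption: this mirrors what the article says JwtCreate does when
    iat is 0 and exp is 3600.
    """
    return {
        "sub": email,
        "aud": "https://oauth2.googleapis.com/token",
        "iss": email,
        "iat": now,
        "exp": now + lifetime_seconds,
        "scope": "https://www.googleapis.com/auth/devstorage.read_write",
    }


def signing_input(header: dict, claims: dict) -> str:
    """Return 'b64url(header).b64url(claims)', the text an RS256 signature covers."""
    segments = (
        json.dumps(part, separators=(",", ":")).encode("utf-8")
        for part in (header, claims)
    )
    return ".".join(b64url(segment) for segment in segments)


if __name__ == "__main__":
    header = {"alg": "RS256", "typ": "JWT"}
    # Hypothetical service account email, for illustration only.
    claims = resolve_claims(
        "reader@my-project.iam.gserviceaccount.com", 3600, int(time.time())
    )
    print(signing_input(header, claims))
```

A finished JWT is this string plus a third dot-separated segment holding the signature, so decoding the first two segments of a token produced by JwtCreate is a quick way to check that the claims are what you intended.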
The JWT requires your service account's private key for signing. When you create a service account in Google Cloud, you can download its private key as a JSON file. For security, store this key in an encrypted format within a control cube, then retrieve it using CellGetS and Base64DecodeOutput when needed. This approach keeps sensitive service account credentials secure while making them accessible to your automated processes.
Figure 1: Control Cube: API Control
With all three components ready, we can generate the JWT and exchange it for an access token:
#Region --------------- Create JWT Token ---------------
sJWT = JwtCreate(sHeader, sPayload, sKey);
#EndRegion
#Region --------------- Authenticate with Google using JWT Token ---------------
sTokenURL = 'https://oauth2.googleapis.com/token';
sRequestBody = 'grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer&assertion=' | sJWT;
sHeaders = '"Content-Type":"application/x-www-form-urlencoded"';
ExecuteHttpRequest(
    'POST',
    sTokenURL,
    '-h ' | sHeaders,
    '-d ' | sRequestBody
);
sResponse = HttpResponseGetBody();
# Extract and store the access token
sToken = JsonGet(sResponse, 'access_token');
sToken = SUBST(sToken, 2, LONG(sToken) - 2);
CellPutS(sToken, '}System - API Control', 'Token', 'String');
#EndRegion
The access token is extracted from the response and stored in a control cube for subsequent requests.
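For testing the token exchange outside IBM PA SaaS, the request body and the response handling can be sketched in Python as follows. This is an illustration rather than the TI implementation: the function names are hypothetical, the sample response is made up, and no network call is made. Here json.loads plays the role of JsonGet plus the SUBST call that strips the surrounding quotes.

```python
import json
from urllib.parse import urlencode

TOKEN_URL = "https://oauth2.googleapis.com/token"
GRANT_TYPE = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def build_token_request_body(jwt: str) -> str:
    """Form-encode the two fields the token endpoint expects."""
    return urlencode({"grant_type": GRANT_TYPE, "assertion": jwt})


def extract_access_token(response_body: str) -> str:
    """Pull access_token out of the JSON response, failing loudly on errors."""
    data = json.loads(response_body)
    if "access_token" not in data:
        raise ValueError(f"token request failed: {data.get('error', 'unknown error')}")
    return data["access_token"]


if __name__ == "__main__":
    print(TOKEN_URL)
    print(build_token_request_body("header.payload.signature"))
    # Made-up response body, shaped like a successful OAuth 2.0 token response.
    sample = '{"access_token": "ya29.example", "expires_in": 3599, "token_type": "Bearer"}'
    print(extract_access_token(sample))
```

Checking for a missing access_token before storing anything is worth mirroring in the TI process, so that a failed exchange does not write an empty or malformed token to the control cube.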
Now comes the payoff: accessing your GCS bucket and downloading files.
First, retrieve the list of files in your bucket:
#Region --------------- Retrieve List of Available Files ---------------
sBucketName = '<GCS Bucket Name>';
sListFilesURL = 'https://storage.googleapis.com/storage/v1/b/' | sBucketName | '/o';
ExecuteHttpRequest('GET', sListFilesURL, '-h Authorization: Bearer ' | sToken, '-o ' | cFile);
#EndRegion
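The -o parameter tells IBM PA SaaS to write the response to a file, creating a JSON document that lists the objects in the bucket. Note that this is object metadata, not file contents, and that Google returns large listings in pages (by default up to 1,000 objects per response, with a nextPageToken for the next page).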
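Google's list endpoint returns results in pages, with a nextPageToken in the response when more objects remain. The Python sketch below, run outside IBM PA SaaS, shows how the listing URL and the paging fields fit together. The function names and the sample response are hypothetical, and no request is actually sent.

```python
import json
from typing import List, Optional, Tuple
from urllib.parse import quote, urlencode


def list_objects_url(bucket: str, page_token: Optional[str] = None) -> str:
    """Build the object-listing URL, optionally continuing from a page token."""
    url = "https://storage.googleapis.com/storage/v1/b/" + quote(bucket, safe="") + "/o"
    if page_token:
        url += "?" + urlencode({"pageToken": page_token})
    return url


def parse_listing(body: str) -> Tuple[List[str], Optional[str]]:
    """Return the object names on this page and the token for the next page, if any."""
    data = json.loads(body)
    # An empty bucket (or empty page) has no "items" key at all.
    names = [item["name"] for item in data.get("items", [])]
    return names, data.get("nextPageToken")


if __name__ == "__main__":
    # Made-up response body for illustration.
    sample = '{"items": [{"name": "sales.csv"}, {"name": "costs.csv"}], "nextPageToken": "abc"}'
    names, token = parse_listing(sample)
    print(names, token)
    print(list_objects_url("my-bucket", token))
```

In a TI process the same idea would mean repeating the list request with the pageToken query parameter until no nextPageToken comes back.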
Configure your TurboIntegrator process to read the JSON output:
DatasourceType = 'JSON';
DatasourceJsonRootPointer = '/items';
DatasourceJsonVariableMapping = JsonAdd('{}', 'vName', StringToJSON('/name'));
DatasourceJsonVariableMapping = JsonAdd(DatasourceJsonVariableMapping, 'vFile', StringToJSON('/mediaLink'));
DatasourceNameForServer = cFile;
This setup extracts file names and download URLs from the "items" array in the JSON response.
Figure 2: Data source for JSON TI Process
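If the pointer syntax is unfamiliar, this Python sketch shows how I read the configuration above: the root pointer selects the "items" array, and each variable mapping is resolved relative to one element of that array. That reading is an assumption based on the behavior described here, not a statement about TI internals, and the helper names and sample data are hypothetical.

```python
import json
from typing import Any, Dict, Iterator


def resolve_pointer(doc: Any, pointer: str) -> Any:
    """Follow a JSON Pointer (RFC 6901) such as '/items' or '/name'."""
    if pointer == "":
        return doc
    node = doc
    for raw in pointer[1:].split("/"):
        # Undo the two escape sequences JSON Pointer defines.
        token = raw.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def iter_records(body: str, root: str, mapping: Dict[str, str]) -> Iterator[Dict[str, Any]]:
    """Yield one dict of variable values per element under the root pointer."""
    for element in resolve_pointer(json.loads(body), root):
        yield {var: resolve_pointer(element, ptr) for var, ptr in mapping.items()}


if __name__ == "__main__":
    # Made-up listing with placeholder download links.
    sample = json.dumps({
        "items": [
            {"name": "sales.csv", "mediaLink": "https://example.invalid/sales"},
            {"name": "costs.csv", "mediaLink": "https://example.invalid/costs"},
        ]
    })
    mapping = {"vName": "/name", "vFile": "/mediaLink"}
    for record in iter_records(sample, "/items", mapping):
        print(record["vName"], record["vFile"])
```

Each yielded record corresponds to one pass through the Data section, with vName and vFile populated for that object.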
In the Data section of your TI process, download each file:
# Clean up mediaLink value
sURL = vFile;
IF(SUBST(sURL, 1, 1) @= '"');
    sURL = SUBST(sURL, 2, LONG(sURL) - 2);
ENDIF;
# Clean up filename value
sFileName = vName;
IF(SUBST(sFileName, 1, 1) @= '"');
    sFileName = SUBST(sFileName, 2, LONG(sFileName) - 2);
ENDIF;
# Download the file
sAuthHeader = '-h Authorization: Bearer ' | sToken;
sOutputFile = '-o Data/Import/Google/' | sFileName;
ExecuteHttpRequest('GET', sURL, sAuthHeader, sOutputFile);
Each file is downloaded directly to your specified location within IBM PA SaaS, ready for processing.
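One detail to watch when building the output path: GCS object names can contain slashes (for example, 2024/01/sales.csv), which would turn into subfolders in the target path. The Python sketch below illustrates the quote cleanup from the Data section together with one possible way of flattening such names. The function names and the underscore replacement are my own assumptions, not part of the original process, and unlike the TI code this version checks both the leading and the trailing quote.

```python
def strip_json_quotes(value: str) -> str:
    """Remove one pair of surrounding double quotes, if both are present."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def output_path(object_name: str, folder: str = "Data/Import/Google/") -> str:
    """Map an object name to a single flat file name under the import folder."""
    # Assumption: replacing '/' with '_' is acceptable for your naming scheme.
    return folder + strip_json_quotes(object_name).replace("/", "_")


if __name__ == "__main__":
    print(output_path('"2024/01/sales.csv"'))
```

Whether to flatten names or recreate the folder structure depends on how your downstream load processes locate their files.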
This integration demonstrates the power of IBM PA SaaS's HTTP and JSON capabilities. By leveraging OAuth 2.0 authentication with JWT tokens, you can create secure, automated connections to Google Cloud Storage without manual intervention.
The pattern established here can be adapted for other cloud storage providers that use similar authentication schemes, making your IBM PA SaaS environment more flexible and cloud-native.
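To implement this integration in your environment, you will need a Google Cloud service account with access to your bucket, that account's private key stored in a control cube, and the name of your GCS bucket.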
With these components in place, you can automate file retrieval from Google Cloud Storage using secure service account authentication, enabling fully automated data integration pipelines within IBM Planning Analytics as a Service.