(Read the previous trick, Rolling Sums & Averages Grouped By Key)
Informatica provides both a Web Service Consumer transformation and an HTTP transformation that can access a remote webpage and return the information provided (these days, usually in the form of JSON files). When the web service requires input parameters, Informatica passes them through with standard GET method (e.g. ““).
Unfortunately, RESTful web services do not use standard GET parameters. Instead, a RESTful web service has a URI, i.e. a regular URL followed by a URN that identifies a particular resource, and which acts as an input parameter of sorts (e.g. “”). Informatica currently has no way of assembling the URI on the fly; however, this limitation can be overcome. This will explain how.
In a recent Smartbridge project, this issue arose when our client wanted to obtain information about their business from an external independent source that offered the information through a RESTful web service. The URN had two parts: a start date and an end date (plus a few other identifiers not relevant to this post).
Due to business requirements and the nature of the data being served, it was decided that we wanted the ETL to fully refresh the last week’s worth of data daily, and perform a check for new records (but not perform updates) for the entirety of the last month. This being JSON files, we would have to feed the output from the web service into a Java transformation for processing, and therefore the Java memory stack was also a concern (but that is a blog post for another day). Bottom line, we wanted to make sure we could slice and dice the dates depending on the circumstances.
The PowerCenter Approach
Ultimately, the best approach is to use the HTTP GET transformation with a $$VariableURL already crafted for the variable values needed. You will be required to provide at least one parameter, but if you feed it a NULL value, the resulting URL will be valid. Remember that PowerCenter does not update mapping variables’ values until session end, so the $$VariableURL needs to be set to the needed value before the session in invoked – therefore, it must be a Workflow variable, or a parameter file variable. We decided on a Workflow variable, but the approach will work with a mapping parameter file generated by a previous session of the workflow.
1. Create a $$VariableURL_wf workflow variable. You can give it an initial value if you want, but it will not be used. We placed the base URL there for safekeeping.
2. In the workflow, precede each session that will use the web service with an Assignment Task that generates the correct URL for that session and assigns it to $$VariableURL_wf. As mentioned above, in our case the variability of the URI was exclusively time-dependent. If your case relies instead on values read from a data source, I strongly recommend using a parameter file instead.
The resulting variable will be something like:
3. Assign the $$VariableURL_wf to $$VariableURL_m in the session, so that it is available in the mapping
4. In the mapping(s), have the HTTP transformation use the $$VariableURL_m
By controlling the variable at the workflow level, this approach does lose much of the flexibility you would get from a regular HTTP GET transformation (where you can alter the variable at mapping runtime), but will nevertheless get the job done. It is possible that Informatica will soon patch the HTTP transformation to account for RESTful web services, but if so, it is likely at this point it will only happen in version 10.
On the other hand, giving each variable value its own mapping creates an enforced parallelism to the HTTP calls that can (and in our case definitely did) increase the throughput. It was so successful, in fact, we parallelized several other, more traditional, HTTP GET calls, and saw a significant increase in our overall workflow speed.