Oracle Integration Cloud - insights and analytics with Splunk
Sharing some interesting insights that I obtained from runtime information of integration flows in Oracle Integration Cloud - extracted into Splunk.
My approach was to extract the activity trace already available in OIC, slightly reformat the data using Python/shell scripts, ingest it into Splunk and create these visualisations. I will not go into too many technical details here, but enough to show what's possible.
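To give a flavour of the reformatting step, here is a minimal Python sketch that turns trace records into one JSON event per line, a shape Splunk can index easily. The record layout, the field names (integration, durationMs, status and so on) and the sample values are assumptions made for illustration; the real OIC activity trace may look quite different.

```python
import json
from datetime import datetime, timezone

# Hypothetical shape of extracted activity-trace records. The field names
# and values are assumptions for illustration, not the actual OIC format.
raw_records = [
    {"integration": "PRICE_SYNC", "activity": "PriceRequestService",
     "start": "2020-05-01T10:15:00Z", "durationMs": "512", "status": "OK"},
    {"integration": "PRICE_SYNC", "activity": "MapPriceResponse",
     "start": "2020-05-01T10:15:01Z", "durationMs": "37", "status": "ERROR"},
]


def to_event(record):
    """Flatten one trace record into a dict with typed, query-friendly fields."""
    started = datetime.strptime(record["start"], "%Y-%m-%dT%H:%M:%SZ")
    started = started.replace(tzinfo=timezone.utc)
    event = {
        "time": started.timestamp(),  # epoch seconds
        "integration": record["integration"],
        "activity": record["activity"],
        "executionTimeMs": int(record["durationMs"]),
        "status": record["status"],
    }
    # Only failed activities carry an errorLocation field.
    if record["status"] == "ERROR":
        event["errorLocation"] = record["activity"]
    return event


def to_json_lines(records):
    """Serialise records as newline-delimited JSON, one event per line."""
    return "\n".join(json.dumps(to_event(r), sort_keys=True) for r in records)


print(to_json_lines(raw_records))
```

Converting the duration to an integer and the timestamp to epoch seconds up front keeps the later searches simple, since no type conversion is needed at query time.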
I should add that Oracle Management Cloud (especially Log Analytics) might have something similar, but as of this writing, I am not aware of anything similar that might be pre-built specifically for Oracle Integration Cloud.
OIC's own monitoring dashboard provides pre-configured insights at a high level, and of course, one can drill down into individual runtime instances to identify problems. But my objective here is to create something more flexible and dynamic that could be extended and more importantly, could be used to analyse historical trends. This is valuable when the system is gradually improved over time.
Example 1: Top errors in a 24 hour period
errorLocation shows the activity names. Other fields (like integration name) can be added to the grouping, but I have not done so, both for simplicity and because these activity names are unique in my test-bed setup.
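The logic behind this view is a simple group-and-count over a time window. The Python sketch below illustrates it outside Splunk; the event layout and the InvokeERP activity name are hypothetical, and the group_by parameter shows how the integration name could be added to the grouping.

```python
from collections import Counter
from datetime import datetime, timedelta


def top_errors(events, now, group_by=("errorLocation",),
               window=timedelta(hours=24), limit=5):
    """Count error events inside the window, grouped by the given fields."""
    cutoff = now - window
    counts = Counter(
        tuple(e.get(field) for field in group_by)
        for e in events
        if "errorLocation" in e and cutoff <= e["time"] <= now
    )
    return counts.most_common(limit)


# Illustrative sample data (assumed shape, not real OIC output).
now = datetime(2020, 5, 2, 12, 0)
events = [
    {"time": datetime(2020, 5, 2, 9, 0), "integration": "PRICE_SYNC",
     "errorLocation": "PriceRequestService"},
    {"time": datetime(2020, 5, 2, 10, 0), "integration": "PRICE_SYNC",
     "errorLocation": "PriceRequestService"},
    {"time": datetime(2020, 5, 2, 11, 0), "integration": "ORDER_SYNC",
     "errorLocation": "InvokeERP"},
    # Older than 24 hours, so it is ignored.
    {"time": datetime(2020, 4, 30, 11, 0), "integration": "ORDER_SYNC",
     "errorLocation": "InvokeERP"},
    # Successful activity: no errorLocation, so it is ignored.
    {"time": datetime(2020, 5, 2, 11, 30), "integration": "PRICE_SYNC"},
]

print(top_errors(events, now))
print(top_errors(events, now, group_by=("integration", "errorLocation")))
```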
Example 2: Activity performance trends
This can be used to identify any unusual spikes in the execution time of an activity. Execution times are found in the OIC activity trace, but they are only available there for a short time. In the example below, the execution time most commonly seen for activity PriceRequestService is ~500ms, but at one particular time it spikes to 1700ms. Momentary spikes can go unnoticed and might not be serious if they are one-offs, but they could also point to a deeper underlying root cause.
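One simple way to flag such spikes programmatically is to compare each sample against the median for that activity. The sketch below is only an illustration: the factor of 3 is an arbitrary assumption, and the sample values merely mirror the ~500ms baseline and 1700ms spike described above.

```python
from statistics import median


def find_spikes(samples, factor=3.0):
    """Return (timestamp, ms) pairs whose time exceeds factor x the median."""
    baseline = median(ms for _, ms in samples)
    return [(ts, ms) for ts, ms in samples if ms > factor * baseline]


# Illustrative execution times for PriceRequestService (made-up samples).
samples = [
    ("10:00", 510),
    ("10:05", 495),
    ("10:10", 1700),
    ("10:15", 502),
    ("10:20", 498),
]

print(find_spikes(samples))
```

The median is used rather than the mean so that the spike itself does not pull the baseline upwards.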
The screenshot below gives another example of a spike in execution time that would be worth investigating if it occurs in a production system:
The data shown in tables can also be visualised as below. Visualisations can be customised via different filters and parameters to identify the most relevant insights.
Example 3: Top activities by average execution time
The data that we gather and index can be aggregated, sliced, diced, filtered and sorted. For example, we can generate this visualisation, which presents activities sorted by highest average execution time.
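The aggregation behind this view can be sketched in a few lines of Python. The event layout, the InvokeERP activity name and the numbers are assumptions for illustration only.

```python
from collections import defaultdict


def average_by_activity(events):
    """Return (activity, average ms) pairs, slowest activity first."""
    totals = defaultdict(lambda: [0, 0])  # activity -> [sum of ms, count]
    for e in events:
        entry = totals[e["activity"]]
        entry[0] += e["executionTimeMs"]
        entry[1] += 1
    averages = {name: total / count for name, (total, count) in totals.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Illustrative sample data (assumed shape and values).
events = [
    {"activity": "PriceRequestService", "executionTimeMs": 500},
    {"activity": "PriceRequestService", "executionTimeMs": 700},
    {"activity": "MapPriceResponse", "executionTimeMs": 40},
    {"activity": "MapPriceResponse", "executionTimeMs": 20},
    {"activity": "InvokeERP", "executionTimeMs": 900},
]

for activity, avg_ms in average_by_activity(events):
    print(f"{activity}: {avg_ms:.0f}ms")
```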
Example 4: Simple count of instances within a time interval
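The idea is to bucket instance start times into fixed intervals and count each bucket. A minimal Python sketch, assuming we already have a list of start times and that the interval divides evenly into an hour:

```python
from collections import Counter
from datetime import datetime


def bucket_start(ts, minutes):
    """Round a timestamp down to the start of its interval."""
    return ts.replace(minute=ts.minute - ts.minute % minutes,
                      second=0, microsecond=0)


def count_per_interval(start_times, minutes=60):
    """Count instances per interval; minutes must divide 60 evenly."""
    if 60 % minutes != 0:
        raise ValueError("minutes must divide 60 evenly")
    counts = Counter(bucket_start(ts, minutes) for ts in start_times)
    return sorted(counts.items())


# Illustrative instance start times (made up for this example).
start_times = [
    datetime(2020, 5, 2, 10, 5),
    datetime(2020, 5, 2, 10, 40),
    datetime(2020, 5, 2, 11, 15),
]

for bucket, count in count_per_interval(start_times):
    print(bucket.strftime("%H:%M"), count)
```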
The insights presented here are just examples. A number of interesting possibilities can emerge.
One can use these insights to automatically send alerts on certain conditions, e.g. when the execution time (or average execution time) of an activity exceeds a certain threshold, when the error count on a certain activity exceeds a threshold, or when no instance of a particular integration is found within a certain interval.
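In practice such alerts would normally be configured in Splunk itself, on top of saved searches. Purely to make the three conditions concrete, here is a Python sketch; the thresholds, the event layout and the integration names are all assumptions.

```python
from datetime import datetime, timedelta


def check_alerts(events, now, expected_integrations,
                 max_ms=1500, max_errors=2, quiet=timedelta(hours=1)):
    """Evaluate three example alert conditions and return alert messages."""
    alerts = []
    error_counts = {}
    last_seen = {}
    for e in events:
        # Condition 1: a single execution time above the threshold.
        if e["executionTimeMs"] > max_ms:
            alerts.append(f"slow: {e['activity']} took {e['executionTimeMs']}ms")
        if e["status"] == "ERROR":
            error_counts[e["activity"]] = error_counts.get(e["activity"], 0) + 1
        seen = last_seen.get(e["integration"])
        if seen is None or e["time"] > seen:
            last_seen[e["integration"]] = e["time"]
    # Condition 2: too many errors on one activity.
    for activity, count in error_counts.items():
        if count > max_errors:
            alerts.append(f"errors: {activity} failed {count} times")
    # Condition 3: an expected integration has not run recently.
    for integration in expected_integrations:
        seen = last_seen.get(integration)
        if seen is None or now - seen > quiet:
            alerts.append(f"silent: no recent instance of {integration}")
    return alerts


# Illustrative sample data (assumed shape and values).
now = datetime(2020, 5, 2, 12, 0)
recent = now - timedelta(minutes=10)
events = [
    {"time": recent, "integration": "PRICE_SYNC",
     "activity": "PriceRequestService", "executionTimeMs": 1700, "status": "OK"},
] + [
    {"time": recent, "integration": "PRICE_SYNC",
     "activity": "MapPriceResponse", "executionTimeMs": 30, "status": "ERROR"}
    for _ in range(3)
]

for message in check_alerts(events, now, ["PRICE_SYNC", "ORDER_SYNC"]):
    print(message)
```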
The example insights in this post are useful in the context of performance tuning and error correction. An important prerequisite for performance tuning is to identify what it is that we aim to improve and how we are going to measure it before and after the improvement. This is purely runtime technical data and does not deal with any business data (but it is possible to build similar or more advanced solutions on business data as well!).