Re: RTDM integration with IBM MQ client

sarunast · Posted 01-24-2017 03:54 PM

Dear SAS Experts,

I need advice on the best practice for integrating an IBM MQ client with SAS RTDM. Customer activities on the e-channels will be collected by Celebrus software, and when an event happens it will trigger Celebrus to send a message to RTDM through the IBM WebSphere MQ service. On the back end the message will end up in the IBM MQ client. What is the best practice for reading these messages and transforming them into REST/JSON calls to RTDM, and then for transforming the response from RTDM into a message for the MQ client? Many thanks.

Dmitry_Alergant · Posted 01-25-2017 06:26 AM

Hi,

There is no native out-of-the-box support for IBM MQ with SAS RTDM itself.

To implement such an integration, an integration adapter needs to be developed as a separate piece of software that sits between the two interfaces and joins them.

This adapter can be implemented in a number of ways, including but not limited to:

WebSphere ESB service (or a service on any other ESB, but I'm assuming WebSphere since your message transport is theirs)
Custom-developed Java or Python (or any other general-purpose programming language) application, in a standalone or web application deployment model.
SAS DI batch job (or essentially a SAS Base script), only for situations where small batch processing is needed, not real-time.
A SAS Event Stream Processing streaming model, if you happen to have it licensed (which is rare), and if you are implementing the latest version of RTDM 6.5

A choice between these technologies requires an in-depth conversation with respect to your requirements, IT landscape, IT strategy / attitude, available licensing, etc. No single silver bullet exists for all situations.

It is actually very common for the environment to dictate that such an adapter be implemented. Depending on the environment and requirements, an adapter is sometimes extended with additional functionality such as detailed request/response diagnostic logging, data enrichment, outbound cross-channel messaging, etc.

So while the general direction is clear, I wouldn't say there are any short "best practices" to share at this level of detail. It just needs to be carefully designed, developed, tested and deployed. My team has successfully done it numerous times in the past for different customers, using every technical approach outlined above.

Best regards, Dmitriy.

-------
Dmitriy Alergant, Tier One Analytics

sarunast · Posted 01-26-2017 09:30 AM

Hi Dmitriy,

Thanks a lot for your comprehensive answer. I was actually thinking about mediating between IBM MQ and SAS RTDM using the SAS interface to MQ and REST (through PROC HTTP). My idea was to create a Windows service that is really a SAS session running a program with an unlimited while loop. On the one hand, it "listens" to the MQ client as a subscriber and calls the REST API when a message appears in the queue. On the other hand, it receives the responses and transfers them back to the queue as a publisher. But if I understand you correctly, in your experience this would not be an efficient solution?

Many thanks again.

Dmitry_Alergant · Posted 01-27-2017 01:46 AM

Hi,

First, I'm not sure why you decided to go the REST API way. An alternative is the SOAP API, which may be easier to work with (in my view). We use SOAP more often than REST. I think the SOAP API was implemented in RTDM much earlier than REST, so it feels more "native" and risk-free. But that's just my personal opinion. Both APIs are supported and are supposed to work.

Second, you'll need to decide on synchronous vs asynchronous query modes (which are supported by both REST and SOAP APIs).

A synchronous mode will be much easier to implement in your script, and I'm assuming this is what you meant. But it comes with a catch: with only one SAS loop script running, you will process only one request at a time in SAS RTDM. So unless the request workload is extremely low (which can't be the case for "customer activities in e-channels"), you will quickly start lagging behind, with the queue growing and requests not being processed in a timely manner.

You can probably run a pool of several dozen SAS loop script sessions in parallel to overcome this problem, but it will soon become quite a complicated solution. You'll have to maintain and monitor a parallel session pool, restart failed sessions, power-cycle sessions after a certain number of requests for the sake of hygiene, make sure your logs don't conflict, and implement log management and cleanup (there will be huge log volumes). If you want to support a graceful shutdown or restart of a session that doesn't lose any events, you will have to develop some fancy logic to do that, as simply killing the SAS process will likely catch it in the middle of processing and lose an event.

You will want to deploy all that on at least two SAS servers if you need high availability, and you may not have a compatible SAS license to do that!

Then I'd think of more advanced challenges. Depending on your use case and business logic requirements (so not always), you may need to maintain strict in-order processing of events that belong to the same customer e-channel session. This means not starting to process a new request until the previous request of the same customer session has been processed and has returned a response. If this is the case, you won't be able to implement it with the looped SAS Base scripts approach.
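To illustrate the in-order requirement, here is a minimal Python sketch of one common way to get it: events are routed to a fixed set of single-threaded "lanes" by a stable hash of the session ID, so different sessions run in parallel while each session is processed strictly in arrival order. The names run_ordered, handle and lanes are made up for this example, and handle is only a stand-in for the real blocking call to RTDM.

```python
import queue
import threading
import zlib


def run_ordered(events, handle, lanes=4):
    """Process (session_id, payload) events in parallel across lanes while
    keeping the events of any one session in arrival order."""
    lane_queues = [queue.Queue() for _ in range(lanes)]
    results = []
    results_lock = threading.Lock()

    def worker(q):
        while True:
            item = q.get()
            if item is None:  # sentinel: no more events for this lane
                break
            session_id, payload = item
            # Hypothetical blocking call; in a real adapter this would be
            # the request to RTDM and the wait for its response.
            response = handle(session_id, payload)
            with results_lock:
                results.append((session_id, response))

    threads = [threading.Thread(target=worker, args=(q,)) for q in lane_queues]
    for t in threads:
        t.start()
    for session_id, payload in events:
        # Stable hash: the same session always maps to the same lane.
        lane = zlib.crc32(session_id.encode("utf-8")) % lanes
        lane_queues[lane].put((session_id, payload))
    for q in lane_queues:
        q.put(None)
    for t in threads:
        t.join()
    return results
```

Because each lane has exactly one worker, a session's next event cannot start before its previous one has returned. The price is that one slow session delays the other sessions that hash to the same lane.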

Considering all of the above, I would recommend against using SAS scripts to mediate real-time request processing in the marketing domain between an e-channel and SAS RTDM. I suggest you implement this mediation logic in a general-purpose programming language (like Java or Python), or on an Enterprise Service Bus if you have one.
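As a rough idea of what such mediation logic looks like in a general-purpose language, here is a minimal Python sketch of the adapter's core loop with a pool of worker threads. Everything in it is an assumption made for illustration: the two queue.Queue objects stand in for the MQ request and reply queues, and call_rtdm is a hypothetical placeholder for the actual SOAP or REST call to RTDM. No real MQ or RTDM client API is shown.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def call_rtdm(message):
    """Hypothetical placeholder for the HTTP (SOAP or REST) call to RTDM."""
    return {"request": message, "decision": "placeholder"}


def run_adapter(request_queue, reply_queue, decide=call_rtdm, workers=8):
    """Read messages until a None sentinel, call the decision service
    concurrently, and put each response on reply_queue."""

    def process(message):
        try:
            reply_queue.put(decide(message))
        except Exception as exc:
            # Keep the failure visible instead of silently dropping the event.
            reply_queue.put({"request": message, "error": str(exc)})

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            message = request_queue.get()
            if message is None:  # sentinel: stop accepting new work
                break
            pool.submit(process, message)
    # Leaving the with-block waits for in-flight requests to finish,
    # which gives a simple form of graceful shutdown.
```

The workers are reused threads inside one process, so there is no per-request startup cost. Note that this sketch takes messages off the request queue as fast as they arrive and parks them in the pool's internal backlog; a production adapter would also need to bound that backlog and handle redelivery after failures.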

Good luck!

-------
Dmitriy Alergant, Tier One Analytics

sarunast · Posted 01-27-2017 06:16 PM

Hi Dmitriy,

Thanks a lot again for your comprehensive answer. I am sorry I confused you a bit by not writing the whole story. The whole story is quite similar, but also a bit different. I was thinking of a single Windows service that is a SAS Base code unlimited loop constantly querying the MQ client for new messages. Whenever it encounters a message, it asynchronously starts a new thread or a new SAS session with MP CONNECT, passing parameters/values to that session and leaving it to take care of the rest: calling the RTDM SOAP or REST APIs, transferring a message back to the MQ client as a publisher, creating an entry in the log file, and terminating itself when everything is finished. The main loop can create as many concurrent SAS threads or sessions as it needs. I think a SAS session would be too slow to start, so I was thinking about threads. Do you think this approach could face challenges similar to those you described previously?

Thanks a lot

Cheers

Sarunas.

Dmitry_Alergant · Posted 01-30-2017 09:47 AM

Hi Sarunas,

While your approach addresses some of the concerns that I mentioned, it creates a number of others. With your approach, you should be aware of the following risks and complications. Their severity depends on your real-time SLA requirements (maximum permissible delay for processing a single query) and workload (maximum number of requests per second).

Resource consumption overhead. Starting a new SAS process is a relatively heavy operation, especially if you will be connecting to any database within the script (and you will likely want to, at least for logging purposes). Staying within the SAS Base ecosystem, it will be very hard for you to implement a "thread pool" approach with session reuse.

Both CPU and IO resources will come under pressure. SAS performs many logical IO operations while a session starts up (loading binaries, accessing config files, etc.). Normally these operations are cached by the OS and don't reach actual storage, but under high load you never know how a given OS will behave, or when and why it may get stuck. Especially since you are running Windows.
Processing delay. You will waste precious milliseconds spawning SAS sessions; the delay may not be small, and it is not always predictable. Have you seen (unhealthy) environments where starting a new SAS session sometimes takes 1 second, 5 seconds, or 15 seconds for unknown reasons (most often an IO shortage or inefficiency of some sort)? I have seen it quite a lot.
Lack of rate control. How do you see yourself implementing rate control?

If too many requests come your way through MQ, you naturally want to limit how many requests are processed in RTDM at the same time. You may want to allow only up to X requests to be processed concurrently (depending on the sizing of your runtime environment), leaving the others in the queue until some current requests finish and free up a slot.

With SAS scripts, especially under Windows, it is not that straightforward to implement such logic.
Maximum throughput. Your "unlimited loop" master SAS service that listens to the queue and spawns MP CONNECT workers is likely to become a bottleneck, as it is single-threaded. Though I have never tested it explicitly, I can imagine it processing maybe 10 requests per second (even this needs testing), but I can't imagine it processing 100 per second, which would leave only 10 ms per request.
Other concerns from my previous message are still valid, including high availability (you may not have sufficient SAS licenses to run SAS Base scripts on two machines), logging clutter, etc.
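For comparison, the rate control described above (at most X requests in flight, the rest waiting in the queue) takes only a few lines in a general-purpose language. The following is a minimal Python sketch under stated assumptions: queue.Queue stands in for the MQ queue, handle is a hypothetical function that calls RTDM, and run_rate_limited and max_concurrent are names invented for this example.

```python
import queue
import threading


def run_rate_limited(request_queue, handle, max_concurrent=4):
    """Consume messages until a None sentinel, never running more than
    max_concurrent handlers at once. A message is taken off the queue
    only after a processing slot is free."""
    slots = threading.BoundedSemaphore(max_concurrent)
    threads = []

    def worker(message):
        try:
            handle(message)  # hypothetical blocking call to RTDM
        finally:
            slots.release()  # free the slot even if the handler fails

    while True:
        slots.acquire()                # wait for a free slot first
        message = request_queue.get()  # only then take the next message
        if message is None:            # sentinel: stop consuming
            slots.release()
            break
        t = threading.Thread(target=worker, args=(message,))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
```

Acquiring the slot before reading means that excess requests stay in the queue rather than piling up inside the adapter. Starting one thread per message keeps the sketch short; a real adapter would more likely reuse a fixed pool of workers.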

I stand by my original point: while SAS scripts allow you to implement this integration in principle, this is not an advisable solution for any production workload beyond an extremely minimal one.

Best regards, Dmitriy.

-------
Dmitriy Alergant, Tier One Analytics
