03-13-2014 07:12 AM
I hope someone can advise me on this.
We have a number of SAS flows which are triggered by file drops. However, if the files do not arrive (for any reason), the flows continue to run. As the flows are daily, there is the potential for the same flow to run more than once at the same time, which is dangerous for our data.
Does anyone know how we can schedule flows and incorporate some kind of timeout that would result in the flow ending after a specific time period in the event of the trigger files not arriving?
03-13-2014 07:46 PM
SAS supplies a tool - LSF - to cover situations like this.
If you are restricted to base SAS and can use a server scheduler (e.g. Windows Server), you need to write a short control program that runs at specified intervals and tests for the existence of your file drops with fileexist() - hopefully your applications delete the previous files once they have been processed. Then, when all are present and correct, your control program can %include the flows, and perhaps set a flag in a persistent text file showing you are all done for the day, preventing further runs of the flows.
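The control-program logic above can be sketched as follows. This is only an illustration in Python (the poster would write the equivalent in base SAS using FILEEXIST() and %INCLUDE); all file names, paths, and function names here are assumptions, not anything from the thread.

```python
import os

def should_run(trigger_files, flag_path):
    """Run the flows only when every trigger file is present and the
    day's done-flag has not been written yet."""
    if os.path.exists(flag_path):
        return False                         # flows already ran today
    return all(os.path.exists(f) for f in trigger_files)

def mark_done(flag_path):
    """Persist a flag file so later scheduler invocations skip the flows."""
    with open(flag_path, "w") as fh:
        fh.write("done\n")
```

The scheduler calls this check every interval; the persistent flag file is what stops a second run on the same day, which is exactly the "run more than once" danger raised in the original question.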
03-15-2014 08:07 AM
Any reasonable scheduler will be able to monitor files and folders for events.
For Windows Task scheduler: You could implement an event driven process looking for a "file create" event (your trigger file written into the specified folder location) and then trigger a little wrapper job which batch submits your SAS jobs.
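A minimal sketch of such a wrapper job, assuming the scheduler fires it on the "file create" event. This is illustrative Python; the SAS executable path, program name, and log name are all assumptions, and the command-line options shown (-sysin, -log, -batch, -nosplash) are standard SAS batch options.

```python
import subprocess

def build_sas_command(sas_exe, program, log):
    """Assemble the command line for a batch submit of one SAS program."""
    return [sas_exe, "-sysin", program, "-log", log, "-batch", "-nosplash"]

def submit_sas_job(sas_exe, program, log):
    """Batch-submit the program and return the SAS exit code."""
    return subprocess.run(build_sas_command(sas_exe, program, log)).returncode
```

Keeping the command assembly separate from the submission makes the wrapper easy to test without actually invoking SAS.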
Google for something like "event trigger file" together with the name of the scheduler you use. This should give you instruction pages of how to set up something like this.
03-21-2014 10:02 AM
Thank you for the feedback, I don't think my question was specific enough.
We are using D.I. Studio to create the SAS jobs and Management Console 9.2 to schedule the flows; the flows are monitored using Platform Computing's Flow Manager.
03-21-2014 08:41 PM
Maybe I'm misunderstanding something here, but isn't it LSF which watches defined directories for defined events (like the arrival of a trigger file)? And if an event occurs, isn't it LSF which triggers the flow?
If so, then the flow as such is not already running but just scheduled to run. I would have thought a flow gets executed as many times as the trigger condition is true. So if no trigger file arrives on day 1, the flow simply doesn't get executed, and on day 2, when a trigger file arrives, the flow will get executed (once per trigger file).
Is your problem that you could get 2 trigger files on the same day but want the flow to be executed only once a day? Or is your scheduling process more complex, where you put the flows into a queue on hold and the file trigger event then activates the queue?
03-22-2014 05:51 AM
I'm the first to say that the documentation around this topic is quite imperfect. It's been a while since I used LSF, but I think Peter is correct: you can set up file dependencies from SMC.
There is also some information in the Help section within the product.
03-24-2014 06:46 AM
Hi Patrick, thank you for the reply. I will define steps to make things clearer.
However, if the files do not arrive at the drop location, the flow maintains a status of 'Running'. At 6am the next day, the following day's flow will trigger.
Now there are 2 flows (today's and yesterday's), both with a status of 'Running'.
My worry is that when the files arrive, both of these flows will run simultaneously and chew up the data.
Thanks for the help,
03-24-2014 08:17 AM
It sounds like you have individual triggers on each of the 16 input files (if not, I think you should have) and an AND condition in LSF which requires all conditions to be met. If you could insert a step after the AND condition is met that creates a trigger for the rest of the process if and only if the date is the same as the initial date of the process (which could be a macro variable set by a step triggered by the 6am start), then this would ensure that the process would halt if all the trigger conditions were not met by midnight and, depending on your setup, move the input files out of the way. The next day a fresh process would be initiated, un-compromised by files left over from the previous day. The halted processes would need to be killed, but they would provide diagnostic information.
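The date-gate step suggested above boils down to a simple comparison. A minimal sketch in Python for illustration (in the actual flow this would be a SAS step comparing today's date against the macro variable set at the 6am start; the function and file names here are hypothetical):

```python
from datetime import date

def write_trigger_if_same_day(start_date, today, trigger_path):
    """Create the trigger for the rest of the process only if the flow is
    still inside the calendar day it started; otherwise halt, leaving the
    stalled flow available for diagnostics."""
    if today != start_date:
        return False                        # past midnight: do not trigger
    open(trigger_path, "w").close()         # trigger file for downstream steps
    return True
```

With this gate in place, late-arriving files can no longer wake yesterday's flow: only the flow whose start date matches the current date ever gets its downstream trigger written.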