SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Schedule flow by file event and file name
Occasional Contributor
Posts: 15

Hi all.

I would like to ask for advice on how best to deal with the following problem.

We have a list of DI jobs that need to be scheduled on LSF via the Management Console.

We create the flow of jobs, but we have this constraint.

A flow should start when a new file arrives in the folder.

If multiple files arrive, the flow must run once for every file.

Each job in the flow must know the name (or some other identifying information) of the file that made the flow start.

The problem is that we do not understand how to deal with this last step. Any advice? :)

Super User
Posts: 5,254

Re: Schedule flow by file event and file name

I don't know the exact possibilities that you have in LSF, but things like this can be handled within DI Studio by using the Loop transformation and parameterized jobs.
Data never sleeps
Contributor
Posts: 22

Re: Schedule flow by file event and file name

Hi Luke,

Dependencies/Sequencing Issues Between Files

If the files are different and there is a dependency between them, you could create multiple clones of the jobs in DI and then run them as separate flows.

All Files Being Run Through The Same Process

Otherwise, when scheduling in Management Console, you can set the flow to start with "run when any of the conditions occur". This should pick up any files in the folder at that time.

That is, if multiple files can trigger the flow, it will be set off by any one of the conditions being met.

Note: I would not hard-code the file names in the DI jobs. I would hold the file name in a macro variable and loop through the candidate names in the job's precode to check whether each file exists. Then, as Linas said, loop the job in DI. This would be the most concrete solution.
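To make the idea in the note a bit more tangible, here is a rough sketch of the logic, written in Python rather than SAS macro code purely for illustration. The function names `existing_files` and `run_for_each` are hypothetical, and the `job` argument simply stands in for a parameterized DI job that receives the file name the way it would receive a macro variable.

```python
from pathlib import Path


def existing_files(folder, candidate_names):
    """Return the candidate file names that are currently present in folder."""
    base = Path(folder)
    return [name for name in candidate_names if (base / name).is_file()]


def run_for_each(folder, candidate_names, job):
    """Call job(path) once per existing file, like a loop over a parameterized job."""
    results = []
    for name in existing_files(folder, candidate_names):
        # The path plays the role of the macro variable handed to the job.
        results.append(job(Path(folder) / name))
    return results
```

In the real setup the existence check would live in the job's precode and the looping would be done by the Loop transformation, so treat this only as a picture of the control flow.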

Occasional Contributor
Posts: 15

Re: Schedule flow by file event and file name

Hi all, and thanks for the suggestions. In our case, cloning the jobs is not an option, since the number of files that can arrive is potentially unlimited. The workflow is something like this: a user uploads one or more files to the server, and the process starts to validate and manipulate these files. Each file's validation is handled by a large set of jobs that process the file through several steps. Each set of jobs could fail, in which case the following steps should not be processed.

Things are a bit more complex, since many users can load different sets of files, which should be processed by different flows. So I'm wondering whether there is any way, from SASMC, to pass parameters from one job to another, or to make the name of the file that triggered the flow available to a job or a list of jobs.

Thanks!

Contributor
Posts: 22

Re: Schedule flow by file event and file name

You could have an initial job that runs a system command to pick up the file names and insert them into a SAS array or SAS variables. This would essentially be polling for the files. This job would then also generate the triggers required to run the specific flows.

Then, depending on the inputs, triggers for each of the flows are generated and written to the required directory (or directories) by a system command. Each flow could then be triggered based on what is present in a folder at a given time.

That seems the most logical for what you require at the moment.
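A minimal sketch of that polling job, in Python only to show the shape of it (the real thing would be a SAS job issuing system commands). Everything named here is an assumption for illustration: the `route` rule that takes the part of the file name before the first underscore as the flow name (a name with no underscore becomes its own flow name), the `.trigger` suffix, and the directory layout. The one idea carried over from the description is that each trigger file is written into the folder its flow watches and holds the name of the data file, so the jobs in that flow can read it back.

```python
from pathlib import Path


def route(file_name):
    """Hypothetical routing rule: the prefix before the first '_' names the flow."""
    return file_name.split("_", 1)[0]


def generate_triggers(incoming_dir, trigger_root):
    """Write one trigger file per incoming file into the directory of its flow.

    Each trigger file stores the full path of the data file, so the jobs of
    the triggered flow can read it to learn which file started them.
    """
    created = []
    for data_file in sorted(Path(incoming_dir).iterdir()):
        if not data_file.is_file():
            continue
        flow_dir = Path(trigger_root) / route(data_file.name)
        flow_dir.mkdir(parents=True, exist_ok=True)
        trigger = flow_dir / (data_file.name + ".trigger")
        if trigger.exists():
            continue  # already announced on an earlier poll
        trigger.write_text(str(data_file))
        created.append(trigger)
    return created
```

Whether one trigger file per data file gives you one flow run per file depends on how the file event is configured in your scheduler, so that part would need checking on your side.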
