11-26-2014 08:15 AM
We are scheduling a growing number of SAS scripts (campaign selections etc.) via the LSF/Platform scheduling software promoted by SAS.
We are able to report on the results of these jobs, whether they were successful or not.
Now I am trying to create a report or dashboard that is easy to access.
In it, I want to show:
- scheduled time (date and time)
- started time (date and time)
- finished time (date and time)
- if possible, the user/submitter.
Most of this information is available or retrievable thanks to the flexibility of SAS ;-)
It is item 1 that I am searching for.
It is available in raw form on our Unix system, but I would like to get this information from the metadata, as I think all of this is stored in metadata as well.
And it would be a cleaner process: we could use SAS to do both the collecting and the presenting.
We are on SAS 9.3, LSF 7.0 and Platform Manager 8.1.
Any help or suggestions are welcome.
11-26-2014 08:45 AM
As far as I know, you don't have that option in SAS 9.3. SAS 9.4 has advanced features, like the new sasgsub, advanced logging for all SAS Grid components and SAS Environment Manager; with those I would maybe do something with SAS Grid.
In SAS 9.3 SAS MC has a plugin for the Grid, but I think it does not show the information you are looking for, so you would still need your own process.
11-26-2014 09:30 AM
Thanks for your feedback.
I need to explore this.
We want to give end-users a dashboard or report to quickly check the status of scheduled jobs, so that we don't have to give them access to the scheduling itself.
11-26-2014 10:08 AM
I think there are some good options for that requirement. A couple of them are RTM (third-party software from IBM, like LSF itself; RTM is not maintained any more, but still good, and you don't need a Grid license) and SAS Environment Manager from SAS 9.4.
11-26-2014 04:06 PM
As LSF is an IBM product, start by reviewing the scheduling options of the tool itself.
My simple approach:
- LSF has its own database and does some highly advanced load balancing that SAS is not aware of. That is why this third-party tool is added.
- Why should LSF try to synchronize its metadata in some weird way into the SAS metadata? That does not make any sense.
11-27-2014 02:22 AM
LSF does not have to sync with SAS, but as scheduling is done from SAS Management Console, where apparently at least the schedule is 'remembered', I would say this is stored somewhere in SAS (meta)data. And that is what I am looking for.
The actual execution and its result we get from the batch script that is run.
This is what I actually meant with the topic name: we have the results, but not (yet) the plan. So the loop is not closed.
11-27-2014 02:26 AM
Funny, getting a good picture of "the plan" seems to be difficult in most scheduling software. I have not been able to get a flowchart for a series of SAS and SAS-related jobs out of Control-M.
11-27-2014 05:11 AM
For your information: the Platform Process Manager which comes with LSF does have a 'Global View' option where you can graphically see all flows etc., but that view lacks the information about whether a flow actually ran, and with what results.
11-27-2014 01:45 AM
We do not use LSF, but maybe you can make use of some parts of what we do.
I have written a wrapper UNIX (ksh) shell script that executes our SAS batch jobs; this script is called by the scheduling software (Control-M) and gets its parameters fed via UNIX environment variables (like DAY, MONTH, YEAR, PARAM1-PARAMx, INFILE1-INFILEx, OUTFILE1-OUTFILEx).
This shellscript actually executes SAS three times:
- first, to write a new record to a jobcontrol dataset that records all parameters and the actual execution time, setting the exit code to -1 to indicate an active program run, and to (re)write an HTML page for all runs of the current day
- it then runs the batch job itself, writing a log whose name is coded with the batch job name and execution time; the name of this log is included in the HTML page as a link. It also performs checks on the log file to catch certain unwanted conditions that don't cause a SAS WARNING or ERROR.
- then it uses SAS again to update the jobcontrol dataset with the actual exit code and finishing time, and again rewrites the HTML
The HTML page is served from the data warehouse server with Apache. Links to the daily page are updated on the main page, and all daily pages are kept for history (as is the jobcontrol dataset).
This allows me (and our users) to get a quick view of what has been run or is running with a simple browser, and also to take a look at the log.
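The wrapper pattern described above can be sketched in plain shell. This is a minimal illustration, not the actual script: a plain command stands in for the three sas invocations, the jobcontrol dataset becomes a CSV file, the HTML rewrite step is left out, and all names (jobcontrol.csv, demo_job) are made up.

```shell
#!/bin/sh
# Hypothetical names; the real script would call "sas -batch ..." instead.
JOBCTL=jobcontrol.csv
JOBNAME=demo_job
STAMP=$(date '+%Y-%m-%d %H:%M:%S')
LOG="${JOBNAME}_$(date '+%Y%m%d%H%M%S').log"

# Step 1: register the run with exit code -1 (meaning "still running").
echo "$JOBNAME,$STAMP,,-1" >> "$JOBCTL"

# Step 2: run the job itself (stand-in for the real SAS batch call).
echo "doing work" > "$LOG"
RC=$?

# A log check can catch unwanted conditions that raise no SAS WARNING/ERROR.
grep -q "uninitialized" "$LOG" && RC=99

# Step 3: update the jobcontrol record with the real exit code and end time.
END=$(date '+%Y-%m-%d %H:%M:%S')
echo "$JOBNAME,$STAMP,$END,$RC" >> "$JOBCTL"

cat "$JOBCTL"
```

From such a jobcontrol file, the daily HTML status page is then just a rendering step.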
11-27-2014 02:25 AM
We do exactly the same: with a minor adjustment to the sasbatch script we capture the start of a SAS job, its name and other parameters, the result code, and the end of the job, and put this all in a table.
The only thing I am still looking for is the planned start or frequency of the SAS job.
11-27-2014 02:54 AM
Strange how stubborn some people can be. The scheduler tool has the database of all jobs and all information about them.
Whether you access that via SMC, a 3270 terminal or whatever doesn't change the situation. SMC, for example, does not hold the information on your local desktop; you have accepted that the real information lives somewhere else, for instance the SAS metadata server. The next step is accepting that there are several of those.
A flowchart out of Control-M (or LSF): I have seen it done, with OPC I believe (now TWS). Read all the information from the scheduler database and feed it into something that can handle planning.
SAS has a planning part, called SAS/OR, and you get (ugly) figures out of that.
11-27-2014 03:47 AM
I consider this a compliment.
I do not want to go into terminals or whatever to see things.
I want to use, as much as possible, just 'standard' tools and the provided information to get a full picture of my scheduling cycles: from planning or setting up the schedule, to seeing when it ran and with what success.
I was assuming (and I know what can come from that....) that as I use SMC to schedule various SAS jobs and flows, this information would be available in SAS, and retrievable through SAS.
From your hammering I understand that you think this is not the case.
11-27-2014 02:51 PM
Sorry Ton, I should not only come with a hammer but also with a screwdriver and more from the toolbox.
Schedulers are advanced toolsets and LSF is one of the more advanced ones. IBM is promoting it for the Grid load options, in favour of TWS (OPC).
Tools like CTL-M are also quite advanced but cannot do load managing in a grid.
What you get with LSF is queues with jobs (in the LSF sense) that run planned, once, or interactively, where you can give goals (resources, time ready, etc.).
Often a scheduler that runs repeatable jobs will learn from the monitored resource usage and make changes to the system to achieve those goals. It needs processes at root (Unix) level for that. On a mainframe there are SVC exits that intercept SMF/RMF information to build a database of job behaviour for profiling.
You can use batch commands to interact with LSF; just look at bjobs, bsub and bhist.
From the admin guide:
LSF keeps track of all jobs in the system by maintaining a transaction log in the work subtree. The LSF log files are found in the directory LSB_SHAREDIR/
The following files maintain the state of the LSF system: LSF uses the lsb.events file to keep track of the state of all jobs. Each job is a transaction from job submission to job completion. The LSF system keeps track of everything associated with the job in the lsb.events file.
The events file is automatically trimmed and old job events are stored in lsb.events.n files. When mbatchd starts, it refers only to the lsb.events file, not the lsb.events.n files. The bhist command can refer to these files.
In the config guide you will find an lsb.events chapter describing everything that gets logged in these files.
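Since lsb.events is a plain text log, its records can be pulled apart with standard tools. In the common layout the first field is the event type and the third an epoch timestamp, but exact record layouts differ per LSF version; the sample lines below are illustrative, not copied from a real cluster.

```shell
# Write a few lsb.events-style sample records (illustrative layout only).
cat > sample.events <<'EOF'
"JOB_NEW" "7.0" 1417000000 101 ...
"JOB_START" "7.0" 1417000060 101 ...
"JOB_STATUS" "7.0" 1417000600 101 ...
EOF

# Extract event type (quotes stripped) and epoch timestamp per record.
awk '{ gsub(/"/, "", $1); print $1, $3 }' sample.events
# -> JOB_NEW 1417000000
#    JOB_START 1417000060
#    JOB_STATUS 1417000600
```

On a real system you would point this at $LSB_SHAREDIR/.../logdir/lsb.events, or simply let bhist do the reading for you.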
11-28-2014 04:23 AM
Thanks for coming back on this.
I already looked at this file (lsb.events), and at the commands you mention: bjobs, bhist, bsub etc.
So far they have not given me the pending jobs (unless they are about to start).
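For what it's worth, on a live cluster `bjobs -p` lists pending jobs together with the pending reasons. As a stand-in that runs anywhere, the sketch below filters sample `bjobs` output on the STAT column; the job names and hosts are invented, and the column layout shown is the usual default `bjobs` one.

```shell
# Fake `bjobs` output (illustrative jobs; default-style columns).
cat > bjobs.out <<'EOF'
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
101 ton RUN normal hostA hostB campaign1 Nov 28 04:00
102 ton PEND normal hostA - campaign2 Nov 28 04:05
EOF

# Keep only pending jobs: print job id and job name.
awk 'NR > 1 && $3 == "PEND" { print $1, $7 }' bjobs.out
# -> 102 campaign2
```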
In the folders for the Process Manager there is a flow-storage directory with 'smaller' files which contain the datetimes of the last run and the next run.
These are all plain-text files that I could read and analyse.
They contain the datetime, the frequency it should run at, when it last ran, and when it should next run.
The files seem quite structured, looking a bit XML-like, but with quite complex structures.
Which means that I would need to reverse engineer this, which at this moment is not worth the effort.
And then, coming back to my first question: in order to create a page where my colleagues can see the status of their scripts etc., I want to close the cycle of planning jobs, running them, and seeing the run times, return codes, etc.
Thanks for your feedback up to now.
11-28-2014 07:33 AM
The life cycle of jobs in LSF is a different concept from running SAS scripts. Sometimes different words have the same meaning, sometimes the same words have different meanings.
PEND is not a real event status: "Your job remains pending until all conditions for its execution are met. Each queue has execution conditions that apply to all jobs in the queue, and you can specify additional conditions when you submit the job" (page 24 of the users guide).
A message to the users is easiest done by email (page 20 of the users guide).
Finding the reasons for missing triggers (PEND) is a specialized task for a support guy.