Hi,
I am working in a grid environment with an LSF license, but without the monitoring tools (RTM and the Grid Manager plug-in), so I monitor the grid through LSF commands.
I need to find out how many jobs were run by all users in the last month, along with related stats such as user, memory usage, run time, CPU, and host.
I would appreciate any help with this.
Thanks in advance,
Siddhu1
Have you tried bhist with the -T option?
I tried the bhist command with the -T option, but I was not able to get the full information.
What was missing? 'bhist -u all -T <date/time info>' provides most of what you want. There is also bacct.
Having something like RTM or SAS Environment Manager Server's time-series data collection really helps in this area.
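Both bhist -T and bacct -C take a time range written as start,end in year/month/day/hour:minute form. As a minimal sketch, assuming GNU date is available (the -d option is a GNU extension), a small helper can build that string from two readable dates. The function name lsf_time_range is hypothetical and not part of LSF.

```shell
# Hypothetical helper (not an LSF command): print "START,END" in the
# YYYY/MM/DD/HH:MM form that the -T and -C options expect.
# Assumes GNU date, which accepts free-form dates through -d.
lsf_time_range() {
    printf '%s,%s\n' \
        "$(date -d "$1" +%Y/%m/%d/%H:%M)" \
        "$(date -d "$2" +%Y/%m/%d/%H:%M)"
}
```

The result could then be passed along the lines of bhist -u all -T "$(lsf_time_range '1 month ago' 'now')" or bacct -u all -l -C "$(lsf_time_range '1 month ago' 'now')". Which fields each command reports may depend on your LSF version, so check the output on your own cluster.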
RTM and Environment Manager are not working because of some issues, so I am using the LSF commands.
However, I am unable to get the full information using bhist and bacct, as I need the job history for the last month.
RTM and Environment Manager are not working because of some issues.
I am unable to get the full information using these commands.
Does bhist -l <job_id> provide all the information you need for a single job?
If so, you could use bhist -T to get the jobs that were run at that time and run bhist -l against each of them.
For example, this command would run bhist -l against each job in the history from the past day:
for j in $(bhist -t -T "$(date -d "1 day ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | grep -o 'Job.*submitted' | cut -f2 -d" " | grep -oE '[0-9]+'); do bhist -l "$j"; done
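The job ID extraction in that pipeline can be pulled out into a function that reads bhist -t output on standard input, which makes it easier to test against saved output. This is only a sketch: the function name extract_job_ids is hypothetical, and it assumes that each submission appears on a line containing "Job <ID> submitted", which is the same pattern the pipeline above relies on.

```shell
# Hypothetical helper: read "bhist -t" output on stdin and print each
# submitted job ID once, in numeric order.
# Assumes submission events contain the text "Job <ID> submitted".
extract_job_ids() {
    grep -oE 'Job <[0-9]+[^>]*> submitted' \
        | sed -E 's/^Job <([0-9]+).*/\1/' \
        | sort -un
}
```

It could be used as bhist -t -T "<range>" | extract_job_ids, where <range> is the same time range as above. Quoting the grep pattern also keeps the shell from expanding it against file names in the current directory.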
You could take this a step further and have SAS parse the output of this into a SAS data set if you start the SAS session with XCMD. For example:
/* Create a table of job IDs submitted within the supplied range. */
filename command pipe 'bhist -t -T "$(date -d "1 day ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | grep -o "Job.*submitted" | cut -f2 -d" "';
data jobs;
    length jobid 8;
    infile command;
    input;
    /* Strip the angle brackets and convert the ID to a number explicitly. */
    jobid = input(compress(_infile_, "<>"), 12.);
    put jobid;
run;
/* Create an empty destination table. */
data jobinfo;
    length Job $ 10 User $ 50 Project $ 50 Command $ 255 max_mem avg_mem $ 20 pend psusp run ususp ssusp unknwn total 8;
    call missing (of _all_);
    stop;
run;

/* Define a macro to read in bhist output for a supplied job and add it to the table. */
%macro bhistreader(job_id=);
    /* Define the bhist command. */
    filename bhist;
    filename bhist pipe "bhist -l &job_id";

    /* Create a temp file to store the edited output. */
    filename bhist2;
    filename bhist2 temp;

    /* Remove formatting from command output and store it in the temp file. */
    data _null_;
        infile bhist;
        file bhist2;
        input;
        if _infile_ ne: " " and _N_ > 1 then put ;
        line=strip(_infile_);
        put line +(-1) @@ ;
    run;

    /* Create a data set, jobinf, to store the history information from our unformatted file. */
    data jobinf;
        length Job $ 10 User $ 50 Project $ 50 Command $ 255 ;
        call missing (of _character_);
        infile bhist2 dlm=',';
        input @'Job' Job @'User' User @'Project' Project @'Command' Command;
        Job=compress(Job,"<>");
        User=compress(User,"<>");
        Project=compress(Project,"<>");
        Command=compress(Command,"<>");
    run;

    /* Extract memory usage information from the output, only reading the memory line. */
    data meminfo;
        length line $ 512 max_mem avg_mem $ 20;
        infile bhist2;
        input;
        line=strip(_infile_);
        if scan(line,1)="MAX" then do;
            max_mem=cat(scan(line,3),scan(line,4));
            avg_mem=cat(scan(line,7),scan(line,8));
            output;
        end;
        drop line;
    run;

    /* Read in the table of state times from the output. */
    data times;
        length line $ 512 pend psusp run ususp ssusp unknwn total 8;
        infile bhist2;
        input @;
        line=strip(_infile_);
        prefix=scan(line,1);
        put prefix=;
        if prefix="PEND" then do;
            input;
            input pend psusp run ususp ssusp unknwn total;
            output;
        end;
        drop line prefix;
    run;

    proc sql;
        insert into work.jobinfo
        select * from jobinf,meminfo,times;
    quit;
%mend;

/* Run the macro for each job in the job table. */
data _null_ ;
    set jobs;
    str=catt('%bhistreader(job_id=',jobid,');');
    call execute(str);
run;

proc print data=jobinfo; run;
Greg Wootton | Principal Systems Technical Support Engineer
Thanks for the response.
Basically, I want to compare the stats for the jobs (SAS programs) that ran on the grid nodes last month (number of jobs, CPU stats, memory stats, run time) with those for the latest month.
I tried bhist -C <timeframe>, but I was unable to get the CPU and memory stats, and the output is also very lengthy.
I need to find the peak number of jobs at a particular interval using the above stats, to investigate a grid performance issue.
Thanks in advance,
Siddhu1
Low-level metrics, such as how much memory or CPU was used by a particular job, are in the bhist -l <job_id> output.
The Service Architecture Framework's SASJobs kit, when configured, will capture usage information from the FULLSTIMER output in SAS logs, which would provide memory and CPU consumption information specifically for SAS sessions.
If you want to use the CLI only, I think you'll need to use bhist -l <job_id> to get that level of detail.
Understanding SAS Environment Manager Service Architecture
https://go.documentation.sas.com/?cdcId=evcdc&cdcVersion=2.5_M1&docsetId=evug&docsetTarget=p0md48tpf...
Greg Wootton | Principal Systems Technical Support Engineer
Thanks for the reply, Greg.
I don't have the SAS Environment Manager and RTM tools for getting the resource data.
I am just using the LSF CLI commands to monitor the grid.
I need to get the number of jobs (SAS programs) that ran in the grid environment over the last months, with CPU, memory, and run-time history, and also the peak number of jobs that ran at a specific time each day, to assess the grid environment internally.
Thanks & Regards,
Siddhu1
This command would count the jobs submitted from a month ago to today (you can change the date commands to use specific dates):
bhist -t -T "$(date -d "1 month ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | grep -o 'Job.*submitted' | cut -f2 -d" " | wc -l
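To see when submissions peak, the same bhist -t output could be grouped by hour instead of counted as a whole. The sketch below makes two assumptions: the function name submissions_per_hour is hypothetical, and each event line is assumed to start with a timestamp laid out as weekday, month, day, HH:MM:SS (with or without a trailing year). It counts submissions only, not jobs running concurrently.

```shell
# Hypothetical helper: read "bhist -t" output on stdin and print the number
# of job submissions per hour, busiest hour first.
# Assumes event lines start with "Www Mmm DD HH:MM:SS" (optionally a year).
submissions_per_hour() {
    awk '/Job <[0-9]+[^>]*> submitted/ {
        split($4, t, ":")                    # t[1] is the hour
        key = $1 " " $2 " " $3 " " t[1] ":00"
        count[key]++
    }
    END { for (k in count) print count[k], k }' | sort -rn
}
```

It would be used with the same time range, as in bhist -t -T "<range>" | submissions_per_hour. If the range spans more than one year, hours from different years would be merged, because the key does not include the year.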
This would return those job IDs:
bhist -t -T "$(date -d "1 month ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | grep -o 'Job.*submitted' | cut -f2 -d" " | grep -oE '[0-9]+'
You could use a for loop to run bhist -l for each one:
for job in $(bhist -t -T "$(date -d "1 month ago" +%Y/%m/%d/%H:%M),$(date +%Y/%m/%d/%H:%M)" | grep -o 'Job.*submitted' | cut -f2 -d" " | grep -oE '[0-9]+'); do bhist -l "$job"; done
Perhaps with awk you could parse that output into a more readable format.
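As a rough sketch of that awk idea, the function below turns bhist -l output into one CSV row per job. The function name bhist_to_csv is hypothetical. The layout it expects is an assumption based on the fields the SAS code earlier in this thread reads: a record starting with "Job <id>, User <name>", a "MAX MEM: ...; AVG MEM: ..." line, and a PEND/PSUSP/RUN header line followed by a line of values. It does not rejoin wrapped lines, so a very long first line may need extra handling.

```shell
# Hypothetical helper: read "bhist -l" output for one or more jobs on stdin
# and print CSV with job ID, user, max/avg memory and run time in seconds.
# The expected layout of the input is an assumption; adjust the patterns
# to match what your LSF version prints.
bhist_to_csv() {
    awk '
    function emit() {
        if (job != "") print job, user, maxmem, avgmem, run
        job = user = maxmem = avgmem = run = ""
    }
    BEGIN { OFS = ","; print "job", "user", "max_mem", "avg_mem", "run_seconds" }
    /^Job </ {
        emit()                                # finish the previous job, if any
        if (match($0, /Job <[0-9]+/))   job  = substr($0, RSTART + 5, RLENGTH - 5)
        if (match($0, /User <[^>]+>/))  user = substr($0, RSTART + 6, RLENGTH - 7)
    }
    /MAX MEM:/ {
        if (match($0, /MAX MEM: [^;]+/)) maxmem = substr($0, RSTART + 9, RLENGTH - 9)
        if (match($0, /AVG MEM: [^;]+/)) avgmem = substr($0, RSTART + 9, RLENGTH - 9)
    }
    $1 == "PEND" {
        # The values are on the line after the header; RUN is the third column.
        if ((getline) > 0) run = $3
    }
    END { emit() }'
}
```

The for loop above could feed it directly, for example by piping the whole loop into bhist_to_csv and redirecting the result to a file, which can then be opened in a spreadsheet or read into SAS.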
Greg Wootton | Principal Systems Technical Support Engineer