Hello Dear Community!
I am trying to create an auto-rerun mechanism by adding some code to the sasbatch script that runs after the SAS command finishes. The general idea is to:
1) locate a log of sas process and an id of the flow containing current job,
2) check if the log contains particular ORA-xxxxx errors,
3) if so, then trigger jrerun class from LSF Platform Command Line Interface,
4) exit sasbatch passing $rc to LSF
The idea was implemented as:
# define used paths
log_dir=/path/to/sas_logs_directory
out_log=/path/to/auto-rerun_log.txt
out_log2=/path/to/lsf_rerun_log.txt
if [ -n "${LSB_JOBNAME}" ]; then
    if [ ! -f "$out_log" ]; then
        touch "$out_log"
    fi
    # get flow runtime attributes
    IFS=: read -r flow_id username flow_name job_name <<< "${LSB_JOBNAME}"
    # find log of the current process
    log_path=$(ls -t "$log_dir"/*.log | xargs grep -li "job:\s*$job_name" | grep -i "/${flow_name}_" | head -1)
    # set path to the text file listing the ORA errors we look for, one per line
    conf_path=/path/to/error_list
    # check the process log against each listed error in turn
    while read -r line;
    do
        # if the error is found in the log, try to rerun the flow
        if grep -q "$line" "$log_path"; then
            (nohup /path/to/rerun_script.sh "$flow_id" >"$out_log2" 2>&1) &
            disown
            break
        fi
    done < "$conf_path"
fi
Here rerun_script.sh is the script that calls jrerun after a sleep command, so that the parent script can exit with $rc in the meantime. It looks like:
sleep 10
/some/lsf/path/jrerun
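As a side note, the while loop that checks the log could probably be reduced to a single grep call, since grep can read its patterns from a file with -f. The sketch below assumes the error list holds one pattern per line and contains no empty lines (an empty pattern matches every line). The function name has_known_error is only illustrative and is not part of SAS or LSF.

```shell
#!/usr/bin/env bash
# Sketch only: has_known_error is a hypothetical helper name.
# Returns 0 if the log contains at least one pattern from the error list,
# 1 if it contains none.
has_known_error() {
    local log_path=$1 conf_path=$2
    # -q: no output, exit status only
    # -f: read the patterns from the file, one per line
    grep -q -f "$conf_path" -- "$log_path"
}

# Possible use inside the sasbatch script (variables as defined in the post):
#   if has_known_error "$log_path" "$conf_path"; then
#       ... start the rerun script ...
#   fi
```

This does not change when jrerun is called; it only replaces the line-by-line check.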
The problem is that the job keeps running the whole time. In the LSF history I can see that jrerun was called before the job exited.
Furthermore, in $out_log2 I can see the message:
<flow_id> has no starting or exit points.
Does anyone have an idea how I can pass the return code to LSF before jrerun is called? Or is there maybe a simpler way to automatically rerun SAS jobs in Platform LSF?
I am using SAS 9.4 and Platform Process Manager 9.1
Thank you for all your support,
Max.
What is causing your Oracle errors in the first place? How can you be sure that a re-run won't just give you the same errors?
If the errors are caused by Oracle data not being available when it should be then a better approach might be to test this in your SAS program and keep looping until it is available within a certain time limit.
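The "keep looping within a time limit" pattern can be sketched as follows. This is written in shell, the language of the script in the question, rather than in SAS, so treat it as an illustration of the idea only. retry_until and its arguments are hypothetical names, and the command it retries stands in for whatever availability check fits your data.

```shell
#!/usr/bin/env bash
# Sketch only: retry_until is a hypothetical helper.
# Usage: retry_until MAX_SECONDS PAUSE_SECONDS command [args...]
# Re-runs the command until it succeeds or MAX_SECONDS have elapsed.
# Returns 0 on success, 1 if the time limit was reached.
retry_until() {
    local max_seconds=$1 pause=$2
    shift 2
    # SECONDS is a bash built-in counting seconds since the shell started
    local deadline=$(( SECONDS + max_seconds ))
    while true; do
        if "$@"; then
            return 0
        fi
        if (( SECONDS >= deadline )); then
            return 1
        fi
        sleep "$pause"
    done
}

# Example: try a (hypothetical) check script for up to 10 minutes,
# waiting 30 seconds between attempts:
#   retry_until 600 30 /path/to/check_oracle_data.sh
```

The command is always tried at least once, and the limit is only checked between attempts, so a single slow attempt can overrun it.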
Hello, thank you for your response.
This mechanism is designed for particular ORA errors that we know can be resolved simply by a rerun. It is meant only for them.
If that is the case you could also check for that error in your SAS code and repeat the step. It might be easier than an LSF solution.