Has anyone ever succeeded in running SAS in parallel on a SLURM-based Linux cluster computer using what are known as "job arrays"?
https://rcc.uchicago.edu/docs/running-jobs/array/index.html
In particular, I need to pass a SLURM environment variable SLURM_ARRAY_TASK_ID from my batch shell script to SAS. This task has completely defeated me.
Google searches haven't helped. I see mention of multi-threading, but that's not quite the same.
I would be so appreciative for some expert help!
Either add
export SLURM_ARRAY_TASK_ID
to your shell script before starting SAS, or use the -sysparm option to pass SLURM_ARRAY_TASK_ID on the SAS command line.
I don't know anything about SLURM but the SYSGET SAS function reads OS environment variables.
Also SAS jobs can be run in parallel using SAS Grid or SAS/CONNECT technology.
Unfortunately the SAS command
%let SLURM_ARRAY_TASK_ID=%SYSGET(SLURM_ARRAY_TASK_ID);
does not seem to work. I'm going to guess that SLURM_ARRAY_TASK_ID isn't an "environment variable" in the same sense you mean, but I could be wrong. Thanks for your reply!
If you could get your shell script to also write it to a "proper" OS environment variable, that might work. Someone who is more familiar with Unix should be able to help.
Thanks for your suggestions! I've tried many variations, including the lines:
#!/bin/bash
#SBATCH --job-name=job_array
#SBATCH --output=job_array_%a_out.txt
#SBATCH --error=job_array_%a_err.txt
#SBATCH --array=1-3
echo "My SLURM_ARRAY_TASK_ID: " $SLURM_ARRAY_TASK_ID
export SLURM_ARRAY_TASK_ID
sas -noterminal job_array.sas -sysparm $SLURM_ARRAY_TASK_ID
in my shell script and the lines
%let TskID=%scan(&sysparm,1);
%put &TskID;
in my SAS program. Regrettably the value of TskID seems to be empty; the integers 1, 2, 3 do not seem to be passing into SAS. More help is definitely needed and appreciated!
With the export in place, did you try
%let tskid=%sysget(SLURM_ARRAY_TASK_ID);
?
Yes. With the export in place, the SAS lines:
%let TskIDa=%SYSGET(SLURM_ARRAY_TASK_ID);
%put &TskIDa;
unfortunately give an empty TskIDa. Any other ideas? I'm grateful for your help. If passing the SLURM_ARRAY_TASK_ID parameter from the shell script to SAS is impossible, that would be a disappointment. I don't wish to give up quite yet, but I'm at a loss as to what to try next.
Are you sure you have a standard SAS command line script and not some local script that is "eating" your command line arguments?
Try just passing in any text to the SYSPARM option. So make a trivial SAS program.
%put &=sysparm;
Then call it with the -sysparm option.
sas -sysparm fred myprogram.sas
In the log you should see
SYSPARM=fred
If that works then replace fred in the command string with the reference to your environment variable.
Thank you. The command in my shell script:
sas -noterminal trivial.sas -sysparm $SLURM_ARRAY_TASK_ID
plus your suggested command in my SAS program:
%put &=sysparm;
indeed gives me an integer (1, 2 or 3). Thus something is wrong with my SAS commands:
%let TskID=%scan(&sysparm,1);
%put &TskID;
which give TskID as empty. How should I correct these? I need, of course, to use TskID later in my program.
Unfortunately my attempt to store the sysparm value:
%let TskID=&sysparm;
%put &TskID;
still gives an empty TskID. Why should your command:
%put &=sysparm;
work so beautifully but not mine? I appreciate your suggestions!
This simple program should work.
%let TskID=&sysparm;
%put &TskID;
When SAS is called with -sysparm fred, the log shows
fred
There is no reason that %SCAN() shouldn't work either. You could check whether SYSPARM has picked up any strange binary characters that are confusing SAS into thinking it has macro quoting. Try a simple data step and see if it helps. You might use SYMGET() to pull the value of SYSPARM and then use PUT with the $HEX format to see if there are any strange characters in the first 25 bytes. You could then use CALL SYMPUTX() to generate your new macro variable.
data _null_;
length tskid $200 ;
tskid=symget('sysparm');
put tskid = / tskid $hex50. ;
call symputx('tskid',tskid);
run;
If it still doesn't work then either your shell script or your SAS code is messing up somewhere.
Perhaps there is something in your larger shell program that prevents the -sysparm option from being set properly on the command line. Perhaps the environment variable reference is somehow being deferred to a later time, when it evaluates to empty? Or there is some other line in the code that clears the value before the call to SAS?
Same thing for the SAS side. Do you have other statements that are changing TSKID or SYSPARM? Are you sure you are not creating TSKID in a local symbol table and then looking for it in the global symbol table?
Thanks for all your thoughts. I implemented your data step and obtained the following printout in the log file:
tskid=3
33202020202020202020202020202020202020202020202020
What is the meaning of the 50-character string? It looks ominous.
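For what it's worth, the 50-character string is just the $HEX50. rendering of the first 25 bytes of tskid: 33 is the ASCII code for the character 3, and each 20 is a blank padding the character variable. So the value looks clean, with no hidden binary characters. A minimal bash sketch that decodes the string (the function name hex_to_text is made up for illustration, and the \x escapes assume bash's built-in printf):

```shell
#!/bin/bash
# Hypothetical helper: turn a string of hex byte pairs back into text.
hex_to_text() {
    local hex="$1" escaped="" i
    # Walk the string two characters (one byte) at a time,
    # building a sequence of \xHH escapes.
    for (( i = 0; i < ${#hex}; i += 2 )); do
        escaped+="\\x${hex:i:2}"
    done
    # %b makes printf expand the \xHH escapes into the actual bytes.
    printf '%b' "$escaped"
}

# The hex dump copied from the SAS log above.
decoded="$(hex_to_text 33202020202020202020202020202020202020202020202020)"

# Brackets make the trailing blanks visible.
echo "[${decoded}]"
echo "length: ${#decoded}"
```

The brackets should enclose a 3 followed only by blanks, which is what a clean task ID stored in a padded character variable looks like.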
SLURM_ARRAY_TASK_ID is analogous to a do-loop variable covering the integers {1, 2, 3}, except that all three values run at once.
I said the log file, but in fact I have three copies of SAS running in parallel. Does the difficulty reside in the fact that outcomes for three simultaneous processes are being collapsed into just one file? Maybe collisions are making this unworkable.
This is why I need TskID -- to distinguish between the processes -- but I don't know how to introduce it into SAS's naming of log files, etc., to prevent the collisions from taking place.
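One way to keep the three processes from colliding might be to give each task its own log and listing file from the shell script, using the SAS -log and -print command-line options. By default a batch SAS run writes its log to a file named after the program, so all three array tasks would otherwise target the same log. The sketch below only builds and prints the command line; the helper name build_sas_cmd and the file-naming pattern are assumptions for illustration, not anything SLURM or SAS prescribes.

```shell
#!/bin/bash
# Hypothetical helper: build a SAS command line whose log and listing
# file names include the array task ID, so parallel tasks do not collide.
build_sas_cmd() {
    local task_id="$1"
    local program="$2"
    # Strip the .sas extension to get a base name for the output files.
    local base="${program%.sas}"
    echo "sas -noterminal ${program} -sysparm ${task_id} -log ${base}_${task_id}.log -print ${base}_${task_id}.lst"
}

# Inside an array job SLURM sets SLURM_ARRAY_TASK_ID; fall back to 1
# so the sketch can also be tried by hand outside the scheduler.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
build_sas_cmd "$task_id" job_array.sas
```

In the real batch script the sas command would be run directly rather than echoed. Inside the SAS program, &sysparm can then be used to build task-specific data set or output file names in the same way.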