I recently joined a team that has templates built for 20+ different SAS process flows. It looks like the original template was built yrs ago and Saved As a new template for each line of business we work with. Each model may look slightly different based on the LOB we are working with. The problem is when the original was built it wrote data to tmp tables on the server, each model uses the same data output and they are all named the same. So if 2 people are running different templates, because the data output saved on the tmp tables is the same name, the runs overwrite each other. I've updated any program in the process flow to include a %LET to establish a prefix that appends to each of the tmp tables that are created which should work fine. My challenge now is that each query needs (50+) to be manually updated to include the new prefix, then I run the query to create a new data output, delete that output, then change the current data output properties to point to the new data output I just created. This takes hours to update and we have 20+ templates that need changes made to avoid conflict of overwriting tmp tables. Any suggestions would be appreciated to reduce the manual effort needed.
@beamer108 wrote:
I mean to say that when the flow runs it creates tmp tables that are referenced in other steps. The problem is that the temporary table names are not unique so if I run a process flow its creating tmp7day.000_ABC, if my teammate runs a diff process flow that also creates tmp7day.000_ABC theirs will overwrite mine. I did a search in the process flow to look for libname and I don't see where there is a libname for tmp tables.
So it looks like everyone is defining TMP7DAY to point to the same location.
If you point it to a different location for each user you can reduce the risk of conflict to just conflicts with yourself. (Which is NOT a trivial risk for normal SAS usage. I frequently run many many many SAS jobs at the same time, but for something being run interactively with Enterprise Guide the risk is probably smaller.)
What about switching these tables to SAS WORK tables? Then you can keep the same name for all of them as they will be written to each user's separate WORK library.
How long does the data stay in WORK tables?
For the duration of the EG session.
Is there an easy way to change all the table names to 'work' or do I need to manually go into each of the queries and programs? This one process flow I am testing has 58 queries and progames and we have 28 different process flows that the changes will need to be made in. Thanks.
What do you mean by "it wrote data to tmp tables on the server"?
Normally SAS tables are two level <libref>.<table name>
<libref> can be WORK which is SAS session specific and though user will never collide.
Else I'd expect there is some libname that creates the libref for your "tmp" table - and if so then it's eventually possible to create user or even session specific paths during runtime - and though also these paths would become user or even session specific.
Is there somewhere a libname for these tmp tables? Where is it (in code, metadata or autoexec)?
tmp7day is as SAS libref that needs to be defined somewhere. It's either in code, in SAS metadata or in an autoexec. Try to find it.
A libref points to some physical location. My line of thought is to change the library definition so the path it points to is user specific.
Ideally you can change the libname definition in a central place. If that's not possible then something like below could also work.
Let's say the existing libref points to c:\temp
libname tmp7day "c:\temp";
You could as first step in your EG project re-point this library to a user specific folder. Code like below should work.
options dlcreatedir;
libname tmp7day "%sysfunc(pathname(tmp7day))\&sysuserid";
Just make sure to not execute above code more than once in an EG session or build some additional logic to prevent this libname to execute more than once in a SAS session. Code like below could do this.
%let sv_dlcreatedir=%sysfunc(getoption(dlcreatedir));
options dlcreatedir;
data _null_;
if find(pathname('tmp7day'),"&sysuserid",'i') <=0 then
do;
rc=libname('tmp7day',catx('\',pathname('tmp7day'),"&sysuserid"));
end;
run;
options &sv_dlcreatedir;
40 libname tmp7day list; NOTE: Libref= TMP7DAY ....... Filename= c:\temp\<and here the userid of the user that executes the script> .......
Or if these tmp tables really only need to exist during a SAS session then re-point libref tmp7day to the SAS WORK library.
The SAS WORK library gets automatically created during SAS invocation and it points to a session specific folder location that gets created when SAS starts and that gets removed when SAS terminates.
Code like below would allow you to define an additional libref that also points to this WORK location.
libname tmp7day "%sysfunc(pathname(work))";
Thank you for that suggestion - I am working with my sas admins to get this set up. The only last challenge we foresee is that when when SAS EG loses the connection to the SAS Compute server (say we are waiting for long running query or get pulled away) and then is automatically restored, the libref to tmpNday disk gets reset to the default so you will have to rerun the code to remap them. The loss of connection isn't that obvious, it only appears in a warning box alert at the top of our eg session but itsn't that noticeable, so if you walk away while a process is running and don't realize your connection was lost during that time, we are now writing to the default tmpnday. Any suggestions or ideas would be appreciated.
If reliability is an issue for you running long jobs via EG, then consider switching to server-based job scheduling - it's way more reliable. We don't use EG for any regular long-running jobs, we switch to server-based scheduling using SAS Management Console, a point-and-click interface for (amongst other things) scheduling and running regular jobs in batch mode. You will have to export your EG projects to SAS programs to do this however. Your SAS admins should know all about this.
@beamer108 wrote:
I mean to say that when the flow runs it creates tmp tables that are referenced in other steps. The problem is that the temporary table names are not unique so if I run a process flow its creating tmp7day.000_ABC, if my teammate runs a diff process flow that also creates tmp7day.000_ABC theirs will overwrite mine. I did a search in the process flow to look for libname and I don't see where there is a libname for tmp tables.
So it looks like everyone is defining TMP7DAY to point to the same location.
If you point it to a different location for each user you can reduce the risk of conflict to just conflicts with yourself. (Which is NOT a trivial risk for normal SAS usage. I frequently run many many many SAS jobs at the same time, but for something being run interactively with Enterprise Guide the risk is probably smaller.)
Thank you to everyone for the suggestions and support. We have a process in place and are testing it now.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Check out this tutorial series to learn how to build your own steps in SAS Studio.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.