Problem:
I have a very long-running process. Because server load is inconsistent, it is difficult to determine when the process will complete based on wall time alone. The process runs in batch mode on a Unix server.
Possible Solution:
Here is an example of the functionality I would like to replicate (the code below is Perl): put a note in the log each time a given number of records has been processed.
while (<INFILE>) {
    $counter++;
    # Print a status message every 500,000 records
    if ($counter % 500000 == 0) {
        print "Working... $counter \n";
    }
    #Do Stuff;
}
print "All Done... $counter \n";
The above reads a file and prints a status message every 500,000 records, so the user can see how far the script has progressed while it is still working.
This is what I want to replicate in SAS.
Should I add -logparm "write=immediate" to my config file? Does this actually print immediately while a data step is still processing?
I am not currently able to test, but I know that in my current environment the following will not work:
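data _null_;
    do i=1 to 10000000;
        /* stand-in for real work: draw a random number */
        x = ranuni(1234);
        /* write a progress note every 50,000 iterations */
        if mod(i,50000)=0 then putlog i=;
    end;
run;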
If the write=immediate option doesn't resolve the issue, my next thought is to call sysexec and cat a note onto the end of the log file (if I even can, in case SAS holds a lock on the file while using write=buffered).
Has anyone tried something like this?
Does anyone have other ideas for implementing a similar function?
I finally had a chance to go back and test this.
I was downloading a very large dataset (10 billion rows, 500GB) across a slow WAN connection from an Oracle database. I knew it would take many hours to complete (12, to be exact), and I wanted an additional way to make sure observations were getting written, other than logging into the file system and checking file sizes, network traffic, etc.
data das.segment_broadcast_fact;
    if not mod(_n_,500000000) then put _n_=;
    set oradb2.web_session_fact
        ( keep = dt
                 client_nbr
                 user_session_id
                 broadcast_seg_type_cd
          dbsliceparm = (ALL,10)
          where=(month(dt)=10)
        );
run;
With this code, a note is written to my log immediately at every 500 millionth iteration of the data step.
Have you tried write=immediate?
You might want to combine it with an altlog specification.
/Linus
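For reference, a minimal sketch of how the two options suggested above might look as entries in a SAS config file (they can also be given on the sas command line at invocation). The altlog path is a hypothetical placeholder, and whether write=immediate flushes notes while a data step is still running is exactly what would need to be confirmed in your environment.

```
/* Sketch only: the altlog path below is a placeholder, not a real location */
-logparm "write=immediate"
-altlog "/path/to/progress_copy.log"
```

With an altlog in place, the copy can be watched from another session while the batch job's primary log is left alone.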
You also might want to post your question over on SAS-L (http://www.listserv.uga.edu/cgi-bin/wa?A0=sas-l&D=1&H=0&O=D&T=1) as Michael Raithel looks there, but not here, and he is a whiz at coming up with such solutions. Just a thought.