Vince28_Statcan
Quartz | Level 8

Hi,

Because my work forces me to run jobs lasting a few hours on my own tower rather than remotely, I have gotten into the habit of checking the SAS session's temporary files folder and watching the utility files grow to get an idea of how fast a job is progressing.

I am working on a project to push my coworkers into learning and using hash tables and other more recent SAS tools (anything from SAS v8 onward has been ignored by the masses, and it takes fresh blood like me, a January hire, to tell people what's worth looking into). After running my first test on a SAS 9.2 32-bit tower with 4 GB of RAM and again on a SAS 9.4 64-bit test tower with 16 GB, I had to scale my test data up significantly before putting the example results both in a presentation and on a shared drive for people to test for themselves.

Anyway, to the point: here's the data that I build to compare a routine inner merge, which gets significantly sped up by using hash objects instead of proc sql.

Here's the code

data large;
     array vars{101} var10-var110 (101*0); /* Added to increase the number of pages needed to process the data on the 64-bit tower */
     do id=1 to 30000000;
          var1 = mod(id, 6);
          var2 = mod(id, 7);
          output;
     end;
run;

proc surveyselect data=large out=small noprint
          sampsize=5000000
          method=SRS;
run;

data small;
     set small;
     var3=mod(id, 8);
     var4=mod(id, 9);
     var5=mod(id, 10);
     var6=mod(id, 11);
     var7=mod(id, 12);
     var8=mod(id, 13);
     var9=mod(id, 14);
run;

/* This sort only prevents proc sql from exploiting the fact that both data sets
   are already sorted by id. Otherwise a fair comparison of the 2 methods would
   have to account for the time of the proc sort or order by that a real job needs. */
proc sort data=small;
     by var8 var6 var3 var9 id;
run;
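The comparison itself is not shown in the post. For readers following along, a minimal sketch of what the two inner merges on `id` might look like is given here. The output names `merged_sql` and `merged_hash` and the choice of columns to carry over are assumptions for illustration, not the author's actual test code.

```sas
/* Hypothetical comparison code; not part of the original post. */

/* Method 1: inner join with proc sql */
proc sql noprint;
     create table merged_sql as
     select l.*, s.var3, s.var4, s.var5, s.var6, s.var7, s.var8, s.var9
     from large as l inner join small as s
          on l.id = s.id;
quit;

/* Method 2: load the smaller table into a hash object, then read large once */
data merged_hash;
     /* Never executes; only defines id and var3-var9 in the PDV at compile time */
     if 0 then set small(keep=id var3-var9);
     if _n_ = 1 then do;
          declare hash h(dataset: "small(keep=id var3-var9)");
          h.defineKey("id");
          h.defineData("var3", "var4", "var5", "var6", "var7", "var8", "var9");
          h.defineDone();
     end;
     set large;
     /* find() returns 0 when the id exists in small, so only matches are kept */
     if h.find() = 0;
run;
```

The hash approach needs no sort on either side, but the whole lookup table has to fit in memory, which is worth checking on the 32-bit machine before sharing the example.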

I want to estimate the expected time to complete, because the program has to be runnable overnight for 32-bit OS users. Otherwise I'll have to create an OS/SAS-version-dependent program, and if it comes to that, I'd rather not wait for the full process to complete first.

My question is in 2 parts

1. Is there anything significantly more efficient than proc surveyselect (with easily readable code, since this will be shared) for obtaining a simple random sample (SRS)?

2. How can I track the progress of proc surveyselect (i.e., get a rough estimate of how much longer the proc should take), as I usually do with utility files for proc sort?

Thanks

Vince

1 REPLY 1
SteveDenham
Jade | Level 19

Most of the stat procs aren't set up for monitoring ongoing progress.  If you can come up with a way, I want to be able to port it to the mixed model procedures, as we have several that run for hours or even days in a 64-bit Windows environment.

Steve Denham
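One possible workaround for both questions, sketched here under stated assumptions, is to draw the sample in a DATA step instead of proc surveyselect. The step below uses sequential selection (each record is kept with probability "records still needed / records still unread"), which yields exactly 5,000,000 rows in a single pass, and it writes a progress note to the log every million records. The seed, the reporting interval, the output name `small_ds`, and the helper variable names are all hypothetical, and no claim is made that this is faster than proc surveyselect; it has not been benchmarked here.

```sas
/* Hypothetical alternative; not from the original thread. */
data small_ds;
     retain _need 5000000 _start;          /* rows still to select; start time */
     drop _need _start _left _elapsed _remaining;
     if _n_ = 1 then do;
          call streaminit(12345);          /* assumed seed, for reproducibility */
          _start = datetime();
     end;
     set large nobs=_total;
     _left = _total - _n_ + 1;             /* unread rows, including this one */
     if rand("uniform") < _need / _left then do;
          output;
          _need = _need - 1;
     end;
     /* Progress report with a linear extrapolation of the time remaining */
     if mod(_n_, 1000000) = 0 then do;
          _elapsed = datetime() - _start;
          _remaining = _elapsed * (_total - _n_) / _n_;
          putlog "NOTE: " _n_ comma12. " of " _total comma12.
                 " rows read, roughly " _remaining 8.0 " seconds left";
     end;
run;
```

The estimate assumes a constant read rate, so it is only a guesstimate. In batch mode the log file may be buffered, so the notes might not appear immediately. The same putlog pattern does not carry over to the mixed model procedures, since it depends on the work being done inside a DATA step.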
