Help using Base SAS procedures

Iteration by PROC APPEND takes a long time.

New Contributor
Posts: 3

Hello. I want to build a single dataset by appending the PROC FREQ output for each variable crossed with the outcome.

There are only about 50,000 iterations, yet the job takes half a day.

Could you suggest a faster way to do this calculation?

I appreciate your help.

The program is as follows:

//////////////////////////////////////////////////////

%macro outtest(outcome=);

*;

%let dsid=%sysfunc(open(a1,i));

%let vnum=%sysfunc(attrn(&dsid,nvars));

%let rc=%sysfunc(close(&dsid));

*;

*;

%do no=1 %to &vnum;

proc freq data=a1 noprint;

   output out=nout&no trend exact;

   tables &outcome*numgeno&no / exact trend;

run;

%end;

*;

data n1;

  set nout1;

run;

*;

%do no=2 %to &vnum;

proc append base=n1 data=nout&no force;

run;

%end;

*;

%mend;

*;

%outtest(outcome=COL1D);

//////////////////////////////////////////////////////

Best regards,

Super User
Posts: 7,060

I suspect that it is the EXACT statistics request that is taking a long time. You can look at your SAS log and see how long each step takes to confirm.
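To confirm where the time goes, one option is to time a single iteration in isolation. The sketch below assumes the dataset a1, the outcome COL1D and the variable numgeno1 from the question; nout1 and timing_base are hypothetical scratch names. The FULLSTIMER system option makes the log report more detailed resource usage for each step.

```sas
/* Sketch only: a1, COL1D and numgeno1 are taken from the question;      */
/* nout1 and timing_base are hypothetical scratch data sets.             */
options fullstimer;   /* log detailed real/CPU time for every step */

/* One iteration of the statistics step */
proc freq data=a1 noprint;
   tables COL1D*numgeno1 / exact trend;
   output out=nout1 trend exact;
run;

/* One iteration of the append step (PROC APPEND creates the base if needed) */
proc append base=timing_base data=nout1 force;
run;
```

Comparing the "real time" notes that follow each step in the log should show whether PROC FREQ or PROC APPEND dominates.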

Not sure why you are using a macro and looping through the variables. 

Why not just tell SAS to do them all at once?  Your macro assumes that you have variables named NUMGENO1 to NUMGENO&vnum, where &vnum is the number of variables in the dataset. (Not sure how that is possible if the dataset also contains a variable named COL1D.)

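proc freq data=a1 noprint;

   /* Caution: with several table requests in one TABLES statement,  */

   /* OUTPUT OUT= holds statistics for the last request only.        */

   output out=n1 trend exact;

   tables &outcome*(numgeno1 - numgeno&vnum) / exact trend;

run;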

New Contributor
Posts: 3

Dear Tom,

Thank you for your kind response!

I understood your advice after running the code.

Sincerely,

Respected Advisor
Posts: 3,799

You have 50,000 variables?

I don't doubt your program takes a very long time to run, partly because of the statistics you request, but you should still be able to do it.  Macro looping over 50,000 variables is a recipe for disaster. :-)

You don't need to call PROC FREQ over and over.  It can process many (or all) variables from a data set in one step and, in this case, produce two output data sets: one for the exact test and one for the trend test.  No need for macro loops or PROC APPEND.  I have never tried 50,000 variables, so you may run into problems there.  If that happens, you could try using a few lists:

NUMGENO1-NUMGENO10000

NUMGENO10001-NUMGENO20000

...

At least you are not calling FREQ/APPEND 50,000 times.

Use the example below to modify your program so that it works without the macro loops you don't need.

ods listing close;

ods output FishersExact=FishersExact TrendTest=TrendTest;

proc freq data=sashelp.class;

   tables sex*(_all_) / exact trend;

   run;

ods output close;

ods listing;

proc print data=FishersExact;

proc print data=TrendTest;

   run;
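Adapted to the data in the question, the same pattern with the variable list split into two ranges might look like the sketch below. It assumes the dataset a1, the outcome COL1D and the NUMGENO ranges mentioned above; the output names exact_a, exact_b, trend_a, trend_b and exact_all are hypothetical, and Table is the identifying column I would expect in the ODS output data sets. NOPRINT is deliberately left off, because it suppresses the ODS tables that ODS OUTPUT needs.

```sas
/* Sketch only: a1, COL1D and the NUMGENO ranges are assumed from the thread; */
/* exact_a, exact_b, trend_a, trend_b and exact_all are hypothetical names.   */
%let outcome = COL1D;

ods listing close;   /* hide printed tables, as in the example above */

/* First range of variables */
ods output FishersExact=exact_a TrendTest=trend_a;
proc freq data=a1;
   tables &outcome*(numgeno1-numgeno10000) / exact trend;
run;

/* Second range of variables */
ods output FishersExact=exact_b TrendTest=trend_b;
proc freq data=a1;
   tables &outcome*(numgeno10001-numgeno20000) / exact trend;
run;

ods listing;

/* Stack the pieces in one DATA step instead of repeated PROC APPEND calls. */
/* The LENGTH statement guards against truncating longer table labels       */
/* (assumes the identifying column is a character variable named Table).    */
data exact_all;
   length Table $ 64;
   set exact_a exact_b;
run;
```

The trend results can be stacked the same way from trend_a and trend_b.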

New Contributor
Posts: 3

Posted in reply to data_null__

Dear data_null__,

Thank you for your kind response!

I understood your advice after running the code.

Sincerely,
