pofu
Calcite | Level 5

Hello. I want to build a dataset by appending the outputs of many PROC FREQ runs, one for each variable crossed with the outcome.

The number of iterations is about 50,000, and the job takes half a day.

Could you tell me a faster way to do this calculation?

I appreciate your kind help.

The program is as follows:

//////////////////////////////////////////////////////

%macro outtest(outcome=);

*;

%let dsid=%sysfunc(open(a1,i));

%let vnum=%sysfunc(attrn(&dsid,nvars));

%let rc=%sysfunc(close(&dsid));

*;

*;

%do no=1 %to &vnum;

proc freq data=a1 noprint;

   output out=nout&no trend exact;

   tables &outcome*numgeno&no / exact trend;

run;

%end;

*;

data n1;

  set nout1;

run;

*;

%do no=2 %to &vnum;

proc append base=n1 data=nout&no force;

run;

%end;

*;

%mend;

*;

%outtest(outcome=COL1D);

//////////////////////////////////////////////////////

Best regards,

Tom
Super User

I suspect that it is the EXACT statistics request that is taking a long time. You can look at your SAS log and see how long each step takes to confirm.
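One way to check this in the log is sketched below. It assumes the dataset and variable names from the question (a1, COL1D, numgeno1); the output dataset names t_exact and t_trend are hypothetical. The FULLSTIMER system option makes SAS write real time, CPU time, and memory use for each step to the log, so timing one table with and without the exact request shows how much of the cost comes from EXACT.

```sas
/* Write detailed timing (real time, CPU time, memory) for each step to the log */
options fullstimer;

/* One table with the exact request, as in the original macro.
   a1, COL1D and numgeno1 are the names used in the question. */
proc freq data=a1 noprint;
   tables COL1D*numgeno1 / exact trend;
   output out=t_exact trend exact;   /* t_exact is a hypothetical name */
run;

/* The same table with the trend test only, for comparison */
proc freq data=a1 noprint;
   tables COL1D*numgeno1 / trend;
   output out=t_trend trend;         /* t_trend is a hypothetical name */
run;

/* Return to the default, shorter timing notes */
options nofullstimer;
```

Comparing the two "real time" notes in the log gives a rough per-variable cost, which can be multiplied by the number of variables to estimate the full run.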

Not sure why you are using a macro and looping through the variables. 

Why not just tell SAS to do them all at once?  Your macro assumes that you have variables named NUMGENO1 to NUMGENO&vnum, where &vnum is the number of variables in the dataset. (Not sure how that is possible if you also have a variable named COL1D in the dataset.)

proc freq data=a1 noprint;

   output out=n1 trend exact;

   tables &outcome*(numgeno1 - numgeno&vnum) / exact trend;

run;

pofu
Calcite | Level 5

Dear Tom;

Thank you for your kind response!

I ran your code and now understand your advice.

Sincerely,

data_null__
Jade | Level 19

You have 50,000 variables?

I don't doubt your program takes a very long time to run, partly due to the statistics you request, but you should still be able to do it.  Macro looping over 50,000 variables is a recipe for disaster. 🙂

You don't need to call PROC FREQ over and over.  It will work for many (all) variables from a data set in one step and, in this case, produce two output files: one for Exact and another for Trend.  No need for macro loops or PROC APPEND.   I have never tried 50,000 variables, so you may run into some problem there.  If that happens, you could try using a few lists

NUMGENO1-NUMGENO10000

NUMGENO10001-NUMGENO20000

...

At least you are not calling FREQ/APPEND 50,000 times.

Use the example below to modify your program to work without macro loops that you don't need.

ods listing close;

ods output FishersExact=FishersExact TrendTest=TrendTest;

proc freq data=sashelp.class;

   tables sex*(_all_) / exact trend;

   run;

ods output close;

ods listing;

proc print data=FishersExact;

proc print data=TrendTest;

   run;
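Applied to the data in the question, the same pattern might look like the sketch below. It assumes the names from the original post (a1, COL1D, numgeno1 and up) and uses the block boundaries suggested above; the numbered output dataset names are hypothetical. ODS OUTPUT is used rather than the OUTPUT statement because, as far as the PROC FREQ documentation describes it, the OUTPUT statement only keeps statistics for the last table request, whereas the ODS tables carry one set of rows per table.

```sas
ods listing close;   /* suppress printed output, as in the example above */

/* First block of variables. ODS OUTPUT applies to the next step only,
   so it is repeated before each PROC FREQ. */
ods output FishersExact=FishersExact1 TrendTest=TrendTest1;
proc freq data=a1;
   tables COL1D*(numgeno1-numgeno10000) / exact trend;
run;

/* Second block of variables */
ods output FishersExact=FishersExact2 TrendTest=TrendTest2;
proc freq data=a1;
   tables COL1D*(numgeno10001-numgeno20000) / exact trend;
run;

/* Repeat the pattern for any remaining blocks */

ods listing;

/* Stack the per-block results into one dataset per statistic */
data FishersExact;
   set FishersExact1 FishersExact2;
run;

data TrendTest;
   set TrendTest1 TrendTest2;
run;
```

Each ODS output dataset should contain a Table variable naming the table request (for example, the outcome crossed with one numgeno variable), which plays the role of the per-variable datasets that the macro appended one by one.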

pofu
Calcite | Level 5

Dear data_null_;

Thank you for your kind response!

I ran your example and now understand your advice.

Sincerely,

