Help using Base SAS procedures

Do loop to run through all combinatins of 2 variables in proc step

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

Do loop to run through all combinatins of 2 variables in proc step


I am trying to use the do/until loop to work through every combination of multiple comparisons for a data set of categorical variables with 2+ categories.

I was thinking I would use a do loop to work through them, but am not sure if I am constructing this correctly given for this to work a) they need to be nested so that K (sexuality0 runs through each number first, followed by L (Ques) working through until it reaches the final option. b) I am wondering if there is a way to avoid repeating combinations (so with removal) as I want the  bonferoni correction at the end. I spend a very long time looking for a way to efficiently do multiple comparisons for post hoc chi-square analaysis of categories and came up with nothing---so would super appreciate the help!

 

Caveat: never used a do/until loop so not sure if this can be run outside of a data step. Eventually I plan to make this into a macro, but want to make sure the logic is correct first.

 

Data looks like this with Ques having categories 0-4 and Sexuality from 0-3:
Ques Sexuality
0 1
0 3
1 2
4 1 etc.

 

SAS code:

K = 0; L= 0;
do until(K = 3);
do until(L = 4);
ods output chisq(persist)=chisq(where=(statistic="Chi-Square")rename=(Prob=Raw_P));
proc freq data=Ivf;
where Ques in (K, L);
tables Ques*sexuality / chisq cellchi2 nopercent;
K+ +1;end;
L+ +1;end;
run;
ods output clear;
proc print noobs;
var value raw_p;
run;
ods output clear;
proc print noobs;
var value raw_p;
run;
proc multtest pdata=chisq bon;
run;

 


Accepted Solutions
Solution
‎05-05-2017 09:03 AM
Super User
Super User
Posts: 6,495

Re: Do loop to run through all combinatins of 2 variables in proc step

[ Edited ]

You still haven't completely converted your code to take advantage of the macro.

For example you have created macro variables K and L but your WHERE statement is not references them. Instead it is trying to reference data set varaibles K and L. To reference the value of a macro varaible you need to use & macro trigger.

where &factor1. in (&k, &L);

There is no need to nest macro definitions. If you want to nest them then just call another macro from within the body of a macro. So if you pull the definition of %SUBCATEGORY() out of the definition of %FACT() you will see that it actually does nothing since it never calls any SAS code or macro code.  You really don't need two macros here unless you add more logic to the outer one.

 

%macro fact
(factor1   /* variable name for the question */
,factor2   /* variable name for the group */
,n         /* Number of categories in FACTOR1 */
,dsn = IVF_2 /* Name of dataset */
);
%local K L ;
* Make sure that Aggregate table does not exist ;
proc delete data=all_chisq; 
run;

%do k = 0 %to &n;
  %do l = &k %to &n;

ods output chisq(persist)=chisq(where=(statistic="Chi-Square")rename=(Prob=Raw_P));

proc freq data=&dsn;
  where &factor1 in (&k &l);
  tables &factor1*&factor2 / chisq cellchi2 nopercent;
run;

* Aggregate individual tables ;
proc append base=all_chisq data=chisq force;
run;

  %end;
%end;

%mend fact;


%fact (Q4,sexuality, 9);

 

View solution in original post


All Replies
Super User
Posts: 6,927

Re: Do loop to run through all combinatins of 2 variables in proc step

Your logic makes no sense, as you will accept two values for ques in every iteration, but never select for sexuality anyway. And if you only want one value for ques and sexuality (respectively) in the where, the cross-tabulation will make no sense.

 

Do is a data step statement only. But before you repeat code with a macro (%do), try by-group processing first.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 5

Re: Do loop to run through all combinatins of 2 variables in proc step

You are right! I altered the overall logic when I tried to redo the macro I had been working through:

 

%macro subcategory(sub1, sub2);
K = 0;
do until(K = &sub2.);
L = K+1;
do until(K = &sub2. );
ods output chisq(persist)=chisq(where=(statistic="Chi-Square")rename=(Prob=Raw_P));
proc freq data=Ivf;
         where &factor1. in (K, L);
         tables &factor1.*&factor2. / chisq cellchi2 nopercent;
         run;
%mend subcategory;

Here it is clearer as what I am actually trying to accomplish is to have every combination 0-x (x being the number of subcategories in a factor (Ques) placed within those parenthises. So (0,1) (0,2) (0,3) etc. but am struggling with how to capture ones like (2,3). I feel like this is a matter of how to order or maybe nest the do loop so that it runs through K (0-9) and then indexing 1 while also having L index 1 each time.  I might be totally off, & if so I would really appreciate some thoughts about a better way!

Super User
Posts: 6,927

Re: Do loop to run through all combinatins of 2 variables in proc step

Wrap your code in two macro do loops:

%do k = 0 %to &sub2.;
  %do l = &k. %to &sub2.;
/* your code */
  %end;
%end;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 5

Re: Do loop to run through all combinatins of 2 variables in proc step

Thank you! That is very helpful. My only remaining question is where to place these %do loops.

 

I realize it needs to be inside of my macro, but when I do this it appears to conflict with the order of calling the macros. I nest it this way because I then want to do the opposite (look within factor2) within the same %macro fact.  Do you happen to know if the %do macros need placed elsewhere for the other outer macros to work? I've done nested macros similar to %macro fact & %macro subcategory before so I am struggling to figure out why it won't run. Thank you so much!!

 

%macro fact(factor1, factor2);

%macro subcategory(sub2);
%do k = 0 %to &sub2.;
  %do l = &k. %to &sub2.;

ods output chisq(persist)=chisq(where=(statistic="Chi-Square")rename=(Prob=Raw_P));
proc freq data=Ivf_2;
         where &factor1. in (k, l);
         tables &factor1.*&factor2. / chisq cellchi2 nopercent;
         run;
  %end;
%end;
%mend subcategory;

%mend fact;

%subcategory (9);/*(lowest number, highest number for subcategory within a question)*/

%fact (Q4,sexuality); /*factor1 is the variable name for the question; factor2 is variable name for group i.e country*/
Solution
‎05-05-2017 09:03 AM
Super User
Super User
Posts: 6,495

Re: Do loop to run through all combinatins of 2 variables in proc step

[ Edited ]

You still haven't completely converted your code to take advantage of the macro.

For example you have created macro variables K and L but your WHERE statement is not references them. Instead it is trying to reference data set varaibles K and L. To reference the value of a macro varaible you need to use & macro trigger.

where &factor1. in (&k, &L);

There is no need to nest macro definitions. If you want to nest them then just call another macro from within the body of a macro. So if you pull the definition of %SUBCATEGORY() out of the definition of %FACT() you will see that it actually does nothing since it never calls any SAS code or macro code.  You really don't need two macros here unless you add more logic to the outer one.

 

%macro fact
(factor1   /* variable name for the question */
,factor2   /* variable name for the group */
,n         /* Number of categories in FACTOR1 */
,dsn = IVF_2 /* Name of dataset */
);
%local K L ;
* Make sure that Aggregate table does not exist ;
proc delete data=all_chisq; 
run;

%do k = 0 %to &n;
  %do l = &k %to &n;

ods output chisq(persist)=chisq(where=(statistic="Chi-Square")rename=(Prob=Raw_P));

proc freq data=&dsn;
  where &factor1 in (&k &l);
  tables &factor1*&factor2 / chisq cellchi2 nopercent;
run;

* Aggregate individual tables ;
proc append base=all_chisq data=chisq force;
run;

  %end;
%end;

%mend fact;


%fact (Q4,sexuality, 9);

 

Super User
Posts: 6,927

Re: Do loop to run through all combinatins of 2 variables in proc step

Macro fact is actually an empty macro. Nesting macros is not feasible in SAS, as all macros are defined globally. Nesting them in text only causes confusion.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 5

Re: Do loop to run through all combinatins of 2 variables in proc step

Oh...that's great to know! I think I've been thinking in terms of nested macros for awhile now & probably only just now came across a time when that is an issue. My frustration probably still remains with the %do loop as I've yet to create one that works ---& I feel as though it'd be quite powerful 

Occasional Contributor
Posts: 5

Re: Do loop to run through all combinatins of 2 variables in proc step

Again, this is SOOO helpful. Especially clarifying the process. It still has errors, but they pertain to "k", and I cannot figure out why that is. You even defined k as being local to the macro & yet this is the error. If you can think of why it seems to not be accepting the %do portion than that would be awesome! But, you've already been a tremendous help Smiley Happy

31        proc freq data=&dsn;   where &factor1 in (k, l);
                                                    -
                                                    22
                                                    76
ERROR: Syntax error while parsing WHERE clause.

ERROR 22-322: Syntax error, expecting one of the following: a quoted string, a numeric constant,
              a datetime constant, a missing value, -.
Super User
Super User
Posts: 6,495

Re: Do loop to run through all combinatins of 2 variables in proc step

If you want to use the value of a macro variable to generate SAS code then you need to expand the value by using &.

 where &factor1 in (&k, &l);
Valued Guide
Posts: 797

Re: Do loop to run through all combinatins of 2 variables in proc step

[ Edited ]

If your original data set is big enough to make repeated proc freq's expensive, then do a proc freq against it justg once, while creating the data set ALL_FREQS per below.  It will have one record per every combination of factor1*factor2 plus the new variable COUNT. 

 

Then loop  proc freq through that data set, weighting by count, to get your chi-square values.  I.e. something like:

 

 

proc freq data=&dsn  noprint ;
  where &factor1 in (&k &l);   /* editted correction - should have removed this line */
  tables &factor1*&factor2 / out=all_freqs ;
run;

 

%do k = 0 %to &n;
  %do l = &k %to &n;

ods output chisq(persist)=chisq(where=(statistic="Chi-Square")rename=(Prob=Raw_P));

proc freq data=all_freqs  ;
  where &factor1 in (&k , &L);
  tables &factor1*&factor2 / chisq cellchi2 nopercent;

  weight COUNT;
run;
* Aggregate individual tables ;
proc append base=all_chisq data=chisq force;
run;

  %end;
%end;

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 275 views
  • 2 likes
  • 4 in conversation