anhl1206
Fluorite | Level 6

I am trying to create a macro to loop through all possible pairwise interactions in a regression analysis. I am using the Imbens and Rubin approach to estimating a propensity score. I have selected my main effects and now I want to test for possible pairwise interactions to retain in the regression model. I need to include a number of forced variables (currently written for 12) and then test one pairwise interaction at a time. 

 

All but one (momage) of my variables are dichotomous (dummy) variables, but I want the macro to handle either dichotomous or quantitative variables, i.e., iterate through all possible interactions, and I can ignore the ones I don't need.

 

In the attached code, I have tried to pass my main effects into an array, and then use a do loop to generate the interaction variables from the array. Those array variables should then pass into the macro and repeat until all pairwise interactions are tested. 

 

I get two errors with this code:

1) 

NOTE: The SAS System stopped processing this step because of errors.
NOTE: SAS set option OBS=0 and will continue to check statements.
This may cause NOTE: No observations in data set.
WARNING: The data set WORK.WNHEST may be incomplete. When this step was stopped there were 0
observations and 161 variables.

2) 

MLOGIC(PAIRWISEINT): Ending execution.
ERROR 117-185: There were 2 unclosed DO blocks.

 

I have read the information about using the BY statement with a transformed dataset (https://blogs.sas.com/content/iml/2017/02/13/run-1000-regressions.html) but I don't see how to get the interactions fed into the regressions this way. 

 

I thought about using %DO within the macro, but the order didn't make sense to me that way. I think it should be: array -> generate interaction terms -> terms go into macro input -> terms are used in the regression.

 

If there are any recommendations on additional reading I welcome those too. Thanks in advance!

SAS Software 9.2 (TS2M3)
Linux 2.6.18-402.el5PAE (LINUX) platform

 

data test;
	input msdp momage nomarnopn nomarpn marnopn momedu1 momedu3 momedu4 momedu5 bmi_un bmi_ov bmi_ob moborn;
	CARDS;
1 19 1 0 0 1 0 0 0 0 0 0 1
1 20 0 1 0 0 0 0 0 1 0 0 0
1 21 0 0 1 0 0 1 0 0 1 0 1
1 23 0 1 0 0 0 0 1 0 0 1 0
1 25 1 0 0 0 0 1 0 0 1 0 1
0 27 0 1 0 0 1 0 0 1 0 0 0
0 29 0 0 0 0 0 0 0 0 0 0 1
0 31 0 1 0 1 0 0 0 1 0 0 0
0 33 0 0 1 0 1 0 0 0 1 0 1
0 35 0 0 0 0 0 0 0 0 0 1 0
;
RUN;

 

OPTIONS mprint symbolgen mlogic spool;
%MACRO pairwiseint(data=, x1=, x2=, x3=, x4=, x5=, x6=, x7=, x8=, x9=, x10=, x11=, x12=, xi= , xj= , nummainefx=, outcome=, penter=, pkeep=);
proc logistic data=&data DESCENDING;
 class
   &x2 &x3 &x4 &x5 &x6 &x7 &x8 &x9 &x10 &x11 &x12
   / PARAM=REF;
 model &outcome(event='1') = &x1 &x2 &x3 &x4 &x5 &x6 &x7 &x8 &x9 &x10 &x11 &x12
  &xi*&xj
  /include=&nummainefx selection=stepwise slentry=&penter slstay=&pkeep
  details
  lackfit;
%MEND pairwiseint;

*set arrays for macro variables;
*keep xi the same;
*the number is the number of main effects in the model;
*then define main effect variables;
data wnhest;
        set wnhest;
array main {12} momage nomarnopn nomarpn marnopn momedu1 momedu3 momedu4 momedu5 bmi_un bmi_ov bmi_ob moborn;

DO i = 1 to 12 by 1;
  Do j = i to 12 by 1;
   xi=main[i];
   xj=main[j];
%pairwiseint(data=wnhest, x1=momage, x2=nomarnopn, x3=nomarpn, x4=marnopn, x5=momedu1, x6=momedu3, x7=momedu4, x8=momedu5, x9=bmi_un, x10=bmi_ov, x11=bmi_ob, x12=moborn, xi=xi, xj=xj, nummainefx=12, outcome=msdp, penter=.35, pkeep=.95);
  end;
end;
run;
Accepted Solution

Tom
Super User

Figure out what SAS code you want to generate. Then you can design your macro.

Looks like you want something like:

%MACRO pairwiseint
(data=
,vars=
,nummainefx=
,outcome=
,penter=
,pkeep=
);
%local i j;
%do i=1 %to %sysfunc(countw(&vars,%str( )));
%do j=%eval(&i+1) %to %sysfunc(countw(&vars,%str( )));
proc logistic data=&data DESCENDING;
 class &vars / PARAM=REF;
 model &outcome(event='1') = &vars %scan(&vars,&i,%str( ))*%scan(&vars,&j,%str( ))
  /include=&nummainefx selection=stepwise slentry=&penter slstay=&pkeep
  details
  lackfit;
run;
%end ;
%end;
%MEND pairwiseint;
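
For illustration, a call to the macro above might look like the sketch below. It assumes the macro has already been compiled and that the `test` data set from the question exists; the parameter values are simply the ones used in the original post. Note that this version puts every name in `vars=` on the CLASS statement, so the continuous `momage` would be treated as a classification variable unless the macro is adapted.

```sas
/* Show the generated PROC LOGISTIC steps in the log */
options mprint;

/* 12 main effects give 12*11/2 = 66 pairwise models, one per PROC step */
%pairwiseint(
   data=test
  ,vars=momage nomarnopn nomarpn marnopn momedu1 momedu3 momedu4 momedu5 bmi_un bmi_ov bmi_ob moborn
  ,nummainefx=12
  ,outcome=msdp
  ,penter=.35
  ,pkeep=.95
)
```

With only the 10 test records the models themselves are unlikely to be meaningful; the call is useful mainly for checking the generated code in the log.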


11 Replies

Tom
Super User

You cannot write a PROC statement in the middle of DATA step.

anhl1206
Fluorite | Level 6

Thanks very much for this suggestion! I will work with this for now. Thanks again for the help!

data_null__
Jade | Level 19

Can you show what your code is supposed to do?

I don't think you need all that macro stuff.

 

data test;
	input msdp momage nomarnopn nomarpn marnopn momedu1 momedu3 momedu4 momedu5 bmi_un bmi_ov bmi_ob moborn;
	CARDS;
1 19 1 0 0 1 0 0 0 0 0 0 1
1 20 0 1 0 0 0 0 0 1 0 0 0
1 21 0 0 1 0 0 1 0 0 1 0 1
1 23 0 1 0 0 0 0 1 0 0 1 0
1 25 1 0 0 0 0 1 0 0 1 0 1
0 27 0 1 0 0 1 0 0 1 0 0 0
0 29 0 0 0 0 0 0 0 0 0 0 1
0 31 0 1 0 1 0 0 0 1 0 0 0
0 33 0 0 1 0 1 0 0 0 1 0 1
0 35 0 0 0 0 0 0 0 0 0 1 0
;
RUN;
proc print;
   run;
%let x=momage nomarnopn nomarpn marnopn momedu1 momedu3 momedu4 momedu5 bmi_un bmi_ov bmi_ob moborn;
%let xat=%sysfunc(transtrn(%superq(x),%str( ),|));
%put NOTE: &=xat; 
proc logistic;
   *class &x / PARAM=REF;;
   model msdp = &xat@2 /stepwise;
   run;
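
A note on the bar notation above: `A|B|C@2` expands to `A B A*B C A*C B*C`, so the main effects are not listed first. That matters if you want INCLUDE=n to force exactly the main effects into a stepwise model, because INCLUDE= takes the first n effects in the MODEL statement. One possible workaround, sketched here with a hypothetical helper macro `%pairlist` (not from the thread), is to list the main effects first and generate the interaction terms separately. Be aware that this puts all interactions into a single stepwise run, which is different from testing one interaction per model.

```sas
/* Hypothetical helper: returns all distinct pairwise products as text */
%macro pairlist(vars);
  %local i j n out;
  %let n = %sysfunc(countw(&vars,%str( )));
  %do i = 1 %to %eval(&n - 1);
    %do j = %eval(&i + 1) %to &n;
      %let out = &out %scan(&vars,&i,%str( ))*%scan(&vars,&j,%str( ));
    %end;
  %end;
  &out
%mend pairlist;

%let x = momage nomarnopn nomarpn marnopn;
%put NOTE: %pairlist(&x);

/* Main effects first, then the generated interactions; INCLUDE=4 forces the 4 main effects */
proc logistic data=test;
   model msdp(event='1') = &x %pairlist(&x)
         / include=4 selection=stepwise;
run;
```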
anhl1206
Fluorite | Level 6

Thanks for this suggestion! 

 

I have tried the | for all pairwise interactions, but this doesn't let me use the INCLUDE option in the regression in the correct format. I currently have a macro working for what I need like this: 

%MACRO testint6(data=, x1=, x2=, x3=, x4=, x5=, x6=, x7=, x8=, x9=, x10=, x11=, x12=, x13=, x14=, x15=, x16=, x17=, nummainefx=, outcome=, penter=, pkeep=);

*4;
proc logistic data=&data DESCENDING;
        class
           &x2 &x3 &x4 &x5 &x6 &x7 &x8 &x9 &x10 &x11 &x12
                / PARAM=REF;
        model &outcome(event='1') = &x1 &x2 &x3 &x4 &x5 &x6 &x7 &x8 &x9 &x10 &x11 &x12 &x13 &x14 &x15 &x16 &x17
                &x1*&x4
        /include=&nummainefx selection=stepwise slentry=&penter slstay=&pkeep
        details
        lackfit;
run;
*5;
proc logistic data=&data DESCENDING;
        class
           &x2 &x3 &x4 &x5 &x6 &x7 &x8 &x9 &x10 &x11 &x12
                / PARAM=REF;
        model &outcome(event='1') = &x1 &x2 &x3 &x4 &x5 &x6 &x7 &x8 &x9 &x10 &x11 &x12 &x13 &x14 &x15 &x16 &x17
                &x1*&x5
        /include=&nummainefx selection=stepwise slentry=&penter slstay=&pkeep
        details
        lackfit;
run;
...

*67;
proc logistic data=&data DESCENDING;
class
&x2 &x3 &x4 &x5 &x6 &x7 &x8 &x9 &x10 &x11 &x12
/ PARAM=REF;
model &outcome(event='1') = &x1 &x2 &x3 &x4 &x5 &x6 &x7 &x8 &x9 &x10 &x11 &x12 &x13 &x14 &x15 &x16 &x17
&x11*&x12
/include=&nummainefx selection=stepwise slentry=&penter slstay=&pkeep
details
lackfit;
run;
%mend testint6;

%testint6(data=wnhest, x1=momage, x2=nomarnopn, x3=nomarpn, x4=marnopn, x5=momedu1, x6=momedu3, x7=momedu4,
x8=momedu5, x9=bmi_un, x10=bmi_ov, x11=bmi_ob, x12=moborn, x13=momage*momage, x14=nomarpn*momedu1, x15=momage*nomarnopn,
x16=momage*nomarpn, x17=momage*momedu4, nummainefx=17, outcome=msdp, penter=.35, pkeep=.95);

I always need the x1-x12 variables included, and then one model at a time test one interaction between x1-x12 vars. The existing macro requires a lot of copy/pasting to change the MODEL line every time an interaction is retained. 

I will look into feeding in the x1-x12 as a list instead of specifying in the macro every time. 
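
One way the list-based idea could look is sketched below. This is only an illustration under stated assumptions: the macro name `%testpairs` and the parameters `main=`, `classvars=`, and `retained=` are made up for this example, and the data set `wnhest` is assumed to exist as in the call above. The forced effects are the main effects plus any interactions already retained, and the INCLUDE= count is computed from that list instead of being typed in.

```sas
%macro testpairs(data=, main=, classvars=, retained=, outcome=, penter=, pkeep=);
  %local i j n forced nforced pair;
  %let n       = %sysfunc(countw(&main,%str( )));
  %let forced  = &main &retained;
  /* Blank is the only delimiter here, so momage*momage counts as one effect */
  %let nforced = %sysfunc(countw(&forced,%str( )));
  %do i = 1 %to %eval(&n - 1);
    %do j = %eval(&i + 1) %to &n;
      %let pair = %scan(&main,&i,%str( ))*%scan(&main,&j,%str( ));
      /* Skip a candidate that is already in the forced list (same spelling and order) */
      %if %sysfunc(findw(&forced,&pair,%str( ))) = 0 %then %do;
        title "Candidate interaction: &pair";
        proc logistic data=&data;
          %if %length(&classvars) %then %do;
          class &classvars / param=ref;
          %end;
          model &outcome(event='1') = &forced &pair
                / include=&nforced selection=stepwise
                  slentry=&penter slstay=&pkeep details lackfit;
        run;
      %end;
    %end;
  %end;
  title;
%mend testpairs;

%testpairs(data=wnhest,
           main=momage nomarnopn nomarpn marnopn momedu1 momedu3 momedu4 momedu5 bmi_un bmi_ov bmi_ob moborn,
           classvars=nomarnopn nomarpn marnopn momedu1 momedu3 momedu4 momedu5 bmi_un bmi_ov bmi_ob moborn,
           retained=momage*momage nomarpn*momedu1 momage*nomarnopn momage*nomarpn momage*momedu4,
           outcome=msdp, penter=.35, pkeep=.95)
```

When an interaction is retained, only the `retained=` list in the call needs to change. The explicit blank delimiter in COUNTW matters because `*` is one of its default delimiters, so without it each product term would be counted as two words.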

ballardw
Super User

Since you want the names of the variables

you do not want anything like this :

   xi=main[i]; xj=main[j];

as that gets the values of the variables and not the names.
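
If you do want to drive the calls from a DATA step, a rough sketch of one way is below: VNAME returns the name of an array element, and CALL EXECUTE queues a macro call to run after the DATA step has finished. The macro `%show_pair` is a hypothetical stand-in that only writes to the log; a real version would contain the PROC LOGISTIC step. Only four variables are used to keep the log short.

```sas
/* Hypothetical stand-in for the macro that would run PROC LOGISTIC */
%macro show_pair(xi=, xj=);
  %put NOTE: would fit a model with &xi*&xj;
%mend show_pair;

data _null_;
  /* No data set is read: the array is used only for its variable names */
  array main {*} momage nomarnopn nomarpn marnopn;
  length xi xj $32;
  do i = 1 to dim(main) - 1;
    do j = i + 1 to dim(main);
      xi = vname(main[i]);   /* the NAME of the variable, not its value */
      xj = vname(main[j]);
      /* %NRSTR delays the macro call until after this DATA step ends */
      call execute(cats('%nrstr(%show_pair)(xi=', xi, ', xj=', xj, ')'));
    end;
  end;
run;
```

Because there is no SET statement, the loops run once rather than once per record.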

 

Using a SET statement, you would be generating one call to your macro for each record in the data set, which would be terribly redundant.

 

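Check your loop bounds as well. With j starting at i, you pair every variable with itself (momage*momage and so on), and if j started at 1 you would get both nomarpn*momage and momage*nomarpn in the interactions, which other than order of appearance in the output will be awfully similar in result.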

 

Please go back to the drawing board, use something with only 3 or maybe 4 variables, and show what PROC LOGISTIC code you need to generate for that example.

 

I don't get what you were attempting to do with your Test data set because you did not use it anywhere.

 

You likely do not need anywhere near that many macro variables. I see a need for one macro variable to hold the names of the CLASS variables and another to hold the other variables in the model, rather than X1 to Xn for each single variable.

 

For your consideration:

%macro pairloop (ovars= , classvars=);
   /* keep the loop counters and work variables local to this macro */
   %local i j varlist var1 var2;
   %let varlist = &ovars. &classvars.;
   /* i stops one short of the end; j starts after i, so each pair appears once */
   %do i = 1 %to %eval(%sysfunc(countw(&varlist.))-1);
      %do j= %eval(&i.+1) %to %sysfunc(countw(&varlist.));
      %let var1= %scan(&varlist.,&i.);
      %let var2= %scan(&varlist.,&j.);
      %put var1 is &var1. , var2 is &var2.;
      /* HERE is where the proc logistic code would go using the variables*/
      %end;
   %end;
%mend;

%pairloop (ovars= msdp , classvars= momage nomarnopn nomarpn marnopn momedu1 momedu3 )

 

 

anhl1206
Fluorite | Level 6

Thanks for the explanations and the suggestions, I will go back to the drawing board.

PaigeMiller
Diamond | Level 26

From a statistical point of view, I am skeptical that this is a valid method of determining what terms should go in the model. I have never seen anyone try to fit each possible interaction, one at a time, with all of the main effects, into a sequence of models. Not only would it not be clear what to do with this information once you perform the analysis, I think you will run into a lot of the same problems that plague stepwise regression, and so I feel that there must be a better approach, but since I don't really understand what your goal is, I can't advise further.

 

Bottom line: just because you (or someone) can program this, doesn't mean you should program this.

--
Paige Miller
anhl1206
Fluorite | Level 6

Thanks for the comment, and in pretty much any other case I agree that this would be inappropriate. The goal of this analysis is to estimate a propensity score, which is then used in subsequent analysis with the subclassification approach (split into quantiles and analysis happens within quantiles, to achieve balanced comparisons). Following the Imbens and Rubin approach, the rationale is that the balancing of covariates is the goal of this step of the analysis. Any hypothesis testing occurs after the covariate balancing by the propensity score. 
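
As a rough sketch of the subclassification step described above (not the full Imbens and Rubin procedure), the estimated propensity score can be written out with an OUTPUT statement and then cut into quantile groups with PROC RANK. The data set `wnhest` and the main effects are taken from earlier in the thread; the names `ps`, `pscore`, `ps_strata`, and `ps_stratum`, and the choice of five groups, are assumptions made for this example.

```sas
/* Fit the propensity model and save the predicted probability of msdp=1 */
proc logistic data=wnhest;
   class nomarnopn nomarpn marnopn momedu1 momedu3 momedu4 momedu5
         bmi_un bmi_ov bmi_ob moborn / param=ref;
   model msdp(event='1') = momage nomarnopn nomarpn marnopn momedu1 momedu3
                           momedu4 momedu5 bmi_un bmi_ov bmi_ob moborn;
   output out=ps p=pscore;
run;

/* GROUPS=5 assigns quintile labels 0 through 4 based on the propensity score */
proc rank data=ps out=ps_strata groups=5;
   var pscore;
   ranks ps_stratum;
run;
```

Any retained interactions would be added to the MODEL statement before the scores are saved.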

PaigeMiller
Diamond | Level 26

Ok, so that is an application I am not familiar with. Does not the code provided by @Tom in message 3 solve the problem?

--
Paige Miller
anhl1206
Fluorite | Level 6

I was able to implement this and it worked! Thanks to all for the suggestions and to @Tom for the ultimate solution. They were helpful to me in learning how to use the macro language in my analyses!
