How to code for many regression models with similar variables

Dear SAS Community,

I'm trying to calculate logistic regression odds ratios (associations) between 20 different chemicals and 9 different cancer sites in my dataset.

I would like to use macro language or proc sql to efficiently code for the 180 logistic regression models.

My variables are person ID (ID), cancer sites A-I (SiteA-SiteI), and chemical exposures 1-20 (Exp1-Exp20). The cancer site variables are coded as dummy variables (1 = case, 0 = non-case). Here is an idea of what my data looks like.

The ellipsis (...) indicates that I'm showing only 1 of the 9 cancer site variables, only 1 of the 20 chemicals, and only 5 subjects from the whole study.

ID   SiteA   ...   Exp1   ...
1    1             10.5
2    0             0.5
3    0             6
4    0             1.3
5    1             12.2
...

My basic model looks like this:

proc logistic data = analysis;
  model &site (event="1") = &exp; *Odds ratios for chemicals as continuous variables;
run;

Could someone please suggest how I can write concise code for all these tests using the same underlying model? Thank you.


Accepted Solutions
Solution
‎07-31-2013 09:03 PM
Super Contributor
Posts: 297

Re: How to code for many regression models with repeating variables

Hi TJ,

I haven't tested the code below, since you haven't provided any data, but hopefully it makes sense. If you put each of your scenarios in a dataset, with one row per model (which I assume you can do on your own), CALL EXECUTE lets you use the values of that dataset's variables to build the statements you want to run.

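DATA _NULL_;
  /* SAMPLES: one row per model; SITE and EXP hold the variable names */
  /* END= is a SET statement option, so it goes outside the parentheses */
  SET SAMPLES END=EOF;
  /* Build one PROC LOGISTIC step per row; COMPBL collapses the padding blanks */
  STR = COMPBL("PROC LOGISTIC DATA = HAVE; MODEL " || SITE || " (EVENT='1') = " || EXP || ";");
  CALL EXECUTE(STR);
  /* After the last row, queue a RUN; to close the final step */
  IF EOF THEN DO;
    STR = 'RUN;';
    CALL EXECUTE(STR);
  END;
RUN;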
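
In case it helps, here is a minimal, untested sketch of one way to build the SAMPLES dataset that the step above reads. It assumes the outcome variables are named SiteA through SiteI and the exposure variables Exp1 through Exp20, as the layout in the question suggests; adjust the names if yours differ.

```sas
/* Sketch: one row per site-by-exposure pair (9 x 20 = 180 rows).    */
/* Assumption: outcomes are SiteA-SiteI and exposures are Exp1-Exp20. */
data samples;
    length site $32 exp $32;
    do s = 1 to 9;
        /* pick the s-th letter of A..I and prefix it with 'Site' */
        site = cats('Site', substr('ABCDEFGHI', s, 1));
        do e = 1 to 20;
            exp = cats('Exp', e);   /* CATS converts the number to text */
            output;
        end;
    end;
    drop s e;
run;
```

Each row of SAMPLES then produces one PROC LOGISTIC step when it is passed through CALL EXECUTE.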

The following is a simple example that works. It reads the dictionary view SASHELP.VCOLUMN to list every column in SASHELP.CLASS and runs a PROC SORT by each of those variables in turn.

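DATA _NULL_;
  /* One row per column of SASHELP.CLASS */
  SET SASHELP.VCOLUMN (WHERE =(LIBNAME = "SASHELP" AND MEMNAME = "CLASS")) END=EOF;
  /* Each generated sort writes to the same SORTCLASS, so only the last one survives */
  STR = COMPBL("PROC SORT DATA = SASHELP.CLASS OUT = SORTCLASS ; BY " || NAME || ";");
  CALL EXECUTE(STR);
  /* After the last row, queue a RUN; to close the final step */
  IF EOF THEN DO;
    STR = 'RUN;';
    CALL EXECUTE(STR);
  END;
RUN;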



All Replies
Contributor
Posts: 44

Re: How to code for many regression models with repeating variables

Posted in reply to Scott_Mitchell

Thank you very much for your suggested code. I found this link to be especially helpful for testing multiple independent variables. The authors also run the 'call execute' subroutine that you recommend. 

SAS Super FREQ
Posts: 3,755

Re: How to code for many regression models with similar variables

Perhaps I'm missing something, but if you just want 9 different independent analyses, transform the data into a long data set with SITE=A-I and use a BY statement with SITE as the BY variable.
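
A rough, untested sketch of that long-data approach follows. It assumes the dataset is called ANALYSIS and holds ID, SiteA-SiteI, and Exp1-Exp20, as described in the question. The names SITE, CHEM, IS_CASE, DOSE, LONG, and OR_ALL are made up for illustration. Stacking the exposures as well as the sites lets a single PROC LOGISTIC call cover every site-by-chemical pair.

```sas
/* Sketch only: assumes ANALYSIS holds ID, SiteA-SiteI (0/1) and Exp1-Exp20. */
/* SITE, CHEM, IS_CASE and DOSE are illustrative names, not from the thread. */
data long;
    set analysis;
    array sites{9} SiteA SiteB SiteC SiteD SiteE SiteF SiteG SiteH SiteI;
    array exps{20} Exp1-Exp20;
    length site $1 chem $32;
    do s = 1 to dim(sites);
        site    = substr(vname(sites{s}), 5, 1);  /* site letter taken from the variable name */
        is_case = sites{s};
        do e = 1 to dim(exps);
            chem = vname(exps{e});                /* exposure variable name */
            dose = exps{e};
            output;                               /* one row per person, site and chemical */
        end;
    end;
    keep id site chem is_case dose;
run;

/* BY processing needs the data sorted by the BY variables */
proc sort data=long;
    by site chem;
run;

/* Collect the odds ratio estimates from every BY group in one dataset */
ods output OddsRatios=or_all;
proc logistic data=long;
    by site chem;
    model is_case(event='1') = dose;
run;
```

The long dataset has 180 rows per subject, so this trades disk space for much shorter code, and OR_ALL ends up with one odds ratio row per site and chemical.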

If you want to specify a single model that incorporates the sites, you can specify SITE as a CLASS variable and include interactions between the variables and the sites.

Lastly, if SITE is a random effect rather than a fixed effect, you might want to look at using PROC GLIMMIX instead of PROC LOGISTIC. Many examples of using PROC GLIMMIX have been posted in this community.
