Automated way of repeating univariate logistic regression and putting individual results in a table

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Automated way of repeating univariate logistic regression and putting individual results in a table

[ Edited ]

Hello,

I am not a programmer nor a statistician. So I hope my question may be simple to solve.

I am using a macro for repeating univariate logistic regression in SAS (with the same outcome variable status and different predictor variables). (link for the macro: http://rstudio-pubs-static.s3.amazonaws.com/2989_ceae90d128554c728d5388439adf0661.html ) In this macro I also want to include commands that will help me create a single table showing variable name, odds ratio, lower and upper 95% confidence limits and the p value for all the variables. I am running a logistic regression to see if presence of fatty liver is associated with age, bmi, smoking(0-1) and alcohol use(0-1). I have many more predictor variables, which I treat as continuous, to include in logistic regression…

I have made up the SAS code utilizing proc sql statements myself and included in the above mentioned macro under two headings (/* Combining variable name, respective odds ratio its confidence limits and p value in a single row */ AND /* Combining the above obtained rows in table by union */).

I want to have an output table that includes the results of the logistic regression for all variables in a table called STATSALL3. The following SAS code enables me to get the rows of this table one by one but not combined. So the last proc sql satement (“Combining the rows in table by union”) does not work. In this part of the syntax I have included the table statsall2 that has no observations but only variable names (VARIABLE, EFFECT, OddsRatioEst, LowerCL, UpperCL, ProbChiSq). I have previously created this table with a syntax like this:

proc sql;

   create table WORK.STATSALL2

      like WORK.STATAGE;

I wonder if it is possible to incorporate an appropriate syntax into this macro so as to put lines of information obtained from logistic regression for every other variable in a single table. Thank you very much for your help.  

 

%macro logistrepeat(values);
    %local i j ;                         %* # i for variable name, j for position;
    %let j = 1 ;
    %do %while(%scan(&values, &j) ne ) ; %* # repeat while jth value is NotEqual empty ;
        %let i = %scan(&values, &j) ;    %* # Insert jth element to variable i;
 
        /* Logistic regression procedure below */
          Title "Model with -- &i -- as predictor";
 
          ODS OUTPUT PARAMETERESTIMATES=ESTIMATES&i   OddsRatios=OR&i ;
              proc logist descending data= sasuser.labdata ;
          model fattyliver = &i;             %* # Variable name inserted here;
          run;
              ODS OUTPUT CLOSE;
              ODS HTML;
 
              PROC PRINT DATA=ESTIMATES&i;
              TITLE "Output Data Set";
              RUN;
              ODS HTML CLOSE;
 
        /* Combining variable name, respective odds ratio its confidence limits and p value in asingle row */
 
              PROC SQL;
              CREATE TABLE STAT&i AS
              SELECT VARIABLE, EFFECT, OddsRatioEst, LowerCL, UpperCL, ProbChiSq
              FROM WORK.ESTIMATES&i as ESTIMATES, WORK.OR&i as OR
              WHERE ESTIMATES.variable=OR.EFFECT ;
              QUIT;
 
        /* Combining the above obtained rows in table by union */
 
              PROC SQL;
              CREATE TABLE STATSALL3 AS
              select VARIABLE, EFFECT, OddsRatioEst, LowerCL, UpperCL, ProbChiSq from WORK.STATSALL2
              union
              select VARIABLE, EFFECT, OddsRatioEst, LowerCL, UpperCL, ProbChiSq from WORK.STAT&i;
              QUIT;
 
         %let j = %eval(&j+1) ;          %* # Increase the value of j by 1;
    %end ;
%mend ;
 
/* Run the macro */
%logistrepeat(age bmi smoking alcohol );

 


Accepted Solutions
Solution
‎05-10-2016 05:40 AM
Trusted Advisor
Posts: 1,116

Re: Automated way of repeating univariate logistic regression and putting individual results in a ta

[ Edited ]

Hello @AYDIN,

 

Thanks to your excellent preparations, only a slight modification of the last PROC SQL step is necessary, actually a simplification:

PROC SQL;
INSERT INTO STATSALL2
select VARIABLE, EFFECT, OddsRatioEst, LowerCL, UpperCL, ProbChiSq from STAT&i;
QUIT;

(I've adopted your code writing style [capitalization] for this step.)

 

This adds each of the 1-observation datasets to the template dataset STATSALL2.

 

Please adapt the comment ("... by union") correspondingly.

 

EDIT: In template dataset STATSALL2, character variables VARIABLE and EFFECT must have sufficient lengths to accommodate all values from the STAT&i datasets, as pointed out by Reeza. Otherwise, truncation and warnings in the log will occur:

WARNING: Character expression will be truncated when assigned to character column Variable.
WARNING: Character expression will be truncated when assigned to character column Effect.

 

View solution in original post


All Replies
Solution
‎05-10-2016 05:40 AM
Trusted Advisor
Posts: 1,116

Re: Automated way of repeating univariate logistic regression and putting individual results in a ta

[ Edited ]

Hello @AYDIN,

 

Thanks to your excellent preparations, only a slight modification of the last PROC SQL step is necessary, actually a simplification:

PROC SQL;
INSERT INTO STATSALL2
select VARIABLE, EFFECT, OddsRatioEst, LowerCL, UpperCL, ProbChiSq from STAT&i;
QUIT;

(I've adopted your code writing style [capitalization] for this step.)

 

This adds each of the 1-observation datasets to the template dataset STATSALL2.

 

Please adapt the comment ("... by union") correspondingly.

 

EDIT: In template dataset STATSALL2, character variables VARIABLE and EFFECT must have sufficient lengths to accommodate all values from the STAT&i datasets, as pointed out by Reeza. Otherwise, truncation and warnings in the log will occur:

WARNING: Character expression will be truncated when assigned to character column Variable.
WARNING: Character expression will be truncated when assigned to character column Effect.

 

New Contributor
Posts: 2

Re: Automated way of repeating univariate logistic regression and putting individual results in a ta

Thank you very much for the responses.

I have tried all 3 suggestions the INSERT approach by , the proc append approach by Reeza and using a dataset list in a SET statement recommended by PG Stats.

All 3 approaches worked for me.

It may be esasier for others who want to do the same thing to use the proc append approach or using a dataset list in a SET statement as this will avoid my step of creating a template dataset.

I appreciate your help

Best regards

 

 

Respected Advisor
Posts: 4,817

Re: Automated way of repeating univariate logistic regression and putting individual results in a ta

You can use a dataset list in a SET statement:

 

....

			PROC SQL;
			CREATE TABLE STAT_&i AS
			SELECT VARIABLE, EFFECT, OddsRatioEst, LowerCL, UpperCL, ProbChiSq
			FROM WORK.ESTIMATES&i as ESTIMATES, WORK.OR&i as OR
			WHERE ESTIMATES.variable=OR.EFFECT ;
			QUIT;        /* Combining the above obtained rows in table by union */
 
         %let j = %eval(&j+1) ;          %* # Increase the value of j by 1;
    %end ;
	
	data STATSALL3;
	set STAT_:;
	run;
	
%mend ;

 Just be careful not to create a name conflict with the prefix (here STAT_) that you choose for your dataset names.

PG
Trusted Advisor
Posts: 1,116

Re: Automated way of repeating univariate logistic regression and putting individual results in a ta

@AYDIN: Good idea from @PGStats (as always). Please note that the elegant dataset list STAT_: is the alphabetical list of the work datasets whose names start with "STAT_". So, unlike my INSERT approach, this technique would (in general) not maintain the order of the predictor variables in macro parameter VALUES, but sort them alphabetically in dataset STATSALL3. But maybe you don't care about the order anyway or you would even prefer alphabetical order. Otherwise, of course, with a bit more effort, one could create a list of the STAT_xxx dataset names (to be used in the SET statement of the data step) that reflects the order of predictors in VALUES.

Super User
Posts: 19,157

Re: Automated way of repeating univariate logistic regression and putting individual results in a ta

I'm going to suggest a slightly different solution Smiley Happy 

Use Proc Append. Your code below is modified. 

The only other suggestion would be to assign a fixed length to your variable name so you don't run into truncation issues on the variable name or any warnings or errors. I'm not sure I did in properly within the PROC SQL here, so you may need to modify it. 

 

It looks like you were missing a quotation mark on your title statement? Not sure if that was a copy/paste error. 

 

Good Luck and hope this helps. 

 

%macro logistrepeat(values);
    %local i j ;                         %* # i for variable name, j for position;
    %let j = 1 ;
    %do %while(%scan(&values, &j) ne ) ; %* # repeat while jth value is NotEqual empty ;
        %let i = %scan(&values, &j) ;    %* # Insert jth element to variable i;
 
        /* Logistic regression procedure below */
          "Title "Model with -- &i -- as predictor";
 
          ODS OUTPUT PARAMETERESTIMATES=ESTIMATES&i   OddsRatios=OR&i ;
              proc logist descending data= sasuser.labdata ;
          model fattyliver = &i;             %* # Variable name inserted here;
          run;
              ODS OUTPUT CLOSE;
              ODS HTML;
 
              PROC PRINT DATA=ESTIMATES&i;
              TITLE "Output Data Set";
              RUN;
              ODS HTML CLOSE;
 
        /* Combining variable name, respective odds ratio its confidence limits and p value in asingle row */
 
              PROC SQL;
              CREATE TABLE STAT&i AS
              SELECT VARIABLE length=100., EFFECT, OddsRatioEst, LowerCL, UpperCL, ProbChiSq
              FROM WORK.ESTIMATES&i as ESTIMATES, WORK.OR&i as OR
              WHERE ESTIMATES.variable=OR.EFFECT ;
              QUIT;
 
        /* Combining the above obtained rows in table by union */
 
      proc append base=statsall3 data=stat&i force;
      run;

 
/* Run the macro */
%logistrepeat(age bmi smoking alcohol );
Trusted Advisor
Posts: 1,116

Re: Automated way of repeating univariate logistic regression and putting individual results in a ta

Well spotted by Reeza! The length issue is indeed relevant here for both character variables involved: VARIABLE and EFFECT. I had overlooked this. For the INSERT approach to work properly it's required that VARIABLE and EFFECT in template dataset STATSALL2 already have sufficient lengths. The PROC APPEND approach does not have this requirement. (Although I'm more familiar with PROC APPEND, I had decided for PROC SQL with INSERT in order to use as much of the existing code as possible.)

 

Two quotation marks should be sufficient for the TITLE statement. In fact, TITLE (and FOOTNOTE) statements work even without quotation marks around the title (or footnote).

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 617 views
  • 5 likes
  • 4 in conversation