MatSof
Calcite | Level 5

Dear all.

I hope someone can help me with a quite basic question regarding PROC LOGISTIC:

I want to calculate an adjusted odds ratio. The variables I want to use are 1) cardiovascular disease: yes/no, 2) smoking: yes/no, and 3) age: 16 to 100+ years.

I want to calculate the OR of being a smoker when having a cardiovascular disease, and I want to adjust this odds ratio estimate for age.

I have tried the following approach:

Proc logistic data=cardiosmoke;

Model cardiovaskulardisease=smoking age;

Run;

 

But I am not sure whether this code is correct. I would be very grateful if anyone could help me with an example of how to correctly use PROC LOGISTIC to do this calculation.

Thank you for your time

1 ACCEPTED SOLUTION

Accepted Solutions
StatDave
SAS Super FREQ

If you want to compare the odds of being a smoker in the cardiovascular=yes group to the odds in the cardiovascular=no group, then SMOKING is the response variable. Assuming variables SMOKING and CVD with values "yes" and "no":

proc logistic;

class cvd(ref="no") / param=ref;

model smoking(event="yes") = cvd age;

run;

Note the use of the EVENT= option so that you are modeling the probability of SMOKING=yes rather than no. This ensures that the odds ratio uses the odds of SMOKING=yes rather than no. Also note the use of the REF= option so that the CVD=no group is in the denominator of the odds ratio. The Odds Ratio Estimates table will then label the odds ratio for CVD as "CVD yes vs no", and the note under the Response Profile table will show "Probability modeled is SMOKING=yes". It is important to use these options so that you don't use the odds of the wrong level or put the wrong group in the denominator of the odds ratio.

MatSof
Calcite | Level 5

Dear Dave,

Thank you very much for your response. It has been of great help. I have one last question concerning the output. The following are the results I get when running the code:


proc logistic data=sasuser.cvdsmoking;

class CVD(ref="1") /param=ref;

Model Smoking(event ="2")= CVD age;

RUN;

1= No, 2 = Yes


The LOGISTIC Procedure

Model Information

Data Set                       SASUSER.cvdsmoking
Response Variable              smoking
Number of Response Levels      2
Model                          binary logit
Optimization Technique         Fisher's scoring

Number of Observations Read    78702
Number of Observations Used    69113

Response Profile

Ordered Value    smoking    Total Frequency
1                1                    54002
2                2                    15111

Probability modeled is smoking=2.

NOTE: 9589 observations were deleted due to missing values for the response or explanatory variables.

 

Class Level Information

Class    Value    Design Variables
Cvd      1        0
         2        1

Model Convergence Status

Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

Criterion    Intercept Only    Intercept and Covariates
AIC               72596.045                   65900.426
SC                72605.189                   65927.856
-2 Log L          72594.045                   65894.426

                                         

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio     6699.6191     2        <.0001
Score                6494.7233     2        <.0001
Wald                 5916.4069     2        <.0001

Type 3 Analysis of Effects

Effect    DF    Wald Chi-Square    Pr > ChiSq
Cvd        1            15.4882        <.0001
Age        1          5742.7223        <.0001

Analysis of Maximum Likelihood Estimates

Parameter      DF    Estimate    Standard Error    Wald Chi-Square    Pr > ChiSq
Intercept       1      0.7702            0.0266           839.6578        <.0001
Cvd       2     1     -0.2357            0.0599            15.4882        <.0001
Age             1     -0.0446          0.000589          5742.7223        <.0001

Odds Ratio Estimates

Effect          Point Estimate    95% Wald Confidence Limits
Cvd 2 vs 1               0.790         0.703        0.888
Age                      0.956         0.955        0.957

Association of Predicted Probabilities and Observed Responses

Percent Concordant         69.4    Somers' D    0.401
Percent Discordant         29.3    Gamma        0.407
Percent Tied                1.3    Tau-a        0.137
Pairs                 816024222    c            0.701

I just want to make sure I am using the correct output. I would think that the OR adjusted for age is the highlighted row (Age): OR = 0.956, 95% CI: 0.955; 0.957, p-value <.0001.

Would that be a correct assumption? Or is it the row above the highlighted one, called "Cvd 2 vs 1", with an OR of 0.79?

Best Regards Mathilde

psj2
Calcite | Level 5

Hi Mathilde, it's the latter: the OR of being a smoker for CVD (yes vs no), adjusted for age, is in the first line. It is 0.790 (95% CI 0.703 to 0.888). The highlighted row shows the corresponding numbers for a 1-unit increase in age, adjusted for CVD. Age is probably a numeric variable measured in years, so it is the OR for a 1-year increase in age, adjusted for CVD.
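To make the link between the two tables explicit, here is a minimal sketch that recomputes the odds ratios by hand from the rounded values in the Analysis of Maximum Likelihood Estimates table posted above. Each odds ratio is exp(estimate), and the Wald limits are exp(estimate ± z × standard error). The variable names are made up for illustration, and because the inputs are rounded, the results should only approximately match the Odds Ratio Estimates table.

```sas
data _null_;
   /* Rounded values copied from the posted output */
   b_cvd  = -0.2357;   /* estimate for Cvd 2 (yes) vs 1 (no) */
   se_cvd =  0.0599;   /* its standard error */
   b_age  = -0.0446;   /* estimate per 1-unit (1-year) increase in age */

   z = quantile('normal', 0.975);          /* about 1.96 for a 95% interval */

   or_cvd   = exp(b_cvd);                  /* age-adjusted OR for CVD */
   lcl_cvd  = exp(b_cvd - z * se_cvd);     /* lower Wald limit */
   ucl_cvd  = exp(b_cvd + z * se_cvd);     /* upper Wald limit */
   or_age1  = exp(b_age);                  /* OR per 1-year increase in age */
   or_age10 = exp(10 * b_age);             /* OR per 10-year increase in age */

   put or_cvd= 6.3 lcl_cvd= 6.3 ucl_cvd= 6.3;
   put or_age1= 6.3 or_age10= 6.3;
run;
```

The last line illustrates why the age odds ratio sits so close to 1: it describes a single year. PROC LOGISTIC also has a UNITS statement (for example, units age=10;) that reports an odds ratio for a larger step directly.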

MatSof
Calcite | Level 5

Thank you very much for your reply - it has been very helpful.

Reeza
Super User

Adding on to what StatDave has mentioned, this gives the unadjusted odds ratio, without age:

proc logistic;

class cvd(ref="no") / param=ref;

model smoking(event="yes") = cvd;

run;

And this gives the adjusted odds ratio, by including age:

proc logistic;

class cvd(ref="no") / param=ref;

model smoking(event="yes") = cvd age;

run;

The odds ratio you want is the one for CVD. You should then compare the unadjusted and adjusted estimates to see the impact of age, but that is mostly to be aware of the issue, not for any statistical purpose.
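One possible way to put the two estimates side by side is to capture the Odds Ratio Estimates table from each run with ODS OUTPUT and stack the results. This is only a sketch: it assumes the data set name and the 1 = No, 2 = Yes coding from Mathilde's post, and the output data set names (or_unadj, or_adj, or_compare) and the fit variable are made up for illustration.

```sas
/* Unadjusted model: CVD only */
ods output OddsRatios=or_unadj;
proc logistic data=sasuser.cvdsmoking;
   class CVD(ref="1") / param=ref;
   model Smoking(event="2") = CVD;
run;

/* Adjusted model: CVD plus age */
ods output OddsRatios=or_adj;
proc logistic data=sasuser.cvdsmoking;
   class CVD(ref="1") / param=ref;
   model Smoking(event="2") = CVD age;
run;

/* Stack both tables and keep only the CVD rows */
data or_compare;
   length fit $10;
   set or_unadj(in=from_unadj) or_adj;
   fit = ifc(from_unadj, "unadjusted", "adjusted");
   if upcase(Effect) =: "CVD";
run;

proc print data=or_compare noobs;
   var fit Effect OddsRatioEst LowerCL UpperCL;
run;
```

Note that the two models may not use exactly the same observations, because rows with a missing age are dropped only from the adjusted model.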
