Solved: Proc logistics vs chi square

lredon · Posted 04-18-2024 09:10 AM

Hello all, I'm running a univariate logistic model. With the following syntax, I'm getting that the risk of the event is 70% lower in patients who smoke. However, when I switch the reference category for the smoker variable, I get almost the same odds: 0.313 (0.207 - 0.471) vs 0.336 (0.233 - 0.486).

I have explored the data and performed a chi-square test, and I did not find any differences between the categories (which makes sense). This is the contingency table:

Table of SMOKER by EVENT
(each cell: Frequency / Percent / Row Pct / Col Pct)

SMOKER    EVENT=0                        EVENT=1                       Total
No        113 / 40.79 / 74.83 / 54.07    38 / 13.72 / 25.17 / 55.88    151 / 54.51
Yes       96 / 34.66 / 76.19 / 45.93     30 / 10.83 / 23.81 / 44.12    126 / 45.49
Total     209 / 75.45                    68 / 24.55                    277 / 100.00

Does anybody know why this happens? Why does the logistic model fail? Thank you in advance.

PaigeMiller · Posted 04-18-2024 09:15 AM

Impossible to read this table. Make a screen capture and then click on the "Insert Photos" icon here and include it in your reply. Also show us the PROC FREQ code used. Please also show us the code (as text) and the output (as a screen capture) from PROC LOGISTIC.

--
Paige Miller

lredon · Posted 04-18-2024 09:23 AM

Hi, I attach the contingency table from proc freq and also the code and results for the logistic model:

proc logistic with smoker=No:

proc logistic data=An.data;
class SMOKER(ref='No') / param = REF;
model EVENT(event='1')= SMOKER/selection=none noint ctable rsquare outroc=rocdata2;
run;

proc logistic with reference category smoker=Yes:

proc logistic data=An.data;
class SMOKER(ref='Yes') / param = REF;
model EVENT(event='1')= SMOKER/selection=none noint ctable rsquare outroc=rocdata2;
run;

Thanks

ballardw · Posted 04-18-2024 09:55 AM

The ratios between the events for Smoker=No and Smoker=Yes (38/113 and 30/96) are very similar, so I would expect only a small change in the odds ratio.

You might compare that 38/113 = 0.336283 and 30/96= 0.3125 to the point estimate of your ratio. See any similarity after rounding to 3 decimal places?
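To make that comparison concrete, here is the arithmetic worked out from the contingency table. It assumes the first table row is SMOKER=No and the second is SMOKER=Yes, which agrees with the row percentages 76.19 and 23.81 for the Yes group.

```latex
\begin{align*}
\text{odds}(\text{event} \mid \text{No})  &= \frac{38}{113} \approx 0.336 \\
\text{odds}(\text{event} \mid \text{Yes}) &= \frac{30}{96} = 0.3125 \approx 0.313
\end{align*}
```

These match the two point estimates in the original question (0.336 and 0.313), so each model appears to be reporting the odds of the event within a single group rather than an odds ratio between groups.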

lredon · Posted 04-18-2024 10:17 AM

Okay, I get that, but how is it possible that the smoker variable is statistically significant in my logistic model when the percentage of subjects in each category is so similar between subjects with the event and subjects without it?

StatDave · Posted 04-18-2024 10:41 AM

Your code with PARAM=REF and NOINT estimates the odds in the SMOKE=Yes category, which is .2381/.7619. You can estimate the odds ratio if you just change to PARAM=GLM. The following code gives the same odds ratio from LOGISTIC and FREQ. With NOINT, the two parameter estimates are the log odds for the two SMOKE groups. Their difference is the log odds ratio, and exponentiating that difference gives the odds ratio.

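/* Rebuild the 2x2 table as cell counts: smoke (n/y) by event (0/1) */
data a;
  do smoke='n','y';
    do event=0,1;
      input f @@;   /* f = cell frequency */
      output;
    end;
  end;
datalines;
113 38
96 30
;

/* NOINT with PARAM=GLM: one log-odds parameter per SMOKE group */
proc logistic data=a;
  freq f;
  class smoke(ref='n') / param=glm;
  model event(event='1') = smoke / noint;
run;

/* Odds ratio from the 2x2 table, for comparison */
proc freq data=a;
  weight f;
  table smoke*event / relrisk;
run;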
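The relationship between the two NOINT parameters and the odds ratio can be checked by hand from the cell counts. The symbols \beta_n and \beta_y below are labels chosen here for the two group parameters; they are not names that appear in the SAS output.

```latex
\begin{align*}
\beta_n &= \log\frac{38}{113} \approx -1.0898, \qquad
\beta_y = \log\frac{30}{96} \approx -1.1632 \\
\beta_y - \beta_n &= \log\frac{30/96}{38/113} \approx -0.0733 \\
\text{OR}_{y \text{ vs } n} &= \exp(\beta_y - \beta_n)
  = \frac{30 \times 113}{96 \times 38} \approx 0.929
\end{align*}
```

This also explains the significance puzzle raised earlier in the thread. In a NOINT model, the test on a single group parameter asks whether that group's log odds are zero, that is, whether the event probability is 0.5. With event rates near 24-25%, that test is strongly significant even though the odds ratio between the groups is close to 1.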

lredon · Posted 04-19-2024 04:35 AM

Thank you so much! Very useful

Reeza · Posted 04-18-2024 10:54 AM

Your chi square references a variable event6_cat and your model references a variable event.

I would have expected to see the same variable in both outputs for the comparison.

lredon · Posted 04-18-2024 11:00 AM

Sorry, both are the same variable; I just used a different name when I pasted the code here.

StatDave · Posted 04-18-2024 01:33 PM

Did you see my earlier response? It should answer your concern.

Ksharp · Posted 04-18-2024 11:28 PM

As @StatDave said, delete the NOINT option and you will get the same result as PROC FREQ.
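A minimal sketch of that variant, assuming the same cell-count data set `a` and the variable names `smoke`, `event`, and `f` from StatDave's reply (they are not the names in the original `An.data`). With the intercept kept and PARAM=REF, the SMOKE parameter is the log odds ratio of 'y' versus 'n', and the ODDSRATIO statement requests it explicitly.

```sas
/* Same 2x2 cell counts as in StatDave's example (assumed data set name: a) */
data a;
  do smoke='n','y';
    do event=0,1;
      input f @@;
      output;
    end;
  end;
datalines;
113 38
96 30
;

/* Intercept kept (no NOINT): the SMOKE effect is now a group comparison */
proc logistic data=a;
  freq f;
  class smoke(ref='n') / param=ref;
  model event(event='1') = smoke;
  oddsratio smoke;   /* odds ratio for smoke 'y' vs 'n' */
run;
```

The odds ratio reported here should agree with the cross-product ratio from the table, (30 × 113) / (96 × 38), which is about 0.93.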
