Dear Community,

I have a data set with a binomial response and a binary predictor, q.  My observations are clustered under the various values of a categorical variable, r.  I ran the following code:

proc genmod data = fakedata2;
     class q (ref = '0') r;
     model successes/trials = q / dist = bin link = logit type3;
     repeated subject = r;
run;

My GEE estimate for Q = 1 has a P-value of 0.0008, but the Type-3 score statistic for Q has a P-value of 0.0956. 

My questions for you:

1) How is this possible?  I've read the documentation for Type-3 analysis, and I can't figure out exactly how it is calculated.  I admit that I don't fully understand what the Type-3 analysis is doing, but, as I have learned from other statisticians, it should test for the overall significance of Q while controlling for other effects.  Since Q is my only predictor, I would expect the two P-values to be similar, if not the same.

2) Given these disparate P-values, what can I conclude about Q from this model?  Is Q a significant predictor of success?  Why or why not?

Thanks for your help.

1 ACCEPTED SOLUTION

ABiostatistician
Calcite | Level 5

Hi Steve,

I finally learned the reason.  If I invoke the "WALD" option in the "MODEL" statement, then both the GEE and the Type-3 analysis give the same P-values.

By default, PROC GENMOD uses score tests for Type-3 analyses, and that resulted in the difference in P-values in my model.
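For reference, here is a minimal sketch of that change applied to the model from the question. It reuses the data set and variable names from the original post (fakedata2, q, r, successes, trials) and only adds the WALD option to the MODEL statement; nothing else is assumed.

```sas
/* Sketch: same GEE model as in the question, with WALD added so that   */
/* the Type 3 table reports Wald chi-square statistics instead of the   */
/* default generalized score statistics.                                */
proc genmod data = fakedata2;
     class q (ref = '0') r;
     model successes/trials = q / dist = bin link = logit type3 wald;
     repeated subject = r;
run;
```

With a single one-degree-of-freedom predictor, the Wald chi-square in the Type 3 table should be the square of the Z statistic reported for Q in the GEE parameter estimates table, which is why the two P-values then agree.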


3 REPLIES
SteveDenham
Jade | Level 19

The type III tests are for a model that assumes that the individual time points are independent (no correlation, and from independent observations as well).  The solution serves as the initial values for the GEE, which models the correlation between time points.  Here, you model the default exchangeable correlation structure.  It is not at all surprising that the two p values differ substantially, if there is a correlation between time points within subject. That is what a GEE approach is designed to handle.

Steve Denham

ABiostatistician
Calcite | Level 5

Thanks for your helpful reply, Steve.

If my Type-3 test shows a high P-value, but my GEE estimate shows a low P-value, then which one should I trust? 

I see this type of discrepancy in 2 situations:

a) there is only 1 binary predictor, Q

b) there are 3 categorical predictors: Q (binary), X (ternary) and W (ternary).

Thanks for your insights!
