Minhtrang
Obsidian | Level 7

Hi everyone,

 

I've been running a GEE model for a binary outcome. The model includes 4 predictors.

The data are longitudinal (3 repeated measurements).

I was asked to check the "correlation coefficient boundaries" before selecting the working correlation matrix.

I had not come across this issue before. I only know some general principles for selecting the working correlation structure (independent, unstructured, ...).

My questions are:

1. What are correlation coefficient boundaries?

2. What is the relationship between "correlation coefficient boundaries" and the working correlation structures (independent, unstructured, ...)?

3. What methods does SAS provide for checking correlation coefficient boundaries? Is it necessary to check them, or does SAS use other methods for selecting the correlation matrix for a GEE model with a binary outcome?

 

I'm very thankful for your great support!

 

 

6 REPLIES
MichaelL_SAS
SAS Employee

Good questions.

 

For binary data, the correlation between two observations is constrained by their means. The moment-based estimator that GEE models typically use for the working correlation matrix does not account for this relationship. As a result, you could end up with predicted means from the response model and a working correlation matrix estimate that together violate these range restrictions.
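
To make the range restriction concrete, here is a sketch of the standard bounds for a pair of binary observations. The symbols are introduced here for illustration and do not come from the SAS documentation: Y_1 and Y_2 are the two observations, p_1 and p_2 are their means, and q_j = 1 - p_j. The bounds follow from the fact that the joint probability P(Y_1 = 1, Y_2 = 1) must be a valid probability consistent with both margins.

```latex
% Feasible range of the joint probability given the two means
\max(0,\; p_1 + p_2 - 1) \;\le\; P(Y_1 = 1,\, Y_2 = 1) \;\le\; \min(p_1,\, p_2)

% Correlation of two binary variables
\rho = \frac{P(Y_1 = 1,\, Y_2 = 1) - p_1 p_2}{\sqrt{p_1 q_1 p_2 q_2}}

% Resulting bounds on the correlation
\max\!\left(-\sqrt{\frac{p_1 p_2}{q_1 q_2}},\; -\sqrt{\frac{q_1 q_2}{p_1 p_2}}\right)
\;\le\; \rho \;\le\;
\min\!\left(\sqrt{\frac{p_1 q_2}{q_1 p_2}},\; \sqrt{\frac{q_1 p_2}{p_1 q_2}}\right)
```

For example, with p_1 = 0.1 and p_2 = 0.5, the upper bound is sqrt(0.05 / 0.45) = 1/3 and the lower bound is -1/3, so a working correlation estimate of 0.5 for that pair would be incompatible with those means. These limits are presumably what the reviewer means by "correlation coefficient boundaries".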

 

These potential range restriction violations are often ignored because, in most applications of GEE models, the association between observations is treated as a nuisance parameter and the primary scientific interest is in the response model.

 

In situations where the association between observations might be of interest, or where you are especially concerned about possible range violations, one alternative is to use alternating logistic regression. This approach models the association between pairs of binary observations by using a model for the logarithm of the odds ratio instead of correlations. Both PROC GEE and PROC GENMOD support alternating logistic regression for binary response data. PROC GEE also supports an extension of alternating logistic regression for ordinal response data. To request alternating logistic regression in either procedure, use the LOGOR= option instead of the TYPE= option in the REPEATED statement.
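
As a rough illustration of swapping TYPE= for LOGOR=, here is a minimal sketch using PROC GENMOD. The data set name work.long and the variables id, y, and x1-x4 are hypothetical stand-ins for a long-format data set with one row per subject per measurement, and LOGOR=EXCH (a single common log odds ratio within subject) is only one of the available structures, chosen here as an assumption.

```sas
/* Hypothetical long-format data: one row per subject per measurement. */
/* y is the 0/1 outcome; x1-x4 are the four predictors.                */
proc genmod data=work.long descending;   /* DESCENDING models Pr(y = 1) */
   class id;
   model y = x1 x2 x3 x4 / dist=bin link=logit;
   /* LOGOR= replaces TYPE=: the within-subject association is modeled */
   /* as a common log odds ratio instead of a working correlation.     */
   repeated subject=id / logor=exch;
run;
```

Because an odds ratio is not range-restricted by the means the way a correlation is, the boundary issue does not arise for the association parameter in this formulation.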

 

For examples of using alternating logistic regression for binary response data, you can see Example 48.6 in the PROC GENMOD documentation or Example 47.4 in the PROC GEE documentation. Example 47.5 in the PROC GEE documentation provides an example of alternating logistic regression for ordinal response data. 

 

For more information about alternating logistic regression, you can also refer to the "Alternating Logistic Regression" section in the documentation for either PROC GEE or PROC GENMOD. 

 

 

 

 

 

 

 

Minhtrang
Obsidian | Level 7

Dear MichaelL_SAS,

 

Thank you very much for your clear explanation!

 

Only one more question: if the outcome variable follows a Poisson distribution and I want to estimate a risk ratio instead of an odds ratio, is there a method similar to "alternating logistic regression" for the Poisson distribution? Could you suggest a reference on this issue?

 

 

MichaelL_SAS
SAS Employee

Glad to help.

 

To your follow-up question: an odds ratio estimate, for example as computed by the LSMEANS statement, is typically used to assess the effect of a predictor in the response model on the outcome, on the odds ratio scale. Alternating logistic regression (ALR), on the other hand, uses a log odds ratio model to describe the association between observations within a cluster, instead of estimating the correlations for a given working correlation structure. As described in the “Alternating Logistic Regression” section of the documentation, the log odds ratio model estimates regression parameters for coefficients that are fixed by the structure you specify with the LOGOR= option. I am not aware of any method similar to ALR for data with a Poisson response, and I am not sure one would be necessary.

 

If you are looking to use a modified Poisson approach to estimate the relative risk associated with a predictor in the response model for binary response data, you might look at the discussion of Zou’s method at the end of this Usage Note and the references it provides.
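
For orientation only, a modified Poisson fit for a 0/1 outcome is commonly set up along the following lines. This is a sketch under assumptions, not the code from the Usage Note: the data set work.long, the numeric 0/1 predictor trt, and the covariates x1-x3 are hypothetical, and TYPE=EXCH is one possible working structure for the three repeated measurements.

```sas
/* Hypothetical data: y is a numeric 0/1 outcome, trt a numeric 0/1    */
/* predictor, id identifies the subject (one row per measurement).     */
proc genmod data=work.long;
   class id;
   /* Poisson distribution with log link applied to binary data; the   */
   /* REPEATED statement supplies robust (sandwich) standard errors.   */
   model y = trt x1 x2 x3 / dist=poisson link=log;
   repeated subject=id / type=exch;
   /* EXP exponentiates the log-scale estimate to give a relative risk */
   estimate 'RR for trt' trt 1 / exp;
run;
```

The exponentiated estimate is read as a relative risk rather than an odds ratio; the robust standard errors are what compensate for the misspecified Poisson variance.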

Minhtrang
Obsidian | Level 7

Thank you very much for your great help!

 

I'm still not clear about the so-called "correlation coefficient boundaries".

I have not heard or seen this term in books or documents on repeated measures analysis.

Is it similar to a correlation matrix?

 

I was asked about this by a reviewer. I'm not sure whether he is asking the right question.

Minhtrang
Obsidian | Level 7

Hi Ksharp,

 

Thank you for your suggestion.

 

Could you help me a bit more by explaining the term "correlation coefficient boundaries" in repeated measures analysis? Is it similar to a correlation matrix? For binary data, is it necessary to check for "correlation coefficient boundaries"?

 

I have not seen the term "correlation coefficient boundaries" in books or other documents on repeated measures analysis or longitudinal data analysis.

I was asked about this by a reviewer, and I'm not sure whether he is asking the right question.

 

Thank you once again!

 

 
