Generalized estimating equation and wide data format

Frequent Contributor
Posts: 75

I first summarize the analysis of some papers that I want to imitate. The regression looks like:

Y2(subjects) = Y2(friends) + Y1(subjects) + Y1(friends) + Z(subjects)

Here 'subjects' refers to the focal people and 'friends' refers to the people those subjects are connected to. Someone may be a subject in one relationship and a friend in another.

The idea is to use the friends' outcome at time 2, Y2(friends), to explain the subjects' outcome at the same time, Y2(subjects), controlling for the outcomes of both subjects and friends at time 1. Other covariates of the subjects may be included, but they are not the main consideration.

Following this model, I think the data should be in the wide format, which looks like:

SubjID     FrID     Y1Subj     Y1Fr     Y2Subj     Y2Fr     Z
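As a rough illustration of that layout, here is a small DATA step with one record per subject-friend pair. The dataset name `pairs` and every value in it are made up for illustration (a 0/1 outcome and a numeric Z are assumptions, not something stated in the papers). Note that person 1 appears as a subject in two rows and as a friend in two others.

```sas
/* Hypothetical example data: one row per SubjID-FrID pair.          */
/* All values are invented; Y1/Y2 are assumed to be 0/1 outcomes     */
/* and Z an arbitrary numeric covariate of the subject.              */
data pairs;
  input SubjID FrID Y1Subj Y1Fr Y2Subj Y2Fr Z;
  datalines;
1 2 0 1 1 1 34
1 3 0 0 1 0 34
2 1 1 0 1 1 41
3 1 0 0 0 1 29
;
run;

/* Print the rows to confirm the layout */
proc print data=pairs noobs;
run;
```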

The authors apply generalized estimating equations (GEE) in their analysis. However, I don't know how to fit a GEE to data in this wide format (say, with PROC GENMOD), if that is indeed the right layout. Can you tell whether I'm missing something (e.g., restructuring the data in some way so that GEE can be used)?


Accepted Solutions
Solution
‎07-29-2013 03:15 PM
Respected Advisor
Posts: 2,655

Re: Generalized estimating equation and wide data format

If Y2Subj is the response variable, and you have unique records for each SubjID-FrID combination, such that Y1Subj, Y1Fr, Y2Fr and Z are predictors and are NOT repeated for any SubjID-FrID combination, then you are ready to start modeling using GEE in GENMOD. For each SubjID, you have some number of repeated measures, indexed by FrID. From there, it shouldn't be too difficult to write a MODEL statement and a REPEATED statement.

Steve Denham
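A minimal sketch of what such a MODEL statement and REPEATED statement might look like, under several assumptions that are not in the thread: the dataset is called `pairs`, it has the columns listed in the question, the outcome is binary (hence `dist=bin link=logit`), and an exchangeable working correlation is a reasonable starting point. For a continuous outcome, `dist=normal link=identity` would be the analogous choice.

```sas
/* Sketch only: dataset name, distribution, and correlation structure */
/* are assumptions, not taken from the papers being imitated.         */
proc genmod data=pairs;
  class SubjID;                           /* cluster identifier must be a CLASS variable */
  model Y2Subj(event='1') = Y2Fr Y1Subj Y1Fr Z
        / dist=bin link=logit;            /* assumed binary outcome */
  repeated subject=SubjID / type=exch     /* friend records clustered within subject */
                            corrw;        /* print the working correlation matrix */
run;
```

The REPEATED statement is what makes this a GEE fit: each SubjID defines a cluster, and the rows for that subject's different friends are the correlated observations within it.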



All Replies
Frequent Contributor
Posts: 75

Re: Generalized estimating equation and wide data format

So it appears that the repeated measures here are not along the time dimension (e.g., time 1, time 2) but along the pair dimension. That's what I suspected, but I wasn't sure.
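One way to check that the data really are clustered along the pair dimension is sketched below. It assumes a dataset named `pairs` (a hypothetical name) with the SubjID and FrID columns from the question. The first query lists any SubjID-FrID combination that occurs more than once, and the second shows how many friend records each subject contributes.

```sas
/* Sketch only: 'pairs' is an assumed dataset name. */
proc sql;
  /* Pairs that occur more than once; ideally this returns no rows */
  select SubjID, FrID, count(*) as n_records
    from pairs
    group by SubjID, FrID
    having count(*) > 1;

  /* Cluster size: number of friend records per subject */
  select SubjID, count(*) as n_friends
    from pairs
    group by SubjID;
quit;
```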
