Re: setup data for study of change over time

deleted_user · Posted 01-18-2010 09:48 AM

Please help me. I don’t know how I should set up my data.
This problem involves a longitudinal dataset with up to 20 measurements per subject.

My outcome variable is the difference between “variable Y” at time 20 and “variable Y” at time 1. My exposure of interest does not vary with time (i.e., sex), but some of the covariates to be included in the model do change over time (i.e., stress and time).

I have used a dataset with multiple lines per subject (one line per measurement period). In this dataset, I have created an outcome variable (change_in_Y) that represents the difference between “variable Y” at time 20 and “variable Y” at time 1. Therefore, for a given subject, the value of this variable does not change from one line to another.
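
For illustration, here is a minimal sketch of how a subject-level change score like this could be attached to every row of a long dataset. It assumes dataset A has one row per subject per period with variables named id, time, and Y; those names and the output name A_change are assumptions, not confirmed by the post.

```sas
/* Assumed layout: one row per id per time, with Y non-missing at time 1 and time 20. */
proc sql;
  create table A_change as
  select a.*,
         /* Y at time 20 minus Y at time 1, computed once per subject */
         max(case when time = 20 then Y end)
       - max(case when time = 1  then Y end) as change_in_Y
  from A as a
  group by id          /* SAS remerges the per-subject value onto every row */
  order by id, time;
quit;
```

Because the difference is computed per id and remerged, change_in_Y is identical on all of a subject's rows, which is the situation described above.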

This doesn’t seem right to me. How should I rearrange my dataset or outcome variable?

Here is an example of the syntax I am using:
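PROC GENMOD data=A;
  CLASS id;
  /* change_in_Y is constant within subject; stress and time vary by row */
  MODEL change_in_Y = gender stress time / dist=normal;
  /* GEE with a compound-symmetry working correlation, clustered on id */
  REPEATED subject=id / type=CS corrw;
RUN;
QUIT;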

djmangen · Posted 01-19-2010 05:48 PM

Offhand (and I'd need to know much more about your research problem before I'd strongly recommend this approach), I suspect that your data set might be fine. I wonder, however, whether you'd be better off modeling Y itself rather than the gross aggregate change in Y from T1 through T20.

Also, how severe are your missing data problems? Could you formulate this question any differently if you restricted the analysis to those with complete data on all 20 periods of measurement? That would give you the opportunity to look at this as a Markov process.
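
As a rough sketch of that restriction, the step below keeps only subjects with a non-missing stress value in all 20 periods. The dataset name A and the variables id, time, and stress follow the original post; treating "complete data" as 20 non-missing stress values is an assumption, and the output name A_complete is hypothetical.

```sas
/* Keep subjects whose time-varying covariate is observed in all 20 periods. */
proc sql;
  create table A_complete as
  select *
  from A
  group by id
  having count(stress) = 20   /* count() ignores missing values */
  order by id, time;
quit;
```

Comparing the number of distinct ids in A and A_complete gives a quick sense of how much data the restriction would cost.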

plf515 · Posted 01-20-2010 07:55 AM

Why are you throwing away most of your data?

You've measured something at many time points and are using only two. It's hard to see how that could be good.

I think you might want to be using PROC GLIMMIX, and using all your data.
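
One possible reading of this suggestion, sketched under assumptions: model Y at each measured time point with a random intercept per subject. The variable names follow the original post; the gender*time interaction and the random-intercept structure are illustrative choices, not something specified in the thread.

```sas
/* Sketch: linear mixed model for Y in long format, one row per id per time. */
proc glimmix data=A;
  class id gender;
  /* gender*time (assumed) lets the change in Y over time differ by gender */
  model Y = gender stress time gender*time / dist=normal solution;
  random intercept / subject=id;   /* subject-level random intercept */
run;
```

Rows where Y is missing are excluded from the fit, so this only uses the time points at which Y was actually recorded.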

deleted_user · Posted 01-20-2010 03:08 PM

Thank you for your replies. I should have mentioned that variable Y was only measured at time 1 and time 20. The exposure and other covariates were measured up to 20 times per subject.