deleted_user
Not applicable
Please help me. I don’t know how I should set up my data.
This problem involves a longitudinal dataset with up to 20 measurements per subject.

My outcome variable is the difference between “variable Y” at time 20 and “variable Y” at time 1. My exposure of interest does not vary with time (e.g., sex), but some of the covariates to be included in the model do change over time (e.g., stress and time).

I have used a dataset with multiple lines per subject (one line per measurement period). In this dataset, I have created an outcome variable (change_in_Y) that represents the difference between “variable Y” at time 20 and “variable Y” at time 1. Therefore, for a given subject, the value of this variable does not change from one line to another.
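
To make the layout described above concrete, here is a minimal sketch of how such a dataset could be built. It assumes that dataset A is in long format with columns named id, time (1 to 20), and Y; those column names and the output table names are hypothetical, not taken from the actual data.

```sas
/* Sketch only: the column names id, time, Y and the table names
   change and A_long are assumptions made for illustration. */
proc sql;
  /* One row per subject: Y at time 20 minus Y at time 1.
     The result is missing if either measurement is missing. */
  create table change as
  select id,
         max(case when time = 20 then Y else . end)
       - max(case when time = 1  then Y else . end) as change_in_Y
  from A
  group by id;

  /* Attach the per-subject change to every measurement line,
     so the value repeats across all lines of a given subject. */
  create table A_long as
  select a.*, c.change_in_Y
  from A as a
       left join change as c
       on a.id = c.id
  order by a.id, a.time;
quit;
```

In the resulting A_long table, change_in_Y is constant within each id while stress and time vary from line to line, which is the structure in question.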

This doesn’t seem right to me. How should I rearrange my dataset or outcome variable?

Here is an example of the syntax I am using:
PROC GENMOD data=A;
CLASS id ;
MODEL change_in_Y = gender stress time /dist=normal;
REPEATED sub=id/type=CS corrw;
RUN;
QUIT;
djmangen
Obsidian | Level 7
Offhand (and I'd need to know much more about your research problem before I'd strongly recommend this approach), I suspect that your data set might be fine. I wonder, however, whether you'd be better off modeling Y itself rather than the gross aggregate change in Y from T1 through T20.

Also, how severe are your missing data problems? Could you formulate this question any differently if you restricted the analysis to those with complete data on all 20 periods of measurement? That would give you the opportunity to look at this as a Markov process.
plf515
Lapis Lazuli | Level 10
Why are you throwing away most of your data?

You've measured something at many time points and are using only two. It's hard to see how that could be good.

I think you might want to be using PROC GLIMMIX, and using all your data.
deleted_user
Not applicable
Thank you for your replies. I should have mentioned that variable Y was only measured at time 1 and time 20. The exposure and other covariates were measured up to 20 times per subject.
