jbsas
Calcite | Level 5
I have a study with a repeated-measures exposure (x, measured once daily for several days) and a single fixed outcome at the end of the study (Y, event status at day 30). What would be the appropriate modeling approach? Normally I would use GEE with PROC GENMOD, but the response variable is a one-time measurement at the end of the study, not a repeated measure. Only the predictor is measured repeatedly.
 
proc genmod desc;
   CLASS subject day;
   MODEL y = x / DIST=BIN LINK=logit;
   REPEATED SUBJECT=subject / WITHINSUBJECT=day TYPE=UN;
run;
 
In the above program, would it be valid for me to copy Y to all study days for each subject, even though it is a cross-sectional measure taken at end of study? The program runs if I do that, but is that valid for this unique case?
 
Regarding alternative models, my field discourages Cox models because time-to-event is viewed as not relevant for such short follow-up periods. In addition, there is a time gap between the last measured X and the later-assessed cross-sectional outcome Y.
 
Any help would be greatly appreciated!
SteveDenham
Jade | Level 19

What about reshaping the data so that you have x1 to x30 (I presume) as predictors, and then fitting something like a logistic regression in which LASSO or elastic net selects the variables that have the greatest influence? I like that better than putting in a single response and doing univariate things. Another possibility would be to use an EFFECT statement to fit a spline to the x variables, and then do the regression on the spline variable. This avoids the dangers of variable selection, but the trade-off is in knot selection.
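A rough sketch of the reshape-then-LASSO idea, in case it helps. Everything named here is an assumption for illustration: a long-format dataset called LONG with variables subject, day, x, and y (y constant within subject and coded 0/1), numeric day values, and PROC HPGENSELECT as the LASSO tool. It does not cover elastic net or the spline alternative.

```sas
/* Assumed input: LONG, one row per subject-day, with variables
   subject, day, x, y (all names hypothetical). */
proc sort data=long;
   by subject y day;
run;

/* Reshape to one row per subject with columns x1, x2, ... (one per day). */
proc transpose data=long out=wide(drop=_name_) prefix=x;
   by subject y;
   id day;
   var x;
run;

/* LASSO-penalized logistic regression on the daily exposures.
   x: picks up every variable whose name starts with x. */
proc hpgenselect data=wide;
   model y(event='1') = x: / distribution=binary link=logit;
   selection method=lasso;
run;
```

Subjects with a missing x on any day drop out of the fit, so check how many complete rows WIDE has before reading much into the selected days.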

 

SteveDenham

SteveDenham
Jade | Level 19

Thought about this more overnight. This sounds like a job for PROC CANDISC. From the documentation:

Given a classification variable and several quantitative variables, the CANDISC procedure derives canonical variables, which are linear combinations of the quantitative variables that summarize between-class variation in much the same way that principal components summarize total variation.

 

The example in the documentation ought to point you in the right direction, and the graphic generated should help with interpretation. Since the canonical variables are constructed based on both within-class and between-class associations, any time dependency in your X variables should be accounted for (i.e., they don't have to be independent variables).
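A minimal sketch of what that call might look like, assuming a wide dataset (called WIDE here, a hypothetical name) with one row per subject, the outcome y, and the daily exposures in x1, x2, and so on. The output dataset name CANSCORES is also just a placeholder.

```sas
/* Assumed input: WIDE, one row per subject, with outcome y and
   daily exposures x1, x2, ... (hypothetical names). */
proc candisc data=wide out=canscores ncan=1;
   class y;     /* two outcome classes give at most one canonical variable */
   var x:;      /* every variable whose name starts with x */
run;

/* Compare the canonical scores between the two outcome groups. */
proc sgplot data=canscores;
   vbox can1 / category=y;
run;
```

With a binary Y there is only one canonical variable (Can1), so a box plot by outcome group stands in for the two-dimensional scatter plot shown in the documentation example.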

 

SteveDenham
