Cross-sectional response with longitudinal predictor

jbsas · Posted 06-22-2020 11:09 AM

I have a study with a repeated-measures exposure (x, measured once daily for several days) and a single fixed outcome at the end of the study (Y, event status at day 30). What would be an appropriate modeling approach? Normally I would use GEE with PROC GENMOD, but the response is a one-time measurement at the end of the study, not a repeated measure. Only the predictor is measured repeatedly.

proc genmod descending;
  class subject day;
  model y = x / dist=bin link=logit;
  repeated subject=subject / withinsubject=day type=un;
run;

In the above program, would it be valid for me to copy Y to all study days for each subject, even though it is a cross-sectional measure taken at end of study? The program runs if I do that, but is that valid for this unique case?

Regarding alternative models, my field discourages Cox models because time-to-event is not considered relevant for such short follow-up periods. There is also a time gap between the last measured X and the later-assessed cross-sectional outcome Y.

Any help would be greatly appreciated!

SteveDenham · Posted 06-22-2020 11:30 AM

What about reshaping the data so that you have x1 to x30 (I presume) as predictors, and then fitting something like a logistic regression that uses LASSO or elastic net to select the variables with the greatest influence? I like that better than putting in a single response and doing univariate things. Another possibility would be to use an EFFECT statement to fit a spline to the x variables, and then do the regression on the spline variable. This avoids the dangers of variable selection, but the trade-off is that you have to choose the knots.
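A minimal sketch of the reshape-then-select idea, under several assumptions that are not in the original post: the long data set is called work.long, it has one row per subject and day with variables subject, day (numbered 1 to 30), x, and a 0/1 y that is constant within subject, and PROC HPGENSELECT is available. All data set names here are hypothetical. The LASSO tuning is left at the procedure defaults, and subjects with a missing x on any day would be dropped from the fit.

```sas
/* Hypothetical input: work.long, one row per subject-day,      */
/* with subject, day (1..30), x, and y (0/1, same on every row  */
/* of a given subject).                                         */
proc sort data=work.long;
  by subject y day;
run;

/* Reshape to one row per subject: x1, x2, ..., x30 hold the    */
/* daily exposures; y is carried along as a BY variable.        */
proc transpose data=work.long out=work.wide(drop=_name_) prefix=x;
  by subject y;
  id day;
  var x;
run;

/* LASSO-penalized logistic regression of the single outcome    */
/* on the daily exposures (assumes 30 days, as presumed above). */
proc hpgenselect data=work.wide;
  model y(event='1') = x1-x30 / dist=binary link=logit;
  selection method=lasso;
run;
```

Because each subject now contributes exactly one row, Y does not have to be copied across days and no REPEATED statement is needed.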

SteveDenham

SteveDenham · Posted 06-23-2020 07:41 AM

Thought about this more overnight. This sounds like a job for PROC CANDISC. From the documentation:

"Given a classification variable and several quantitative variables, the CANDISC procedure derives canonical variables, which are linear combinations of the quantitative variables that summarize between-class variation in much the same way that principal components summarize total variation."

The example in the documentation ought to point you in the right direction, and the graphic generated should help with interpretation. Since the canonical variables are constructed from both the within-class and between-class associations, any time dependency in your X variables should be accounted for (i.e., they don't have to be independent variables).
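A rough sketch of what the call might look like, assuming a hypothetical wide data set work.wide with one row per subject, the binary outcome y, and the daily exposures in x1-x30 (neither the data set name nor the 30-day layout is confirmed in the thread):

```sas
/* Assumes work.wide has one row per subject, with y as the     */
/* classification variable and x1-x30 as the daily exposures.   */
proc candisc data=work.wide out=work.canscores;
  class y;        /* event status at day 30                     */
  var x1-x30;     /* quantitative variables to be combined      */
run;
```

With only two outcome classes there is a single canonical variable (Can1 in the OUT= data set), so the result is a single linear combination of the daily x values that best separates the two outcome groups.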

SteveDenham
