dwhitney
Calcite | Level 5

I need help identifying an appropriate statistical methodology and the corresponding SAS procedure for an analysis.

 

The research background is that adults with a specific type of disability have higher 1- to 3-year rates of various morbidities and mortality following a fracture than both (1) adults with this disability who did not fracture and (2) people in the general population without this disability who also sustained a fracture.

 

The current study seeks to understand longer-term trajectories of accumulating comorbidities and to identify potential inflection points along a 10-year follow-up, which may inform when intervention is critical to minimize "overall health" declines (comorbidity index will be used as a proxy measure of "overall health").

 

The primary exposure is the cohort variable which will have 4 groups, people with a specific type of disability (SD) and without SD (w/oSD), and those that experienced an incident fracture (FX) and those that did not (w/oFX): (1) SD+FX, (2) SDw/oFX, (3) w/oSD+FX, (4) w/oSDw/oFX. The primary group of interest is SD+FX, where the other three are comparators that bring different value to interpretations.

 

The outcome is the count value of a comorbidity index (CI). The CI has a possible range of 0-27 (i.e., 27 comorbidities make up this CI and the presence of each comorbidity contributes a value of 1), but the observed range is closer to 0-17, highly skewed, with a large share of zeros (20-50% of each group, depending on the group). The comorbidities include chronic conditions and acute conditions that can recur (e.g., pneumonia). I have coded this such that once a chronic condition is flagged, it is "carried forward" and flagged for all later months. Acute conditions must meet certain criteria to count as distinct events across months.

 

I have estimated each person's CI value at the month level from 2 years before the start of follow-up (i.e., day 0) up to 10 years after it. There is considerable dropout over the 10 years, but this is not surprising and sensitivity analyses will be planned.

 

I have tried interrupted time series (ITS) and ARIMA, but these models don't seem to handle count data and zero-inflated data...? Also, I suspect auto-correlation and its impact on SE given the monthly assessment, but since everyone's day 0 is different, "seasonality" does not seem to be relevant (I may not fully understand this assumption with ITS and ARIMA).

 

Growth mixture models don't seem to work because I already have my cohorts that I want to compare.

 

Is there another technique that allows me to compare the monthly trajectories over the 10 years between the groups, given that the outcome is (1) a count variable and (2) autocorrelated?

1 ACCEPTED SOLUTION

StatDave
SAS Super FREQ

CATMOD does not fit GEE models. The recommended procedure is PROC GEE, though the model can also be fit using PROC GENMOD, which is what the Margins macro uses.

 

See this note which uses a GEE modeling approach to interrupted time series and allows for nonnormal response distributions like the negative binomial distribution. The negative binomial GEE model can handle both the repeated measures nature of your data and overdispersed count data such as caused by excessive zero counts. The note shows examples of studies with one group or more and with normal and binary response data. While an example with count data is not shown, it would be done essentially like the binary response examples but with selection of a different distribution and link function. Links at the beginning of this note point to related notes that might be of interest, particularly the one about spline effects which might be useful to you if your trajectories over time are not well approximated by simple linear or polynomial effects in the model. The last example in the note would be close to the study design that you mention.
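To make the suggestion concrete, here is a minimal sketch of how a negative binomial GEE for a segmented (interrupted time series) trajectory might be set up. It is not taken from the note. The data set name `ci_long` and the variables `id`, `cohort`, `month`, and `ci` are assumptions about how the person-month data are laid out, and `post` and `month_post` are hypothetical helper variables created for illustration. The AR(1) working correlation is also just one possible choice.

```sas
/* Assumed input: work.ci_long, one row per person-month, with
   id (person), cohort (4 groups), month (-24 to 120, 0 = day 0),
   and ci (comorbidity index count). All names are hypothetical. */
data ci_its;
   set ci_long;
   post       = (month >= 0);   /* 1 on or after day 0, otherwise 0 */
   month_post = max(month, 0);  /* months elapsed since day 0       */
run;

/* Keep each person's months in time order for the AR(1) structure */
proc sort data=ci_its;
   by id month;
run;

proc gee data=ci_its;
   class id cohort;
   /* Pre-period slope, level change at day 0, and slope change
      after day 0, each allowed to differ by cohort */
   model ci = cohort month post month_post
              cohort*month cohort*post cohort*month_post
              / dist=negbin link=log;
   /* Repeated monthly measures within person */
   repeated subject=id / type=ar(1);
run;
```

The `cohort*post` and `cohort*month_post` terms are what let the level and slope changes at day 0 differ between the four groups. The same MODEL and REPEATED statements should also be accepted by PROC GENMOD, which is the alternative mentioned above. If people have gaps in their monthly records (not just dropout at the end), the ordering of observations within each person would need more care than this sketch gives it.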

dwhitney
Calcite | Level 5

My apologies for the delay. This worked out really well! Thank you kindly for the very informative post and links!!
