I'm looking to fit a binary logit model to clustered longitudinal data. I am unclear on whether to use PROC LOGISTIC, PROC GENMOD, PROC CATMOD, or PROC SURVEYLOGISTIC. I'm aiming for the closest thing to Stata's methodology for clustered logit models. I'm unclear on the fine points of these procedures. Would someone tell me the differences among them, or whether there's something else I'm missing?
If you have repeated measurements on subjects, then you'll probably want to use the REPEATED statement in PROC GENMOD, which fits the model by the Generalized Estimating Equations (GEE) method. See the discussion and example in the GENMOD documentation. PROC LOGISTIC assumes that the observations are independent, so its standard errors ignore the within-subject correlation. PROC CATMOD can handle repeated measures data but requires complete data; that is, all subjects must be measured at all times. The GEE method in GENMOD does not require complete data, so subjects with missing measurements can still be included. PROC SURVEYLOGISTIC is intended for survey data, such as when you sample within strata of a population.
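For illustration, here is a minimal sketch of the GENMOD approach. The data set name (mydata) and the variable names (id, y, x1, x2, time) are hypothetical placeholders, not anything from your data. I'm assuming y is coded 0/1 and that id identifies the subject (cluster). I chose TYPE=IND because, as far as I know, Stata's vce(cluster) option on logit keeps the ordinary logit point estimates and adjusts only the standard errors, which is what an independence working correlation with empirical standard errors does. Treat that as an assumption to check against your Stata output, since small-sample adjustments may make the standard errors differ slightly.

```sas
/* Sketch only: mydata, id, y, x1, x2, and time are hypothetical names. */
proc genmod data=mydata descending;   /* DESCENDING models Pr(y = 1) */
   class id;                          /* the SUBJECT= variable must be a CLASS variable */
   model y = x1 x2 time / dist=binomial link=logit;
   /* GEE with an independence working correlation; the empirical
      (robust) standard errors are reported by default.
      TYPE=EXCH is an alternative if you want to model exchangeable
      within-subject correlation instead. */
   repeated subject=id / type=ind;
run;
```

Subjects with different numbers of observations are fine here; no balancing or padding of the data is needed.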