A selection algorithm would be a great feature to have in GENMOD. Automatic selection methods are controversial, but in some cases all one needs is a reasonable, good-enough model with some of the noise removed, obtained within a reasonable time and without too much programming.

In the absence of repeated measures, you could conduct the analysis in R using the step() function. It looks for the model that minimizes AIC (the default) or BIC (by setting k = log(n)), using a backward, forward, or stepwise (both backward and forward) search. It should work with glm() models of the following families: binomial, gaussian, Gamma, inverse.gaussian, poisson. The quasibinomial and quasipoisson families are the over-dispersed versions of binomial and poisson, respectively; note, however, that they have no true likelihood, so AIC is not defined for them and step() may not run on those models as is.

The situation is more complex when you have repeated measures. As far as I know, there are no readily available selection algorithms for generalized linear models with repeated measures. A couple of months ago I was working on a similar problem, and all I could find was a couple of experimental R packages.

Aside from the traditional stats methodology, there are some convoluted ways to approach the problem using data mining techniques. For instance, assuming that all subjects have a similar number and timing of repeated measures, you could run a cluster analysis on the outcome and turn it into a categorical variable whose levels are the trajectories (the clusters). Time-dependent predictors can be transformed into trajectories in the same way. The transformed outcome, now a nominal variable, can then be used as the dependent variable of a non-linear model such as a classification tree; predictor selection is implicit in the tree-building algorithm.
This is likely not implementable in SAS/STAT alone: the clustering algorithms are there but, as far as I know, tree models are not part of SAS/STAT; they are included in the SAS Enterprise Miner product. The approach can be attempted in R. Regardless of the software, though, you still have to decide how many trajectories (clusters) to keep, which is not a simple problem, and what type of tree model to use, as there are many varieties (I am not sure which ones are available in SAS Enterprise Miner).

For lack of simpler alternatives, I would suggest a quick-and-dirty approach, imperfect and with some risk of bias. In GENMOD, fix the working correlation structure to exchangeable, then do a humble backward selection by hand: remove one predictor at a time based on its p-value, and record the information criterion (QIC for GEE in GENMOD) at each step of the sequence. Select the set of predictors at the step where QIC is lowest. Just an idea.
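To make the bookkeeping of that manual backward sequence concrete, here is a minimal sketch in Python. It is only an illustration of the logic, not a GENMOD feature. The `fit` callback is hypothetical: it stands in for whatever refits the GEE on a given predictor set and returns the criterion (QIC here) together with one p-value per predictor. The names `backward_path`, `best_subset`, and `toy_fit`, the predictor names, and all the numbers in `toy_fit` are made up for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical interface: fit(predictors) -> (criterion, {predictor: p_value}).
# In practice this would wrap a real model fit (e.g. parsing GENMOD output).
FitFn = Callable[[Sequence[str]], Tuple[float, Dict[str, float]]]
Step = Tuple[Tuple[str, ...], float]


def backward_path(predictors: Sequence[str], fit: FitFn) -> List[Step]:
    """Drop the predictor with the largest p-value, one at a time,
    recording the criterion of every model along the way.
    The intercept-only model is not evaluated in this sketch."""
    current = list(predictors)
    path: List[Step] = []
    while current:
        criterion, p_values = fit(current)
        path.append((tuple(current), criterion))
        # Least significant predictor in the current model.
        worst = max(current, key=lambda name: p_values[name])
        current.remove(worst)
    return path


def best_subset(path: List[Step]) -> Step:
    """Return the step of the sequence with the lowest criterion."""
    return min(path, key=lambda step: step[1])


def toy_fit(current: Sequence[str]) -> Tuple[float, Dict[str, float]]:
    """Stand-in for a real fit; every value here is invented."""
    p = {"age": 0.01, "dose": 0.03, "sex": 0.40, "site": 0.75}
    qic_by_size = {4: 512.3, 3: 510.1, 2: 509.4, 1: 530.8}
    return qic_by_size[len(current)], {name: p[name] for name in current}


if __name__ == "__main__":
    path = backward_path(["age", "dose", "sex", "site"], toy_fit)
    for subset, qic in path:
        print(subset, qic)
    print("selected:", best_subset(path))
```

Keeping the whole path rather than stopping at the first increase in the criterion lets you see whether the minimum is sharp or whether several neighboring models are practically equivalent.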