Hello,
I have a question about PROC MIXED. I have the following model:
proc mixed data = result covtest method = reml;
  class student school course exam time;
  model res = time exam course time*exam time*course time*exam*course / s;
  repeated / subject = student(exam) type = un;
  random school;
  where res ne .;
run;
Now, I'd like to adapt the code for the mixed model so that nonsignificant interactions (p > 0.05) are dropped one at a time. This will lead to 4 models in total. Any advice?
Regards
If it is just 4 models, I would do this one-at-a-time by manually removing the unwanted interactions from the code, and then running the code. If the real world problem has (for example) 50 interactions, you could write a macro to do this.
I need help doing that with a macro.
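A minimal sketch of what such a macro could look like, reusing the model from the original question. The macro name `backward_mixed`, its parameters `effects=` and `alpha=`, and the work data set `_tests` are hypothetical names chosen for illustration. The sketch assumes the Type 3 tests are captured from the `Tests3` ODS table (columns `Effect` and `ProbF`), that only interactions are candidates for removal, and that only the highest-order interactions still in the model may be dropped at each step, so the model stays hierarchical. It has no handling for a fit that fails to converge.

```sas
%macro backward_mixed(effects=, alpha=0.05);
    %local worst step;
    %let step = 0;

    %do %until (%length(&worst) = 0);
        %let step  = %eval(&step + 1);
        %let worst = ;

        /* Fit the current model and save the Type 3 tests of fixed effects */
        ods output Tests3 = _tests;
        proc mixed data=result covtest method=reml;
            class student school course exam time;
            model res = &effects / s;
            repeated / subject=student(exam) type=un;
            random school;
            where res ne .;
        run;

        /* Among the highest-order interactions still in the model, find the
           one with the largest p-value. Main effects (no '*') are skipped. */
        data _null_;
            set _tests end=last;
            length worst $200;
            retain maxord 0 worstp . worst '';
            ord = countc(Effect, '*');
            if ord > 0 and (ord > maxord or (ord = maxord and ProbF > worstp)) then do;
                maxord = ord;
                worstp = ProbF;
                worst  = Effect;
            end;
            if last and worstp > &alpha then call symputx('worst', worst, 'L');
        run;

        /* Rebuild the effect list without the dropped interaction */
        %if %length(&worst) > 0 %then %do;
            %put NOTE: Step &step: dropping &worst (p > &alpha).;
            proc sql noprint;
                select Effect into :effects separated by ' '
                from _tests
                where Effect ne "&worst";
            quit;
        %end;
    %end;

    %put NOTE: Final fixed effects: &effects;
%mend backward_mixed;

%backward_mixed(effects=time exam course time*exam time*course time*exam*course);
```

With three interactions this fits at most four models: the full model plus one refit per dropped term. The loop stops as soon as the weakest highest-order interaction has p <= alpha, and the last PROC MIXED output is the final model.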
Why? Are there really a lot more than 3 interactions?
In my mind, I can't possibly imagine what you are going to do after you run such a macro and now have 42 different models to compare.
The real problem you seem to be dealing with is the effect of multicollinearity on the model estimates when using ML or REML estimation. I'm not sure what tools SAS provides to help with this. My guess is that you would need to run the FIXED part of the model through either PROC GLMSELECT or PROC PLS (or both) to determine how to handle the multicollinearity and to select a model for the fixed part, then go back to PROC MIXED and perform the REML estimation (using the fixed model found). But I certainly would defer to others (perhaps @Rick_SAS @jiltao @SteveDenham @StatDave ) on how to handle multicollinearity for maximum likelihood estimation.
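As a rough illustration of the GLMSELECT step described above, the fixed part of the original model could be screened like this. This is only a sketch: it ignores the RANDOM and REPEATED structure entirely, so its p-values are not the ones PROC MIXED would report, and the choice of backward selection with a 0.05 stay level and `hierarchy=single` is an assumption made to mirror the original question.

```sas
/* Fixed-effects-only screening; random and repeated effects are ignored */
proc glmselect data=result;
    class course exam time;
    model res = time exam course time*exam time*course time*exam*course
          / selection=backward(select=sl sls=0.05)
            hierarchy=single;
    where res ne .;
run;
```

The effects that remain would then be placed in the MODEL statement of PROC MIXED, keeping the original RANDOM and REPEATED statements, for the REML fit.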
This sounds like model selection using a "backwards selection" method in which you start with the full model and then drop effects that are least significant. This is one possible way to select effects from a large set of candidates. If you do an internet search for
+sas variable selection mixed models
you will find many papers that use macros for variable selection in mixed models. I have not used any of them. However, I know the work of George Fernandez and Jorge Morel, so you might start with their papers:
- Fernandez (2007): https://support.sas.com/resources/papers/proceedings/proceedings/forum2007/191-2007.pdf
- Neerchal, Morel, et al. (2014) https://support.sas.com/resources/papers/proceedings14/1822-2014.pdf
I'm going to have to read these papers as well! Thanks @Rick_SAS !
Unfortunately, the actual macro code for these papers mentioned by @Rick_SAS doesn't seem to be available 😞
@PaigeMiller Maybe @gcjfernandez heard you. Regardless, see his recent post:
Thanks, @Rick_SAS !
I did see that, but I haven't looked at the macro yet. I am always skeptical about using any form of stepwise selection, but from the diagram it seems that this macro does many kinds of pre-checking of the variables before modeling, and also checks the resulting model, which I want to understand.
Why do you want to remove them? I suppose that it might be because the data come from an observational study rather than a designed experiment? Before doing this, please read this paper:
https://www.lexjansen.com/pnwsug/2008/DavidCassell-StoppingStepwise.pdf
Or Frank Harrell's Regression Modeling Strategies.
If you must reduce the fixed effects, try a LASSO method or elastic net, ignoring the random effects. But the best strategy is to use prior knowledge of the system that generated your data to eliminate effects that are either (a) known to be irrelevant (like fourth- and fifth-order interactions) or (b) not of interest to your research question.
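A hedged sketch of the LASSO suggestion, again applied to the fixed effects only with PROC GLMSELECT. The 5-fold cross validation and the seed value are arbitrary choices for illustration, not recommendations from this thread.

```sas
/* LASSO on the fixed effects only; the tuning value is chosen by
   5-fold cross validation. The seed is arbitrary. */
proc glmselect data=result seed=12345;
    class course exam time;
    model res = time exam course time*exam time*course time*exam*course
          / selection=lasso(choose=cv stop=none)
            cvmethod=random(5);
    where res ne .;
run;
```

Because the random school effect and the repeated-measures covariance are left out, the result is best treated as a screening step before refitting in PROC MIXED.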
If you are looking for a purely predictive model, a classification and regression tree analysis may be what you need.
SteveDenham
LASSO helps you determine the proper predictive model. LASSO and Stepwise/Forward/Backward selection do not work with PROC MIXED, unless you adopt the methods from the papers linked to by @Rick_SAS . If you really want to do something with Stepwise/Forward/Backward selection (as your original question implies), then definitely read the paper linked by @SteveDenham. Lots of smart people have put in lots of work on this problem.
The method in the paper "PLS Generalized Regression" also ought to work here, but as far as I know there is no SAS code for it, although there is an R package that will perform this type of analysis.