... it would not make sense to keep them all since we would run into major multicollinearity.
This is NOT true if you use PROC PLS. PLS was designed specifically to work in the presence of multicollinearity, and to keep all variables in the model (although many will be of no practical importance). Least squares regression has known problems in this case.
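As a rough illustration only, a PLS fit that keeps every candidate predictor might be sketched as below. The data set name work.mydata, the response y, and the predictor names x1-x180 are placeholders I am assuming, not anything from your data. The cross-validation settings are just one reasonable choice.

```sas
/* Sketch only: work.mydata, y, and x1-x180 are hypothetical names. */
proc pls data=work.mydata method=pls cv=one cvtest;
   /* All 180 candidate predictors stay in the model.           */
   /* CV=ONE requests leave-one-out cross validation, and       */
   /* CVTEST asks for a test to help pick the number of factors. */
   model y = x1-x180;
run;
```

The number of extracted factors is chosen by cross validation here rather than fixed in advance, which is a different question from how many original variables end up mattering.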
My final model should have between 4 and 10 independent variables.
My problem here is that specifying the number of independent variables in advance might lead to trouble if, for example, there's only 1 significant predictor, or if there are 16 significant predictors.
What would be helpful is a function that would try all possible sets of 4 independent variables (I have a total of 180 independent variables) and tell me which set is the most explanatory with the least amount of multicollinearity, heteroskedasticity and serial correlation.
In addition to my previously stated misgivings, I don't think there is a way to do this in SAS, other than by writing your own macro or PROC IML code. You could use one of the stepwise methods in PROC REG and force the options START=4 and STOP=4 (I have never actually tried this, so I can't guarantee it will do what I think it will do). That would give you the best-fitting 4-parameter model it can find in a sequential fashion, but it does not iterate through all possible 4-parameter models. As I said, the drawback is that ordinary least squares has known problems in the case of multicollinearity, and it doesn't try to minimize the multicollinearity or adjust for it in any way. It also assumes 4 is the right number, which it may not be. Which again leads me back to PROC PLS, which has none of these drawbacks.
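For reference, the untested PROC REG idea could be written roughly as follows. Again, work.mydata, y, and x1-x180 are placeholder names I am assuming. The VIF and DW options are my addition: they only print variance inflation factors and the Durbin-Watson statistic for the selected model, and they play no part in choosing the variables.

```sas
/* Untested sketch: work.mydata, y, and x1-x180 are hypothetical names. */
proc reg data=work.mydata;
   /* Stepwise selection forced to a 4-variable model.              */
   /* VIF and DW are diagnostics only; they do not steer selection. */
   model y = x1-x180 / selection=stepwise start=4 stop=4 vif dw;
run;
quit;
```

As I read the documentation, START=4 with stepwise selection begins from the first four variables listed in the MODEL statement, so the order of the variables may affect where the search ends up.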