Remove levels of nominal variables

I am getting out-of-memory errors when I run PROC GLMSELECT with lasso selection. I presume this is because our infrastructure is pretty weak.


I would like to explore other automatic feature selection methods such as PCA, IV, WOE, etc. The situation is that I have several nominal features, each with several levels. For simplicity, let's assume I have 2 features with 2 and 3 levels respectively: F1 = {L1, L2} and F2 = {L1, L2, L3}.
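
For context, the call I am running looks roughly like the sketch below. Only F1 and F2 come from the example above; the data set name `work.train` and the response variable `y` are placeholders I made up for illustration.

```sas
/* Sketch only: work.train and y are hypothetical names. */
proc glmselect data=work.train;
   /* CLASS creates the dummy columns for every level of F1 and F2 */
   class F1 F2;
   /* Lasso selection over the two nominal features */
   model y = F1 F2 / selection=lasso;
run;
```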


In PROC GLMSELECT I just use the CLASS statement, which sorts out the dummy encoding. However, as indicated above, I would like to remove individual levels (dummy features) to avoid overfitting. Is this possible? Any pointers would be very much appreciated. Thanks!

