Hi, I am performing logistic regression and I am trying to reduce the number of continuous variables in my model.
I ran PROC CORR and, in Excel, used conditional formatting to flag correlations with x < -0.3 or x > 0.3. Now, how should I select the variables for the final model?
Please refer to the attached Excel sheet. Please also share the process for reducing variables when you have many correlated variables.
Try using either PROC PRINCOMP or better yet PROC PLS to reduce the dimensionality of your predictor variables.
Don't use PROC CORR for this purpose.
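As a rough illustration of the PROC PRINCOMP route, here is a minimal sketch. The data set names (work.train, work.pc_scores), the predictor names x1-x20, the binary response target, and the choice of 5 components are all hypothetical placeholders, not taken from the attached sheet.

```sas
/* Sketch only: data set and variable names are assumed. */
/* Replace the correlated continuous predictors with a   */
/* few principal component scores.                       */
proc princomp data=work.train out=work.pc_scores n=5 std;
   var x1-x20;               /* continuous predictors (assumed names) */
run;

/* The OUT= data set holds the scores Prin1-Prin5, which */
/* are uncorrelated with each other and can be used as   */
/* predictors in the logistic model.                     */
proc logistic data=work.pc_scores;
   model target(event='1') = Prin1-Prin5;
run;
```

Note that each component is a weighted combination of all the original variables, so this reduces dimensionality without dropping any individual variable.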
@pathakvishal wrote:
But, Master Miller, it will also help reduce collinearity. Also, is PROC PLS better for reducing continuous variables?
One of the major benefits of PROC PLS is that, in the presence of collinearity among the predictor variables, it provides better estimates of the model coefficients and of the predicted values than ordinary least squares regression does (better here meaning lower mean squared error of the estimates).
Hi Master Miller,
I assume we use PROC PLS before modelling (PROC LOGISTIC) for continuous variable reduction. Also I am not able to find any good, easy article on PROC PLS. If possible, can you please share a link to something like that? Thanks in advance!
Maybe we need to take a step back.
PLS does not reduce the number of original predictor variables. You let PLS determine which variables have high importance, and which have low importance, but they are all in the model. It uses ALL of them. You don't use PLS to select some to use, and discard the rest. This is different than what you may have learned about using PROC REG. This is a paradigm shift, and an important and valuable shift.
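To make this concrete, here is a minimal sketch of what a PROC PLS run might look like. The data set work.train, the predictors x1-x20, the numeric 0/1 response y_num, and the choice of 3 factors are hypothetical. Note that PROC PLS treats the response as a continuous numeric variable, so using it with a 0/1 response is an approximation assumed for this sketch.

```sas
/* Sketch only: data set, variable names, and NFAC=3 are assumed. */
ods graphics on;

proc pls data=work.train method=pls nfac=3 plots=(vip);
   model y_num = x1-x20;     /* every predictor stays in the model */
   /* Save the extracted factor scores as xscr1-xscr3 */
   output out=work.pls_scores xscore=xscr;
run;
```

The variable importance plot requested with PLOTS=(VIP) shows which predictors carry high or low importance, but all of them still contribute to the extracted factors. Instead of fixing NFAC=, the CV= option (for example CV=SPLIT) can be used to let cross validation choose the number of factors.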
Also I am not able to find any good, easy article on PROC PLS.
The documentation is a good place to start. Google finds plenty of introductory articles on Partial Least Squares.
Hi Master Miller,
I am doing a case study on logistic regression for the first time, and I want to remove highly correlated variables, for which you suggested PROC PLS. I went through many articles about PROC PLS, but I still have some confusion.
1) I was planning to run PROC CORR (for the continuous variables), remove the collinear variables, and then use factor analysis to find the most effective variables for logistic regression (PROC LOGISTIC). Since you told me in this thread to use PROC PLS, I want to know: is PLS a step that replaces PROC CORR, or is it a complete modelling step like PROC LOGISTIC?
2) If it is used in place of PROC CORR, do we still need factor analysis as the next step, or can I go directly to PROC LOGISTIC?
3) Which statement inside PROC PLS must we use to get the desired variables (excuse the basic question)?