Variable/Dimension Reduction Using PROC CORR

04-07-2017 09:06 AM

Hi, I am performing logistic regression and I am trying to reduce the number of continuous variables in my model.

I ran PROC CORR and then, in Excel, used conditional formatting to flag correlations with x < -0.3 or x > 0.3. Now, how should I select the variables for the final model?

**Please refer to the attached Excel sheet. Please also share the process for reducing variables when you have lots of correlated variables.**
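For readers without the attachment, the filter described above amounts to roughly the following. This is only a sketch in Python for illustration (the original work was done with PROC CORR and Excel); the column names and data are made up, and `flag_correlated_pairs` is a hypothetical helper, not a SAS feature. Note that it only flags correlated pairs. It does not say which variable of a pair to keep, which is exactly the open question.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def flag_correlated_pairs(columns, threshold=0.3):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Made-up data: "income" and "spend" move together, "age" does not follow them.
columns = {
    "income": [10, 20, 30, 40, 50],
    "spend": [12, 19, 33, 41, 48],
    "age": [30, 25, 40, 22, 35],
}

flagged = flag_correlated_pairs(columns)
for name_a, name_b, r in flagged:
    print(f"{name_a} ~ {name_b}: r = {r:.2f}")
```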

04-07-2017 09:10 AM - edited 04-07-2017 09:16 AM

Try using either PROC PRINCOMP or better yet PROC PLS to reduce the dimensionality of your predictor variables.

Don't use PROC CORR for this purpose.

04-07-2017 09:32 AM

But, Master Miller, it will also help in reducing collinearity. Also, is PROC PLS better for reducing continuous variables?

04-07-2017 10:01 AM

pathakvishal wrote:

But, Master Miller, it will also help in reducing collinearity. Also, is PROC PLS better for reducing continuous variables?

One of the major benefits of PROC PLS is that, in the presence of collinearity among the predictor variables, it provides better estimates of the model coefficients and of the predicted values than ordinary least squares regression does (better in this case meaning lower mean squared error of the estimates).

04-07-2017 03:05 PM

Hi Master Miller,

I assume we use PROC PLS before modelling (PROC LOGISTIC) to reduce the continuous variables. Also, I am not able to find any good, easy article on PROC PLS. If possible, can you please share a link to something like that? Thanks in advance!

04-07-2017 03:28 PM

Maybe we need to take a step back.

PLS does not reduce the number of original predictor variables. You let PLS determine which variables have high importance, and which have low importance, but they are all in the model. It uses ALL of them. You don't use PLS to select some to use, and discard the rest. This is different than what you may have learned about using PROC REG. This is a paradigm shift, and an important and valuable shift.
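To make "it uses ALL of them" concrete, here is a minimal sketch of how the first PLS component weights the predictors when there is a single response: each weight is proportional to the covariance between a standardized predictor and the response, and the weight vector is scaled to unit length. This is the textbook calculation written in plain Python for illustration, not PROC PLS output; the data, the variable names and the helper `first_pls_weights` are made up.

```python
from math import sqrt
from statistics import mean, stdev


def standardize(values):
    """Center to mean 0 and scale to standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def first_pls_weights(predictors, response):
    """Unit-length weights of the first PLS component for a single response.

    Each weight is proportional to the covariance between one standardized
    predictor and the standardized response.
    """
    y = standardize(response)
    raw = {
        name: sum(x * yi for x, yi in zip(standardize(col), y))
        for name, col in predictors.items()
    }
    norm = sqrt(sum(w * w for w in raw.values()))
    return {name: w / norm for name, w in raw.items()}


# Made-up data: x1 and x2 are nearly collinear and track the response,
# x3 is close to unrelated to it.
predictors = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 4, 6, 8, 10, 13],
    "x3": [1, -1, 0, 0, -1, 1.5],
}
response = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]

weights = first_pls_weights(predictors, response)
for name, w in weights.items():
    print(f"{name}: {w:+.3f}")
```

All three predictors receive a weight. The two collinear ones share the load, and the weakly related one gets a small weight but is not dropped.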

Also, I am not able to find any good, easy article on PROC PLS.

The documentation is a good place to start. Google finds plenty of introductory articles on Partial Least Squares.

04-10-2017 02:31 PM

Hi Master Miller,

I am doing a case study for the first time (**logistic regression**), and I want to remove highly correlated variables, for which you said to use PROC PLS. I went through many articles about PROC PLS, but I still have some confusion.

1) I was planning to run PROC CORR (for continuous variables), remove the collinear variables, and then use factor analysis to find the most effective variables for logistic regression **(PROC LOGISTIC)**. Since you told me in this thread to use PROC PLS, I just want to know: is PLS a step that replaces PROC CORR, or is it a complete modelling step like PROC LOGISTIC?

2) If it is used in place of PROC CORR, do we then have to use factor analysis in the next step, or can I go directly to PROC LOGISTIC?

3) Which statement inside PROC PLS must we use to get the desired variables (pardon the dumb question)?

04-10-2017 03:46 PM - edited 04-10-2017 03:52 PM

- PLS is a complete modelling method that accounts for the collinearity among your predictor variables
- No factor analysis, no PROC CORR, no logistic regression; if you use PROC PLS, you would need to create dummy variables of your responses to simulate a logistic regression model; it's not quite the same as logistic regression from a statistical point of view, but it does enable you to predict which category the data point belongs in. If you want an actual logistic version of Partial Least Squares, it is described in https://cedric.cnam.fr/fichiers/RC906.pdf, and there is an R package which appears to do logistic partial least squares regression at https://cran.r-project.org/web/packages/plsRglm/plsRglm.pdf. I am not aware of anyone programming this in SAS.
- PLS does not eliminate variables the way you keep asking. It uses ALL variables and assigns the ones that are least important a loading value close to zero.
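The dummy-variable idea in the second point can be sketched as follows: code the two response categories as 0 and 1, fit a PLS model to that numeric response, and assign a category by thresholding the prediction. The sketch below uses a single PLS component in plain Python for illustration. It is not PROC PLS and not logistic regression; the data, the 0.5 cutoff and the helper names are assumptions, and the predictors are only centered here (in practice you would usually scale them as well).

```python
from statistics import mean


def fit_one_component_pls(X, y):
    """Fit a one-component PLS regression; return (intercept, coefficients)."""
    n, p = len(X), len(X[0])
    col_means = [mean(row[j] for row in X) for j in range(p)]
    y_mean = mean(y)
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]

    # Weights: covariance of each centered predictor with the response.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]

    # Scores: one latent variable that combines every predictor.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]

    # Regress the response on the scores, then map back to the predictors.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    coef = [q * wj for wj in w]
    intercept = y_mean - sum(c * m for c, m in zip(coef, col_means))
    return intercept, coef


def classify(row, intercept, coef, cutoff=0.5):
    """Assign category 1 when the predicted value reaches the cutoff."""
    score = intercept + sum(c * v for c, v in zip(coef, row))
    return 1 if score >= cutoff else 0


# Made-up data: columns 1 and 2 are nearly collinear, column 3 is noise.
X = [
    [1, 2.0, 0.5],
    [2, 4.1, -0.5],
    [3, 5.9, 0.2],
    [6, 12.2, -0.3],
    [7, 13.8, 0.4],
    [8, 16.1, 0.0],
]
y = [0, 0, 0, 1, 1, 1]  # dummy-coded response

intercept, coef = fit_one_component_pls(X, y)
predicted = [classify(row, intercept, coef) for row in X]
print("coefficients:", [round(c, 4) for c in coef])
print("predicted:   ", predicted)
```

Every predictor ends up with a coefficient, including the noise column, whose coefficient is simply very small.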