Posted 01-27-2022 09:47 AM

I have a list of 50 variables and need to derive all possible combinations of these 50 variables in which no two variables are highly correlated with one another. I already have the correlation for every possible pair of these 50 variables.

For example, I need output showing the largest such combinations (e.g., 20-30 of these variables).


4 REPLIES 4


@AmrAd wrote:

I have a list of 50 variables and need to derive all possible combinations of these 50 variables in which no two variables are highly correlated with one another. I already have the correlation for every possible pair of these 50 variables.

For example, I need output showing the largest such combinations (e.g., 20-30 of these variables).

So, this is a set of requirements that I haven't seen before. You can do all possible regressions via this code, and then weed out the ones you don't want based upon your correlation restrictions.
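The "weed out" step can be sketched independently of the regressions. The following is a minimal illustration in Python (not SAS), assuming the pairwise correlations are available as a dictionary keyed by variable pairs. The function name `low_correlation_sets`, the example variable names, and the 0.7 cutoff are all hypothetical. It lists every maximal set of variables in which no pair exceeds the cutoff, using the Bron-Kerbosch algorithm with pivoting on the "compatible pairs" graph.

```python
from itertools import combinations

def low_correlation_sets(names, corr, threshold=0.7):
    """Return every maximal set of variables in which no pair has |r| > threshold."""
    # Two variables are compatible when their absolute correlation is within the cutoff.
    compatible = {v: set() for v in names}
    for a, b in combinations(names, 2):
        r = corr[(a, b)] if (a, b) in corr else corr[(b, a)]
        if abs(r) <= threshold:
            compatible[a].add(b)
            compatible[b].add(a)

    results = []

    def expand(current, candidates, excluded):
        # Bron-Kerbosch with pivoting: 'current' is maximal once nothing can extend it.
        if not candidates and not excluded:
            results.append(sorted(current))
            return
        pivot = max(candidates | excluded,
                    key=lambda v: len(compatible[v] & candidates))
        for v in list(candidates - compatible[pivot]):
            expand(current | {v}, candidates & compatible[v], excluded & compatible[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(names), set())
    # Largest sets first, ties broken alphabetically.
    return sorted(results, key=lambda s: (-len(s), s))

# Hypothetical example: x1-x2 and x2-x4 are the only highly correlated pairs.
names = ["x1", "x2", "x3", "x4"]
corr = {("x1", "x2"): 0.90, ("x1", "x3"): 0.10, ("x1", "x4"): 0.20,
        ("x2", "x3"): 0.30, ("x2", "x4"): 0.85, ("x3", "x4"): 0.05}
sets = low_correlation_sets(names, corr, threshold=0.7)
print(sets)  # [['x1', 'x3', 'x4'], ['x2', 'x3']]
```

With 50 variables the number of maximal sets can grow very quickly, so in practice it may be necessary to keep only the largest few.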

I wonder, though, if there isn't a better way to handle multicollinearity. Actually, I don't wonder; I know there are better ways to handle multicollinearity, which also involve much less programming. The two that come to mind are PROC GLMSELECT and PROC PLS, both of which give you the ability to fit models (and compare them) in the presence of multicollinearity. If I were you, I would start there and not even bother with the method you stated.

--

Paige Miller



I am running PROC LOGISTIC on these variables, and the model selects up to 20-30 variables from this list. However, the variance inflation factor (VIF) breaches the acceptable threshold and indicates multicollinearity in up to 10 of them. So I am looking for code that could help me test all possible combinations of these 10 together with the remaining 20 or so non-collinear variables.
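For reference, the VIF check itself can be reproduced from the pairwise correlations alone. The sketch below is in Python (not SAS) and assumes the usual linear-regression definition, VIF_j = 1 / (1 - R_j^2), which equals the j-th diagonal element of the inverse of the predictors' correlation matrix. The function name `vif_from_correlation` is hypothetical.

```python
def vif_from_correlation(corr_matrix):
    """Return each predictor's VIF: the diagonal of the inverse correlation matrix."""
    n = len(corr_matrix)
    # Build [R | I] and reduce it to [I | R^-1] by Gauss-Jordan elimination.
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(corr_matrix)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot_row = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot_row][col]) < 1e-12:
            raise ValueError("correlation matrix is singular (perfect collinearity)")
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n + i] for i in range(n)]

# Two predictors with r = 0.6: VIF = 1 / (1 - 0.36) = 1.5625 for both.
vifs = vif_from_correlation([[1.0, 0.6], [0.6, 1.0]])
```

Running this on the correlation matrix of each candidate subset shows whether that subset stays under whatever VIF threshold is being used.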


You can simply modify the code I linked to so that instead of PROC REG, you type in PROC LOGISTIC and the desired options.

You can also find SAS code online for stepwise PROC GLIMMIX, which would include logistic regression as a special case.

I still think the best method of handling multicollinearity is not what you are trying to do, but Logistic Partial Least Squares. Unfortunately, this is not a feature of SAS, but it has been programmed in R. There is no logistic counterpart to PROC GLMSELECT.

--

Paige Miller



Thanks a lot, I will check this.
