Correlation between Dichotomous & Continuous / Nominal variables: Proc ...

Posted 11-29-2015 06:25 AM
(3426 views)

Hi,

In logistic regression, we have a dichotomous dependent variable and continuous / nominal independent variables. How do we assess the relationship between them when selecting variables for the model?

Vishal

5 REPLIES

You will mostly find everything there...

http://www.ats.ucla.edu/stat/sas/webbooks/reg/chapter3/sasreg3.htm

Hi Vishal,

Whenever I built logistic regression models for dichotomous outcomes, I first performed univariate analyses for each independent variable. Basically, you can use PROC LOGISTIC to fit these univariable models. If the resulting p-value of a predictor (corresponding to the null hypothesis that the regression coefficient is zero) was less than some threshold value (typically 0.25), it would be a candidate for inclusion in multiple logistic regression later. Of course, important predictors from a content point of view (previous knowledge) should not be excluded just on the grounds of this purely statistical criterion.
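
As a rough sketch of the univariable step described above, one model is fitted per candidate predictor. The data set `work.mydata`, the outcome `y` (coded 1 for the event), the continuous predictor `x1` and the nominal predictor `grp` are all hypothetical names chosen for illustration.

```sas
/* Hypothetical names: work.mydata, y (1 = event), x1 (continuous), grp (nominal) */

/* Univariable model for a continuous predictor */
ods output GlobalTests=work.lr_x1;   /* save the global tests, including the likelihood ratio test */
proc logistic data=work.mydata;
   model y(event='1') = x1;
run;

/* Univariable model for a nominal predictor with reference coding */
ods output GlobalTests=work.lr_grp;
proc logistic data=work.mydata;
   class grp / param=ref;
   model y(event='1') = grp;
run;
```

With a single effect in the model, the likelihood ratio test in the "Testing Global Null Hypothesis: BETA=0" table is the test of that predictor, so its p-value can be compared with the chosen threshold (e.g., 0.25).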

For categorical independent variables (with k levels) a 2xk contingency table analysis with PROC FREQ will provide additional insight beyond the p-value (of the likelihood ratio chi-square test). For example, you can easily spot empty cells and sparse categories, which you may want to consolidate with other categories prior to further analysis.
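
A minimal sketch of such a 2xk table, again assuming the hypothetical data set `work.mydata` with outcome `y` and nominal predictor `grp`:

```sas
/* Hypothetical names: work.mydata, y (dichotomous outcome), grp (nominal, k levels) */
proc freq data=work.mydata;
   /* y in the rows and grp in the columns gives a 2xk table.              */
   /* CHISQ adds the Pearson and likelihood ratio chi-square tests.        */
   /* NOROW and NOPERCENT leave only the column percentages, i.e. the      */
   /* distribution of the outcome within each category of grp.             */
   tables y*grp / chisq norow nopercent;
run;
```

Cells with zero or very small frequencies in this table point to the categories that may need to be combined.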

After this preselection you can employ the built-in effect selection methods of PROC LOGISTIC (see SELECTION= option of the MODEL statement), in order to get down from, e.g., 20 to 5 model variables. Forward, backward, stepwise and best subset selection are available. If more than, say, 40 effects passed the preselection criteria, you may have to narrow these down by building preliminary multiple logistic regression models with manually selected subsets of independent variables. The results can give you hints as to which predictors should be filtered out, for example because they are almost redundant due to strong dependencies among the predictors.
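
For illustration, a stepwise selection over the preselected candidates might look like the sketch below. The variable names (`x1`-`x3`, `grp`) and the entry/stay significance levels are assumptions made for the example, not recommendations.

```sas
/* Hypothetical names and illustrative thresholds */
proc logistic data=work.mydata;
   class grp / param=ref;
   model y(event='1') = x1 x2 x3 grp
         / selection=stepwise
           slentry=0.25    /* significance level required to enter the model */
           slstay=0.10;    /* significance level required to stay in the model */
run;
```

Replacing `selection=stepwise` with `selection=forward` or `selection=backward` requests the corresponding method.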

You can find more details on variable selection, e.g., in Chapter 4 of Hosmer/Lemeshow: Applied Logistic Regression.

@pearsoninst: I think Vishal's question is about logistic regression with a dichotomous dependent variable, not linear regression with a continuous dependent variable.

Sorry, I am replying quite late.

I have a doubt here. In linear regression, we remove collinearity between predictors by measuring the VIF values of the different predictors.

How is this step done in logistic regression? The model that I am working on has WOE-binned variables as predictors, so how will collinearity be checked? Can we apply the same concept of VIF to the underlying continuous variables (since collinearity cannot be measured between nominal variables), or do we compute the chi-square statistic between the WOE variables?

Vishal

In linear regression, SAS uses the least squares method to estimate the coefficients, so you can use VIF to check collinearity.

But in logistic regression, SAS uses the maximum likelihood method to estimate the coefficients. SAS automatically checks for linear dependence among the predictors; once it finds that one variable is a linear combination of others, it sets the coefficient of that variable to zero.
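
One workaround that is sometimes used, sketched here as an assumption rather than as part of the answer above: the VIF depends only on the predictors and not on the response, so PROC REG can be run on the same predictors purely to read off the collinearity diagnostics. The WOE variable names `woe_x1`-`woe_x3` are hypothetical.

```sas
/* Hypothetical names: work.mydata, y (dichotomous outcome), woe_x1-woe_x3 (WOE-coded predictors) */

/* The regression of y is used only for the VIF, TOL and COLLIN output; */
/* its coefficient estimates and tests are to be ignored.               */
proc reg data=work.mydata;
   model y = woe_x1 woe_x2 woe_x3 / vif tol collin;
run;
quit;

/* Pairwise correlations among the WOE variables as a quick first check */
proc corr data=work.mydata;
   var woe_x1 woe_x2 woe_x3;
run;
```

Because WOE-coded variables are numeric, they can be entered directly in both procedures.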
