Ordinal response and ordinal independent variables

07-23-2017 04:22 AM

I have an ordinal independent variable and an ordinal response variable. I used PROC LOGISTIC and checked the score test for the proportional odds assumption. The assumption did not hold, so I resorted to a generalized logit model. But it gives me the following warning:

The validity of the model fit is questionable.

And the log says the following:

There is possibly a quasi-complete separation of data points. The maximum likelihood estimate may not exist.

WARNING: The LOGISTIC procedure continues in spite of the above warning. Results shown are based on the last maximum likelihood iteration. The validity of the model fit is questionable.

What would be an appropriate way to get a valid result? Or is there any way I can eliminate the above warnings?

Posted in reply to Arushiarora0

07-23-2017 08:33 AM

For background info on (quasi-)complete separation, see:

Usage Note 22599: Understanding and correcting complete or quasi-complete separation problems

http://support.sas.com/kb/22/599.html

But even when you have a separation condition, the resulting model can be quite good at classifying observations. Check this on a holdout dataset, i.e. independent observations with a known outcome that the model never saw while it was being trained.
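A minimal sketch of such a holdout check, assuming a dataset named mydata with an ordinal response y and an ordinal predictor x (all of these names, the 70/30 split and the seed are hypothetical and not taken from the question):

```sas
/* Randomly flag about 70% of the rows for training.       */
/* OUTALL keeps every row and adds a 0/1 variable Selected. */
proc surveyselect data=mydata out=split samprate=0.7 outall seed=12345;
run;

/* Fit the generalized logit model on the training rows only */
/* and score the rows the model has never seen.              */
proc logistic data=split(where=(selected=1));
  class x / param=ref;
  model y = x / link=glogit;
  score data=split(where=(selected=0)) out=scored;
run;

/* Cross-tabulate observed (F_y) against predicted (I_y) levels. */
proc freq data=scored;
  tables f_y*i_y / norow nocol nopercent;
run;
```

The diagonal of the final table counts the correctly classified holdout observations. The separation warning may still appear in the log for the training fit; the point here is only to judge classification performance.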

However, when you have a separation condition, the resulting model cannot be interpreted. Avoid inference about regression coefficients and odds ratios, because finite maximum likelihood estimates do not exist for at least some of the model parameters. You simply treat the model as if it were produced by an uninterpretable machine learning algorithm (such as a neural net).

What can you do to avoid the separation condition?

Collapsing levels of categorical variables and binning interval variables are commonly used techniques for dealing with a separation condition.
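As a rough illustration of collapsing levels, assuming again a dataset mydata with response y and predictor x, and assuming (purely for the example) that x is coded 1 to 5 and that the sparse cells sit in levels 4 and 5:

```sas
/* Step 1: cross-tabulate predictor by response to spot */
/* empty or near-empty cells that can cause separation. */
proc freq data=mydata;
  tables x*y / norow nocol nopercent;
run;

/* Step 2: merge the sparse levels (assumed: 4 and 5) into one. */
/* Missing x stays missing because the comparison is false.     */
data mydata2;
  set mydata;
  if x >= 4 then x_c = 4;
  else x_c = x;
run;

/* Step 3: refit the generalized logit model on the collapsed */
/* predictor and check whether the warning has gone.          */
proc logistic data=mydata2;
  class x_c / param=ref;
  model y = x_c / link=glogit;
run;
```

Which levels to merge depends on what the Step 1 table shows and on what is meaningful for your variables. If the empty cells are spread over the response levels instead, the same idea applies to y.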

Good luck,

Koen

Brussels