## Multinomial Regression

Occasional Contributor
Posts: 5

I have a large dataset with more than 50,000 rows. My outcome variable is a nominal variable with 4 outcomes, of which 3 are ordinal and the 4th is independent of the other 3. My predictor list contains more than 100 variables, some categorical and some continuous. The categorical predictors are not ordinal.

I researched PROC GENMOD, which can fit a multinomial regression but doesn't have a variable selection method. I also looked at PROC LOGISTIC, which can model an ordinal outcome variable and has a variable selection option.

Posts: 5,046

## Multinomial Regression

"My outcome variable is a nominal variable with 4 outcomes, of which 3 are ordinal and the 4th is independent of the other 3." - Please explain further.

PG

Occasional Contributor
Posts: 5

## Re: Multinomial Regression

PG,

The outcome variable is a product type. Three of the products are ordinal in terms of risk rating; the 4th is a different type of product. For this discussion we can consider two scenarios.

1. All are different

2. All are ordinal

Given a solution for each scenario, I can then further analyze and customize the procedure to my needs. Hope this helps.
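As a rough sketch of how the two scenarios might be set up in PROC LOGISTIC: the dataset `work.products`, the response `product_type`, and the predictors `cat1`, `cat2`, `x1`-`x3` are all hypothetical placeholders, and the entry/stay thresholds are arbitrary. Whether `SELECTION=` can be combined with `LINK=GLOGIT` may depend on the SAS/STAT release, so treat this as a starting point rather than a tested program.

```sas
/* Hypothetical names throughout: work.products, product_type, cat1, cat2, x1-x3 */

/* Scenario 1: all four outcomes treated as unordered (generalized logit) */
proc logistic data=work.products;
   class cat1 cat2 / param=ref;            /* non-ordinal categorical predictors */
   model product_type = cat1 cat2 x1 x2 x3
         / link=glogit
           selection=stepwise slentry=0.05 slstay=0.05;
run;

/* Scenario 2: all four outcomes treated as ordered.
   With more than two response levels, PROC LOGISTIC fits a
   cumulative logit model by default; the sorted order of the
   response values must match the intended ordering. */
proc logistic data=work.products;
   class cat1 cat2 / param=ref;
   model product_type = cat1 cat2 x1 x2 x3
         / selection=stepwise slentry=0.05 slstay=0.05;
run;
```

With more than 100 candidate predictors, the real `MODEL` statement would list them all (or use a variable list), and the selection thresholds would need tuning.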

Posts: 5,046

## Multinomial Regression

You mean something like possible outcomes are ( 1, 2, 3, A )?

Interesting. This seems like the type of analysis that marketing experts do.

CART modelling (a data mining tool) would seem like a good place to start.

PG

SAS Super FREQ
Posts: 3,837

## Multinomial Regression

Outcomes that are {1,2,3,A}?  That is a new concept for me.

Someone who actually understands your model (not me!) might have a better way to do it, but to get the ball rolling, would it be possible to run this model in two stages?

1) Subset the data to WHERE(Y^=A). For this subset, use an ordinal regression model.

2) For the full data, create a new indicator variable isA = (y=A).  Build a logistic regression for this model.

For scoring new data, you would use the logistic model to predict whether the data is 'A' or 'Not A.'  If it is 'Not A', then use the ordinal model to predict whether the response is 1,2, or 3.
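A minimal sketch of this two-stage idea, assuming a character response `y` with values '1', '2', '3', 'A'. The dataset names (`work.products`, `work.newdata`), the predictors (`cat1`, `x1`-`x3`), and the choice of stepwise selection are illustrative assumptions, not something specified in this thread.

```sas
/* Hypothetical names: work.products, work.newdata, y, cat1, x1-x3 */

/* Stage 1: ordinal (cumulative logit) model on the non-A subset */
proc logistic data=work.products(where=(y ne 'A'))
              outmodel=work.ord_model;
   class cat1 / param=ref;
   model y = cat1 x1 x2 x3 / selection=stepwise;
run;

/* Stage 2: binary indicator for 'A' versus 'Not A' on the full data */
data work.products2;
   set work.products;
   isA = (y = 'A');                 /* 1 if the response is A, else 0 */
run;

proc logistic data=work.products2 outmodel=work.bin_model;
   class cat1 / param=ref;
   model isA(event='1') = cat1 x1 x2 x3 / selection=stepwise;
run;

/* Scoring: apply the binary model first, then the ordinal model.
   Each SCORE output data set holds a predicted level and
   predicted probabilities for its own model. */
proc logistic inmodel=work.bin_model;
   score data=work.newdata out=work.scored_isA;
run;

proc logistic inmodel=work.ord_model;
   score data=work.newdata out=work.scored_ord;
run;
```

The ordinal predictions in `work.scored_ord` would only be used for rows that the binary model classifies as 'Not A'.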

I am interested in hearing from statistical gurus like PG whether it has any merit or whether this approach should be avoided. As I said, I basically "made it up" on the spot.

Rick

Occasional Contributor
Posts: 5

## Re: Multinomial Regression

Hi Rick,

"1) Subset the data to WHERE(Y^=A). For this subset, use an ordinal regression model."

Based on my dataset, which procedure do you recommend, and with what options? I want to have variable selection as well.

Thanks.
