Compare Binomial Logistic Regression with Multinomial Logistic Regression

udithagalgamuwa · Posted 09-26-2017 01:58 AM

Hi,

I am developing logistic regression models to evaluate the safety effectiveness of roadway countermeasures. What I am really interested in is not prediction but the effect of the explanatory variables on the response variable (crashes). I can model crashes as the response and the other factors as explanatory variables using either a binomial or a multinomial logistic regression model.

What I need to know is how to determine which model (multinomial or binomial) is most suitable for the given dataset.

Thank you

Reeza · Posted 09-26-2017 02:51 AM

Do you have only two outcome levels? AUC or ROC and VIF may be of interest.

If the outcome is a count, wouldn't this be a Poisson regression?

udithagalgamuwa · Posted 09-26-2017 10:07 AM

Hi, Reeza,

Thank you so much for your quick reply.

Yes, it is a count, and Poisson models or generalized linear mixed models would be one approach to predicting the count. But in my area of research, we use logistic regression to develop case-control models. Below is a brief description of my data and how we analyze it.

Original dataset (This is used to develop multinomial logistic regression models)

Road Segment	Number of Crashes	Daily Traffic (number of vehicles)	Segment Length (miles)
1	0	100	0.5
2	0	1050	0.8
3	1	2100	1.2
4	2	2500	2.1
5	0	950	0.6
6	3	3000	2.3
7	2	2400	1.8
8	1	1800	1.1
9	0	1150	1
10	4	4500	2.9
11	5	4600	2.9
12	0	800	0.35

For the binomial logistic regression, we recode every segment with at least one crash as 1, as follows:

Road Segment	Crash Indicator (1 = at least one crash)	Daily Traffic (number of vehicles)	Segment Length (miles)
1	0	100	0.5
2	0	1050	0.8
3	1	2100	1.2
4	1	2500	2.1
5	0	950	0.6
6	1	3000	2.3
7	1	2400	1.8
8	1	1800	1.1
9	0	1150	1
10	1	4500	2.9
11	1	4600	2.9
12	0	800	0.35
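
The recoding can be sketched in a few lines. This is only an illustration in Python rather than SAS, and the names `crash_counts` and `to_binary` are made up for the example; the counts are the ones from the first table.

```python
# Crash counts per road segment (segments 1-12), copied from the original table.
crash_counts = [0, 0, 1, 2, 0, 3, 2, 1, 0, 4, 5, 0]


def to_binary(count):
    """Return 1 if the segment had at least one crash, otherwise 0."""
    return 1 if count > 0 else 0


# Binary response used for the binomial logistic regression.
crash_indicator = [to_binary(c) for c in crash_counts]
print(crash_indicator)
```

Note that this recoding discards the difference between, say, one crash and five crashes, which is exactly the information the multinomial (ordinal) response keeps.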

What I need to know is, after we develop models using these two methods, how to find the model that fits the dataset better.

Thanks

StatDave · Posted 09-29-2017 11:01 AM

If that is the entire set of data available, then I don't think you will be able to fit a multinomial model (whether nominal or ordinal - you didn't specify) since the data are just too sparse. Even with the binary version of the response, there are sparseness problems for a model with just those two predictors (vehicles, miles). In that case, the FIRTH option can be used to fit the model by penalized likelihood, which yields finite parameter estimates (and both predictors are nonsignificant). A Poisson model can be fit in GENMOD - again, both predictors are nonsignificant. If you have more data so that you can successfully fit the various models of interest to the number-of-crashes response, then you could use the Vuong test to compare pairs of strictly nonnested models.

udithagalgamuwa · Posted 09-29-2017 01:01 PM

Thank you for your valuable suggestion.

The response is ordinal. I showed only a fraction of my dataset here; the original dataset has more than 20,000 rows. I have fitted both binomial and multinomial regression models using "proc logistic", and both models are significant. But I need to select the one model that has better predictive power and is most suitable for the given dataset.

I cannot use the Vuong test because it requires that both models be fitted to exactly the same set of response values. However, my response variables are not the same.

StatDave · Posted 09-29-2017 01:09 PM

The Vuong test does not require different response variables; it just requires that the models be nonnested. In fact, the VUONG macro requires the models being compared to have the same response.
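
To make the "same response" requirement concrete, here is a rough sketch of the uncorrected Vuong statistic. It is written in Python for illustration only and is not the VUONG macro; the function names and the log-likelihood values are invented. The statistic is built from the per-observation log-likelihood contributions of the two models, so each pair of values must refer to the same observed response.

```python
import math


def vuong_statistic(loglik_a, loglik_b):
    """Uncorrected Vuong z statistic from per-observation log-likelihoods.

    loglik_a[i] and loglik_b[i] must be the log-likelihood contributions of
    models A and B for the SAME observation and the SAME response value.
    """
    if len(loglik_a) != len(loglik_b):
        raise ValueError("both models must be evaluated on the same observations")
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / n
    if variance == 0:
        raise ValueError("log-likelihood differences have zero variance")
    return math.sqrt(n) * mean / math.sqrt(variance)


def two_sided_p_value(z):
    """Two-sided p-value for z under a standard normal reference distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical per-observation log-likelihoods, NOT taken from any fitted model.
example_loglik_a = [-0.2, -0.5, -0.1, -0.9, -0.3]
example_loglik_b = [-0.4, -0.6, -0.3, -0.7, -0.5]

z_value = vuong_statistic(example_loglik_a, example_loglik_b)
print(z_value, two_sided_p_value(z_value))
```

A large positive value favors model A and a large negative value favors model B. A binary model for crash/no crash and a multinomial model for the crash count are fitted to different responses, so their log-likelihoods cannot be paired this way.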

Reeza · Posted 09-29-2017 10:09 PM

AUC (area under the ROC curve) is one measure that's suitable for comparing the accuracy of models.

For a 2x2 table you can also look at the specificity and sensitivity measures.

If there's any clinical significance, the number needed to treat (or to detect) is also a good measure.
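
As a small illustration of the 2x2 measures, the sketch below (Python, not SAS) computes sensitivity, specificity, and accuracy. The observed values are the binary crash indicator from the 12-row sample posted earlier; the predicted classes are made up for the example and do not come from any fitted model.

```python
def confusion_counts(observed, predicted):
    """Return (tp, fp, tn, fn) for 0/1 observed and predicted classes."""
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 0 and pred == 1:
            fp += 1
        elif obs == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Observed crash indicator for segments 1-12 (from the sample data in this thread).
observed_class = [0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0]
# Hypothetical predicted classes, invented for illustration.
predicted_class = [0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0]

tp, fp, tn, fn = confusion_counts(observed_class, predicted_class)
sensitivity = tp / (tp + fn)   # share of crash segments predicted as crash
specificity = tn / (tn + fp)   # share of non-crash segments predicted as non-crash
accuracy = (tp + tn) / len(observed_class)
print(sensitivity, specificity, accuracy)
```

All three measures depend on the cutoff used to turn predicted probabilities into classes, so they should be compared at a stated cutoff.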

udithagalgamuwa · Posted 10-01-2017 10:47 AM

Hi, Reeza,

Thank you for your valuable suggestion.

I have used 2x2 tables to calculate the specificity, sensitivity, and accuracy of the binary logistic regression models. But I am not sure how to use them for the multinomial regression models (my aim is to compare the binomial model with the multinomial regression model).

I have looked into AUC. Do you have any suggestions on where I can find more details about using that method in the basic version of SAS/STAT? Is ROC the same as AUC?

Thanks

StatDave · Posted 10-01-2017 12:51 PM

ROC analysis (producing an ROC curve and computing the AUC - the area underneath the curve) applies only to a binary model, not the multinomial.
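
To show how the two relate: the ROC curve is the plot, and the AUC is a single number summarizing it. For a binary response the AUC equals the proportion of (event, nonevent) pairs in which the event has the higher predicted probability, with ties counted as one half. A minimal Python sketch follows; the predicted probabilities are hypothetical and the function name is made up.

```python
def auc_from_pairs(observed, scores):
    """AUC as the share of concordant (event, nonevent) pairs; ties count 0.5."""
    event_scores = [s for o, s in zip(observed, scores) if o == 1]
    nonevent_scores = [s for o, s in zip(observed, scores) if o == 0]
    if not event_scores or not nonevent_scores:
        raise ValueError("need at least one event and one nonevent")
    total = 0.0
    for e in event_scores:
        for n in nonevent_scores:
            if e > n:
                total += 1.0
            elif e == n:
                total += 0.5
    return total / (len(event_scores) * len(nonevent_scores))


# Observed crash indicator for segments 1-12 (from the sample data in this thread).
observed_binary = [0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0]
# Hypothetical predicted probabilities of a crash, invented for illustration.
predicted_prob = [0.05, 0.30, 0.60, 0.70, 0.25, 0.80,
                  0.65, 0.35, 0.40, 0.95, 0.96, 0.20]

auc_value = auc_from_pairs(observed_binary, predicted_prob)
print(auc_value)
```

The pairwise definition needs exactly two outcome groups, which is why it does not carry over directly to a response with more than two levels.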

udithagalgamuwa · Posted 10-02-2017 09:23 AM

Do you have any suggestions for comparing a binomial logistic regression model with a multinomial logistic regression model to find out which one is better?

Thanks
