Not applicable
Posts: 1

# Binary logit model

Hi Everyone!

I am working on crash data. My dependent variable is dichotomous, and the independent variables have categories. But I created a new variable for each category within a  variable. For example if there are 3 vehicle types (CAR, BUS, TRUCK), and vehicle type is my independent variable. I created  three variables named car, bus and truck and if the vehicle involved in the crash is car then I assigned numeric '1' in that row and other two variables have zeros. After converting all sub categories into variables, total I have 85 variables. I have used proc qlim to develop a binary logit model, between my dependent variable (fatality is there or not) and all my dichotomous independent variables, I am always have missing standard errors, t-statistics values in the output. It is also showing an error message of hessian matrix is singular. I could not figure this out. Please help me. Thanks in advance. I am attaching a couple of rows and columns for reference.

 fatnot ascfatal ssr ls pva pr rpa rma rmac rmic rl upa uma uc ul ol tl thl
 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0
Posts: 2,655

## Re: Binary logit model

You could make this a lot easier on yourself by employing the CLASS statement.  To carry your example further, suppose the variable VEHICLE had three types (CAR BUS TRUCK).  Some skeleton code would look like:

proc qlim;

class vehicle;

model fatnot=vehicle;

run;

This should help immensely with the problem of singular matices and missing standard errors.

Steve Denham

Regular Contributor
Posts: 151

## Re: Binary logit model

If you stick with your binary variable coding (instead of using the class statement as per Steve's suggestion) then you need to make sure that you always omit one category from your regression. So for your example, you would omit the variable TRUCK from your regression (or one of CAR or BUS - it doesn't matter). Assuming you then have a model with constant, CAR and BUS as predictors, the predicted value for TRUCK is (a transformation of) the constant, the predicted value for CAR based on the constant plus the car parameter and so on.

Super User
Posts: 23,700

## Re: Binary logit model

The technical term for this is parameter over specification. Put more simply if you have 3 levels you only need two variables to identify which level an observation belongs to, the third variable is extraneous and will have the missing values/std. errors.

Discussion stats
• 3 replies
• 304 views
• 2 likes
• 4 in conversation