Binary logit model

Robo · Posted 02-25-2015 08:17 PM

Hi Everyone!

I am working on crash data. My dependent variable is dichotomous, and the independent variables are categorical. I created a new 0/1 variable for each category within a variable. For example, if vehicle type is an independent variable with 3 categories (CAR, BUS, TRUCK), I created three variables named car, bus and truck; if the vehicle involved in the crash is a car, I assigned a numeric 1 to car in that row and 0 to the other two variables. After converting all subcategories into variables, I have 85 variables in total.

I used proc qlim to fit a binary logit model between my dependent variable (whether or not there was a fatality) and all my dichotomous independent variables. The output always has missing standard errors and t-statistics, and it also shows an error message saying that the Hessian matrix is singular. I could not figure this out. Please help me. Thanks in advance. I am attaching a couple of rows and columns for reference.

Variables: fatnot, ascfatal, ssr, ls, pva, pr, rpa, rma, rmac, rmic, rl, upa, uma, uc, ul, ol, tl, thl

0	1	1	0	1	0	0	1
0	1	1	0	1	0	0	1
1	1	1	1	0	0	0	1
0	1	1	0	1	0	0	1
1	1	1	0	0	1	0	1
0	1	1	0	0	0	1	1

SteveDenham · Posted 02-26-2015 08:25 AM

You could make this a lot easier on yourself by employing the CLASS statement. To carry your example further, suppose the variable VEHICLE had three types (CAR BUS TRUCK). Some skeleton code would look like:

proc qlim;

class vehicle;

model fatnot=vehicle / discrete(d=logit);

run;

This should help immensely with the problem of singular matrices and missing standard errors.
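If the data already hold one 0/1 indicator per category, as in the original post, a single categorical variable can be rebuilt before using CLASS. The following is only a minimal sketch: the data set names crash and crash2 are hypothetical, and car, bus and truck are the indicators from the example. DISCRETE(D=LOGIT) is what asks PROC QLIM for a logit model of the 0/1 response.

```sas
/* Hypothetical data set names: crash (input) and crash2 (output). */
/* car, bus, truck are the 0/1 indicators from the example.        */
data crash2;
    set crash;
    length vehicle $5;
    if car = 1 then vehicle = 'CAR';
    else if bus = 1 then vehicle = 'BUS';
    else if truck = 1 then vehicle = 'TRUCK';
run;

proc qlim data=crash2;
    class vehicle;                               /* QLIM builds the indicators itself */
    model fatnot = vehicle / discrete(d=logit);  /* binary logit */
run;
```

The same step would have to be repeated for each of the other categorical variables that were split into indicators.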

Steve Denham

BruceBrad · Posted 02-26-2015 05:35 PM

If you stick with your binary variable coding (instead of using the CLASS statement as per Steve's suggestion), then you need to make sure that you always omit one category of each variable from your regression. So for your example, you would omit the variable TRUCK from your regression (or one of CAR or BUS; it doesn't matter which). Assuming you then have a model with a constant, CAR and BUS as predictors, the predicted value for TRUCK is (a transformation of) the constant, the predicted value for CAR is based on the constant plus the CAR parameter, and so on.
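As a rough sketch of that approach (the data set name crash is an assumption, and only the vehicle indicators are shown), the model would list CAR and BUS and leave TRUCK out as the reference category:

```sas
/* Hypothetical data set name: crash.                         */
/* truck is deliberately omitted: it is the reference level.  */
proc qlim data=crash;
    model fatnot = car bus / discrete(d=logit);
run;
```

Writing b0 for the estimated constant and b_car for the CAR parameter (symbols introduced here for illustration), the "transformation" is the inverse logit: the predicted probability of a fatality is 1 / (1 + exp(-b0)) for TRUCK and 1 / (1 + exp(-(b0 + b_car))) for CAR.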

Reeza · Posted 02-26-2015 06:57 PM

The technical term for this is parameter overspecification. Put more simply, if you have 3 levels you only need two variables to identify which level an observation belongs to; the third variable is extraneous and will have missing estimates/standard errors.
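The redundancy can be written out for the car/bus/truck example. This is a sketch in notation introduced here for illustration: $X$ is the design matrix with an intercept column $\mathbf{1}$, and $W$ is the diagonal matrix of fitted variances $p_i(1-p_i)$ from the logit model.

```latex
\[
\text{car}_i + \text{bus}_i + \text{truck}_i = 1 \quad \text{for every observation } i,
\]
\[
X = \begin{bmatrix} \mathbf{1} & \text{car} & \text{bus} & \text{truck} \end{bmatrix},
\qquad
Xc = 0 \ \text{ for } \ c = (1,\,-1,\,-1,\,-1)^{\top},
\]
\[
H = -X^{\top} W X
\quad\Longrightarrow\quad
c^{\top} H c = -(Xc)^{\top} W (Xc) = 0 .
\]
```

Because a nonzero vector $c$ gives $c^{\top} H c = 0$, the Hessian $H$ is singular and cannot be inverted to obtain standard errors. Dropping any one of the three indicators (or the intercept) removes the linear dependence.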
