Robo
Calcite | Level 5

Hi Everyone!

I am working on crash data. My dependent variable is dichotomous, and the independent variables are categorical. I created a new variable for each category within a variable. For example, if vehicle type is an independent variable with 3 categories (CAR, BUS, TRUCK), I created three variables named car, bus and truck; if the vehicle involved in the crash is a car, I assigned a numeric 1 to car in that row and 0 to the other two variables. After converting all subcategories into variables, I have 85 variables in total. I used PROC QLIM to fit a binary logit model between my dependent variable (fatality or not) and all my dichotomous independent variables, but the output always has missing standard errors and t-statistics. It also shows an error message saying that the Hessian matrix is singular. I could not figure this out. Please help me. Thanks in advance. I am attaching a couple of rows and columns for reference.

fatnotascfatalssrlspvaprrparmarmacrmicrlupaumauculoltlthl
011000000010000010
011000000010000010
111000000100000010
011000000010000010
111000000000100010
011000000000001010
3 REPLIES
SteveDenham
Jade | Level 19

You could make this a lot easier on yourself by employing the CLASS statement.  To carry your example further, suppose the variable VEHICLE had three types (CAR BUS TRUCK).  Some skeleton code would look like:

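proc qlim data=crashes;   /* crashes is a placeholder data set name */
   class vehicle;         /* QLIM builds the indicator variables itself */
   model fatnot = vehicle / discrete(d=logit);   /* binary logit */
run;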

This should help immensely with the problem of singular matrices and missing standard errors.

Steve Denham

BruceBrad
Lapis Lazuli | Level 10

If you stick with your binary variable coding (instead of using the CLASS statement as per Steve's suggestion), then you need to make sure that you always omit one category of each variable from your regression. So for your example, you would omit the variable TRUCK from your regression (or one of CAR or BUS; it doesn't matter which). Assuming you then have a model with a constant, CAR and BUS as predictors, the predicted value for TRUCK is (a transformation of) the constant, the predicted value for CAR is based on the constant plus the CAR parameter, and so on.
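
A minimal sketch of that approach, assuming a data set named crashes (a placeholder; the thread never names one) that holds the 0/1 variables car, bus and truck from the question and the outcome fatnot. TRUCK is left out, so it becomes the reference category, and the DISCRETE(D=LOGIT) option asks PROC QLIM for a logit model:

```sas
/* Sketch only: "crashes" is a placeholder data set name. */
proc qlim data=crashes;
   /* truck is omitted on purpose: it is the reference category */
   model fatnot = car bus / discrete(d=logit);
run;
```

Assuming fatnot = 1 marks a fatal crash and that is the level being modeled, the fitted probability of a fatality for a truck is 1/(1 + exp(-b0)) and for a car it is 1/(1 + exp(-(b0 + b_car))), where b0 is the estimated constant and b_car the estimated CAR parameter. The same rule applies to every other categorical variable: leave out one indicator per variable.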

Reeza
Super User

The technical term for this is parameter overspecification. Put more simply, if you have 3 levels you only need two variables to identify which level an observation belongs to; the third variable is extraneous and will have the missing values/std. errors.
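
One way to see the redundancy, sketched under the assumption that the data set is called crashes (a placeholder name) and contains the car, bus and truck indicators from the question: the three indicators add up to 1 in every row, which duplicates the intercept column, so the model cannot estimate all of them separately.

```sas
/* Sketch only: "crashes" and "check" are placeholder data set names. */
data check;
   set crashes;
   /* each crash has exactly one vehicle type, so this should be 1 */
   dummy_sum = car + bus + truck;
run;

/* Tabulate the sums: a single level equal to 1 confirms the redundancy */
proc freq data=check;
   tables dummy_sum;
run;
```

If every row shows dummy_sum = 1, dropping any one of the three indicators (or letting the CLASS statement do the coding) removes the redundancy.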
