bncoxuk
Obsidian | Level 7

Hi,

I used PROC LOGISTIC to run a logit model. There is an independent variable called 'age' which has the levels 1, 2, 3, 4. By default, level 4 acts as the reference group.

The results showed that the parameter estimates for levels 1 and 2 are reasonable, but the estimate for level 3 is exactly 0.000 (same as level 4). I then merged level 3 into level 2 and ran the model again, but this time the estimate for level 2 also became 0.

There must be some problem with the data. Why? Please help.

6 REPLIES
Reeza
Super User

What does a PROC FREQ give you for the dependent variable vs. the independent variable age?
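
A minimal sketch of such a check, assuming a dataset named WORK.MYDATA and a binary response named Y (both names are hypothetical, since the original code was not posted). It cross-tabulates the response against each predictor and the two predictors against each other, so that empty or very sparse cells stand out:

```sas
/* Hypothetical names: dataset WORK.MYDATA, binary response Y. */
/* AGE and INCOME are the categorical predictors from the question. */
proc freq data=work.mydata;
    /* Response by each predictor, then predictor by predictor. */
    /* Counts only: row, column, and overall percentages are suppressed. */
    tables age*y income*y age*income / norow nocol nopercent;
run;
```

A cell with a count of zero in any of these tables is worth a closer look before refitting the model.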

bncoxuk
Obsidian | Level 7

There are two variables, age(1, 2, 3, 4) and income(1, 2, 3). Both are categorical.

If age=4, then income is always 3. That is, these two levels are perfectly correlated.

The results showed that income can be predicted nicely, with estimates for levels 1 and 2 (level 3: 0). But for age, both levels 3 and 4 have estimates of 0.000.

I don't know if this caused the problem. But age and income are not 100% correlated. They are only perfectly correlated when age=4 and income=3.

Reeza
Super User

If 4 is your reference level, you shouldn't have an estimate for it. I'm not sure where or why you assume it's 0.

How many observations are you working with?

Is it possible that there could be a correlation with another variable in the model?

Have you tried a different reference level?

If you grouped 3/4 rather than 2/3, what happens?

Does the log or output say anything about quasi-separation?

There are a lot of possibilities; without access to the data/code it's hard to say...

As Art suggested, post your code...
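
Two of the suggestions above (a different reference level, and grouping 3/4) can be tried together. This is only a sketch under assumptions: the dataset WORK.MYDATA, the 0/1 response Y, and the new variable AGE34 are hypothetical names, and AGE and INCOME are assumed to be numeric.

```sas
/* Hypothetical names: WORK.MYDATA, response Y coded 0/1, new variable AGE34. */
data work.mydata2;
    set work.mydata;
    /* Combine age levels 3 and 4 into a single level 3. */
    if age = 4 then age34 = 3;
    else age34 = age;
run;

proc logistic data=work.mydata2;
    /* PARAM=REF requests reference-cell coding; REF= picks the reference level. */
    class age34 (ref='1') income (ref='1') / param=ref;
    /* Model the probability that Y equals 1. */
    model y(event='1') = age34 income;
run;
```

With reference-cell coding the reference level has no row in the parameter estimates table, so any remaining zero estimate belongs to a non-reference level and points to a data problem rather than to the coding.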

bncoxuk
Obsidian | Level 7

Yeah, it is pretty hard to describe. I will try to think about it for a while.

Thanks.

art297
Opal | Level 21

Post your code. You may not have correctly submitted what you intended to submit.

