deleted_user
Not applicable
Dear all, I'm having some trouble using CATMOD instead of PROC LOGISTIC. Normally I would use PROC LOGISTIC, but it appears to crash when the number of variables grows too large, while CATMOD remains stable and fast. So I must use CATMOD instead.

Now I have the following code:
---

PROC LOGISTIC DATA=My_Data
    PLOTS(ONLY)=ALL;
CLASS GNDR (PARAM=EFFECT) BLGETMG (PARAM=EFFECT) EDULVL (PARAM=EFFECT) EDULVLM (PARAM=EFFECT) EDULVLF (PARAM=EFFECT) EDCTN (PARAM=EFFECT) Head_Unemployed (PARAM=EFFECT) DSBLD (PARAM=EFFECT) RTRD (PARAM=EFFECT) Number_Of_Children_SEC (PARAM=EFFECT) HHMMB (PARAM=EFFECT);
WEIGHT Weight_Household_2009;
MODEL In_Poverty (Event = '1')= GNDR BLGETMG EDULVL EDULVLM EDULVLF EDCTN Head_Unemployed DSBLD RTRD Number_Of_Children_SEC HHMMB
    / LINK=LOGIT ALPHA=0.10;
RUN;

proc catmod data=My_Data;
response clogits;
model In_Poverty (Event = '1') = GNDR BLGETMG EDULVL EDULVLM EDULVLF EDCTN Head_Unemployed DSBLD RTRD Number_Of_Children_SEC HHMMB / alpha=0.10;
weight Weight_Household_2009;
run;
---

BUT running these two logistic regressions gives different parameter estimates, and I don't understand why. The variables are exactly the same, and the link functions should match, yet the results still differ. Why?

- Julian.
3 REPLIES
deleted_user
Not applicable
Correction to the CATMOD code (CATMOD's MODEL statement does not accept the EVENT= option):
---
proc catmod data=My_Data;
response clogits;
model In_Poverty = GNDR BLGETMG EDULVL EDULVLM EDULVLF EDCTN Head_Unemployed DSBLD RTRD Number_Of_Children_SEC HHMMB / alpha=0.10;
weight Weight_Household_2009;
run;
---

- Julian.
SteveDenham
Jade | Level 19
This is a guess. Repeat, this is only a guess. (Actually, three guesses).
1. PROC LOGISTIC and PROC CATMOD use very different parameterizations for class variables. Could this be the source of your differences?
2. A second guess is that PROC LOGISTIC uses maximum likelihood estimation, while CATMOD can use weighted least squares for some response functions. That could also lead to differences.
3. Finally, it may be that you need to specify the class variables in a DIRECT statement in CATMOD. We are now beyond my experience level.
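One quick way to test guess 2: CATMOD's MODEL statement accepts ML and WLS options, so you can fit the model both ways and see which one reproduces the LOGISTIC estimates. A minimal sketch, reusing a couple of the variable names from Julian's code above:
---
proc catmod data=My_Data;
   weight Weight_Household_2009;
   /* The default response functions are generalized logits; for a
      binary outcome this is the ordinary logit. ML requests maximum
      likelihood estimation; swap in WLS to compare weighted least
      squares fits against the PROC LOGISTIC output. */
   model In_Poverty = GNDR Head_Unemployed / ml;
run;
---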

I hope at least one of these leads you to some resolution.

Steve Denham
deleted_user
Not applicable
Thank you for the fast reply Steve.

However, I fixed the problem myself. It turns out PROC LOGISTIC and PROC CATMOD are parameterized the same way in my code, both use maximum likelihood, and the DIRECT statement is only for continuous variables in CATMOD... BUT they optimize differently, and they treat "bad variables" (estimates that diverge to infinity, etc.) very differently. Once I removed all of these bad variables, the estimates matched to very close numerical precision. It also appears that the numerical differences between CATMOD and LOGISTIC approach zero as the number of distinct cases for each class variable increases.

I've posted my code again in case others run into the same problems:
---
PROC LOGISTIC DATA=SASUSER.FILTER_FOR_CON_DATA_FORMATT_0005
    PLOTS(ONLY)=ALL;
CLASS GNDR (PARAM=EFFECT) Head_Unemployed (PARAM=EFFECT) Number_Of_Children_SEC (PARAM=EFFECT) HHMMB (PARAM=EFFECT) ;
WEIGHT Weight_Household_2009;
MODEL In_Poverty (Event = '1')= GNDR Head_Unemployed Number_Of_Children_SEC HHMMB
    / LINK=logit ALPHA=0.10;
RUN;

proc catmod data=SASUSER.FILTER_FOR_CON_DATA_FORMATT_0005;
response clogits;
model In_Poverty = GNDR Head_Unemployed Number_Of_Children_SEC HHMMB / alpha=0.10;
weight Weight_Household_2009;
run;
---
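For anyone hunting for the "bad variables" described above: complete or quasi-complete separation usually shows up as a zero cell in a crosstab of a class variable against the outcome (and PROC LOGISTIC prints a separation warning in the log when it occurs). A minimal sketch, assuming the same data set and variable names as the code above:
---
proc freq data=SASUSER.FILTER_FOR_CON_DATA_FORMATT_0005;
   /* A zero cell in any of these tables flags a level that
      perfectly predicts In_Poverty, i.e. a parameter estimate
      that will diverge toward infinity during fitting. */
   tables (GNDR Head_Unemployed Number_Of_Children_SEC HHMMB)*In_Poverty
          / nopercent norow nocol;
run;
---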

