hellorc
Obsidian | Level 7

Hello,

 

Could someone please help me with a problem I ran into while converting a Stata logistic regression to SAS?

 

To keep the question simple, consider the logistic regression with variables:

response: binary 0/1

week: integer from 0 to 10

gender: M/F

age: continuous

 

In Stata, the code would be:

logit response i.week i.gender age , or

 

From the data, I know that for some weeks (say week=2) there are no observations with response=1. Running the above code in Stata produces a note like the following:

2.week != 0 predicts failure perfectly
2.week omitted and 24 obs not used

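Stata handles the "no event" (or very rare event) issue automatically: it omits the affected level, drops the observations that level predicts perfectly, and reports meaningful odds ratios for the remaining terms.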
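To see which levels of week have no events before fitting anything, a cross-tabulation is usually enough. This is only a sketch: the data set name work.mydata is a placeholder, since the post does not name the data set.

```sas
/* Assumption: work.mydata is a hypothetical name for the analysis data set. */
/* Any week whose response=1 cell shows a count of 0 is a level that         */
/* predicts failure perfectly, i.e. the kind of level Stata flags.           */
proc freq data=work.mydata;
  tables week*response / norow nocol nopercent;
run;
```

Rows with a zero in either response column are the ones likely to trigger the separation problem.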

 

If we fit the same model in SAS using the code:

proc genmod;
class week(ref='0') gender;
model response=week gender age / link=logit dist=bin;
lsmeans week gender / ilink diff exp;
run; 

 

SAS then issues the warning "Hessian matrix is not positive definite" because of the "no event" (or very rare event) issue, and the resulting odds ratios for week are ridiculously large numbers.

 

Is there any way to have SAS omit those 24 observations the way Stata does? I would really like to obtain the same output so that I can understand what SAS is doing.
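One way to approximate what Stata does, sketched under assumptions, is to drop every level of week that has only one outcome and then refit. The names work.mydata and work.model_data are placeholders, and the event='1' option is an addition that is not in the code above: as far as I know, PROC GENMOD otherwise models the lower ordered response level, which would invert the odds ratios relative to Stata. Stata also applies its check to the estimation sample (after dropping rows with missing covariates), so the counts here may not match exactly.

```sas
/* Assumption: work.mydata and work.model_data are hypothetical names. */
/* Keep only the weeks that contain both events and non-events.        */
proc sql;
  create table work.model_data as
  select *
  from work.mydata
  where week in (
    select week
    from work.mydata
    where response is not missing
    group by week
    having sum(response = 1) > 0   /* at least one event     */
       and sum(response = 0) > 0   /* at least one non-event */
  );
quit;

/* Refit on the reduced data. event='1' is an added assumption so that */
/* SAS models Pr(response=1), as Stata's logit does.                   */
proc genmod data=work.model_data;
  class week(ref='0') gender;
  model response(event='1') = week gender age / link=logit dist=bin;
  lsmeans week gender / ilink diff exp;
run;
```

Comparing the number of observations read in the GENMOD log with the Stata output is a quick check that the same rows were dropped.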

 

Thank you.

svh
Lapis Lazuli | Level 10
Do you just need a statement like this:
where week NE 2;
hellorc
Obsidian | Level 7

Thank you for your reply. No, that is not it: after checking in Stata, there are about 600 observations with week=2, but only 24 are omitted from the model. Stata reports an odds ratio of 2.08 for week=2 vs. week=0, while SAS reports 10874311.

ballardw
Super User

Suggestion:

Provide a data set in the form of data step code so we can actually run your proc genmod code.

 

BTW, when you run code like this without explicitly naming the data set on the PROC statement, SAS uses the most recently created data set. So there is a chance you are running the procedure against a different data set than you think. While that default is occasionally useful for those who forget to provide the data set, it is best practice to always include data=<use this data set> in your code.

 

proc genmod;
class week(ref='0') gender;
model response=week gender age / link=logit dist=bin;
lsmeans week gender / ilink diff exp;
run; 
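For illustration, the same step with the data set named explicitly might look like this. The name work.mydata is a placeholder, since the thread never says what the data set is called.

```sas
/* Assumption: work.mydata is a hypothetical data set name.             */
/* Naming it on the PROC statement removes any dependence on whichever  */
/* data set happened to be created last in the session.                 */
proc genmod data=work.mydata;
  class week(ref='0') gender;
  model response = week gender age / link=logit dist=bin;
  lsmeans week gender / ilink diff exp;
run;
```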

 

hellorc
Obsidian | Level 7
Thank you! It helped, I just figured out the issue!
