Proc Logistic error: "All observations have the sa...

02-06-2013 07:09 PM

Hi,

I ran a logistic regression on my dataset and got the following error:

ERROR: All observations have the same response. No statistics are computed.

Could this be because I am modeling rare events (~175 events out of 200,000+ observations)?

Thanks for any advice.

Accepted Solutions

Solution

Posted in reply to lavernal

02-07-2013 09:42 AM

All Replies

Posted in reply to lavernal

02-06-2013 08:59 PM

Yes, but more specifically I'd guess that there is some classification variable for which all the Y=1 belong to a single category. For example, if you are using CLASS GENDER, maybe all of the Y=1 values are male.

Try using PROC FREQ to cross-tabulate Y with your classification variables. You might see an empty cell. That would indicate that you cannot use that classification variable as an explanatory variable.
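The cross-tabulation suggested above might look like the following minimal sketch. The dataset name `mydata` and the variables `gender` and `y` are hypothetical placeholders taken from the example in this reply, not from the original poster's data.

```sas
/* Hypothetical names: MYDATA is the input dataset, Y the binary response,
   and GENDER stands in for any classification variable. */
proc freq data=mydata;
   /* One row per GENDER level, one column per Y level.
      MISSING treats missing values as their own level, so they show up in the table. */
   tables gender*y / norow nocol nopercent missing;
run;
```

A cell with a count of zero means that one level of the classification variable contains no observations for one of the response values.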

Posted in reply to Rick_SAS

02-06-2013 11:47 PM

Hi Rick,

Thanks for the answer. I did not use any classification variables; I simply ran PROC LOGISTIC on my whole dataset. So there should not be a situation where all observations have Y=1 or all have Y=0. That is why I was confused by the error.

Posted in reply to lavernal

02-07-2013 05:44 AM

Do you have a WHERE clause or other DATA= option that is filtering data? Or a format that coalesces values of an explanatory variable?

What does your MODEL statement look like?

Posted in reply to Rick_SAS

02-07-2013 06:38 AM

Hi Rick,

My model is really simple. My code is:

proc sort data=fun;
   by id year quarter;
run;

proc logistic data=fun descending;
   model flag = var1 var2 var3 var4 var5 var6 var7 var8 var9 var10 var11 var12;
   output out=propensity_scores pred=prob_flag;
run;

I'm not sure where any filtering or coalescing may have occurred.

Posted in reply to lavernal

02-07-2013 06:48 AM

Hmm, interesting. And what do you get from running the following?

proc freq data=fun; tables flag; run;

Posted in reply to Rick_SAS

02-07-2013 06:57 AM

Hi Rick,

I get the following output:

Flag    Frequency    Percent    Cumulative Frequency    Cumulative Percent
--------------------------------------------------------------------------
   0       252784      99.93                  252784                 99.93
   1          174       0.07                  252958                100.00

Posted in reply to lavernal

02-07-2013 08:05 AM

My guess is that there is a linear combination of your continuous variables that perfectly explains your response. For example, if you are modeling "had a heart attack," it might be that "blood pressure," "cholesterol," and "stress level" do not individually predict the response, but when you include all of those variables in the model you discover that all of the heart attacks in the data can be predicted by a linear combination of those factors.

Solution

Posted in reply to lavernal

02-07-2013 09:42 AM

Posted in reply to bobderr

02-07-2013 01:26 PM

Hi Bobderr,

Thanks for your advice. It seems that this is indeed the reason. When I removed some of the independent variables, the regression worked!
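For anyone hitting the same error, one cause that fits this outcome is missing data. PROC LOGISTIC drops every observation that has a missing value in the response or in any explanatory variable, so with only 174 events it is possible that no event has all twelve predictors present. Whether that is what happened here is an assumption; the thread does not confirm it. A minimal sketch for checking, using the dataset and variable names from the code posted earlier (the output dataset name `complete` is hypothetical):

```sas
/* Count nonmissing (N) and missing (NMISS) values for each predictor. */
proc means data=fun n nmiss;
   var var1-var12;
run;

/* Keep only the observations PROC LOGISTIC would be able to use:
   those with no missing value in FLAG or in any of VAR1-VAR12. */
data complete;   /* hypothetical dataset name */
   set fun;
   if cmiss(of flag var1-var12) = 0;
run;

/* If FLAG has a single level here, the model has only one response value to fit. */
proc freq data=complete;
   tables flag;
run;
```

The "Number of Observations Read" and "Number of Observations Used" lines in the PROC LOGISTIC output give the same information at a glance: a large gap between the two points to observations being excluded for missing values.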