Logistic regression: accounting known variation in...

07-19-2013 07:06 PM

I work with logistic regression quite a bit, but I haven't come across anything in the SAS documentation that outlines how to do this. Here's a quick example data set. Let's say I'm measuring the failure rate of two pieces of equipment at different temperatures to see if an older model (type) has a significantly higher failure rate:

DATA equipmentfail;

INPUT type temp n failures ;

DATALINES;

1 90 20 2

1 100 20 7

1 110 20 12

1 120 20 17

2 90 20 5

2 100 20 10

2 110 20 15

2 120 20 20

;

RUN;

proc logistic data=equipmentfail plots=effect;

class type;

model failures/n = temp type ;

run;

In this case, there is a significant difference in failure rates between equipment types. However, what I'm trying to figure out is what to do if the way you measure failure is biased. Let's say it is known that the method for measuring failure overpredicts failure by an average of 10%, plus or minus 5% (95% confidence interval). You can reduce the number of failure events by the known average bias of 10%, but that doesn't account for the variation around that average. Is there some way in SAS procs like PROC LOGISTIC to account for this variation, or is it something that likely has to be done by hand? Thanks.

Posted in reply to hanson4022

07-20-2013 01:24 PM

The error process that you describe would result in both a bias in the number of failures and extra variation in the observed number of failures. You can correct for the bias using your best estimate (10%) and check for extra variation with goodness-of-fit statistics. If the Deviance/DF ratio is greater than 1, you can account for overdispersion by adding the SCALE=Deviance option to your MODEL statement (definitely not the case in your example data):

DATA equipmentfail;

INPUT type temp n failures;

correctedFailures = round(0.9*failures);

DATALINES;

1 90 20 2

1 100 20 7

1 110 20 12

1 120 20 17

2 90 20 5

2 100 20 10

2 110 20 15

2 120 20 20

;

proc logistic data=equipmentfail plots=effect;

class type;

model correctedFailures/n = temp type / lackfit /* scale=Deviance */ ;

run;
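For reference, the adjustment that SCALE=Deviance applies can be sketched in standard quasi-likelihood notation. The symbols are assumptions introduced here, not taken from the thread: $D$ is the residual deviance, $m$ the number of binomial rows, and $p$ the number of estimated parameters.

```latex
\hat{\phi} = \frac{D}{m - p},
\qquad
\widehat{\mathrm{Cov}}_{\text{adj}}(\hat{\beta}) = \hat{\phi}\,\widehat{\mathrm{Cov}}(\hat{\beta}),
\qquad
\mathrm{SE}_{\text{adj}}(\hat{\beta}_j) = \sqrt{\hat{\phi}}\;\mathrm{SE}(\hat{\beta}_j)
```

In the example data, $m = 8$ rows and $p = 3$ parameters (intercept, temp, type), so the deviance is compared against 5 degrees of freedom. The point estimates are unchanged; only the standard errors and tests are rescaled.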

PG


Posted in reply to hanson4022

07-21-2013 11:59 AM

Read the following references:

- Magder LS, Hughes JP. Logistic regression when the outcome is measured with uncertainty. American Journal of Epidemiology 1997;146(2):195-203.
- Neuhaus JM. Bias and efficiency loss due to misclassified responses in binary regression. Biometrika 1999;86(4):843-855.
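As a rough illustration of the "by hand" route raised in the original question, one option is a small simulation: draw the bias from an assumed distribution, correct the counts for each draw, refit the model per draw, and look at how much the type effect moves. This is only a sketch, not the method from the references above. The data set names `sim` and `pe`, the seed, the 500 replicates, and the reading of "10% plus or minus 5% (95% confidence interval)" as a normal distribution with standard deviation 0.05/1.96 are all assumptions made for the example.

```sas
/* Sketch only: propagate uncertainty in the measurement bias by simulation. */
data equipmentfail;
  input type temp n failures;
  datalines;
1 90 20 2
1 100 20 7
1 110 20 12
1 120 20 17
2 90 20 5
2 100 20 10
2 110 20 15
2 120 20 20
;

/* One bias draw per replicate, shared by all rows of that replicate.
   Assumption: bias ~ Normal(mean = 0.10, sd = 0.05/1.96). */
data sim;
  call streaminit(2013);                 /* arbitrary seed for reproducibility */
  do rep = 1 to 500;                     /* 500 replicates is an arbitrary choice */
    bias = rand('normal', 0.10, 0.05/1.96);
    do i = 1 to nobs;
      set equipmentfail point=i nobs=nobs;
      /* same correction rule as the reply: 0.9*failures when bias = 0.10 */
      correctedFailures = round((1 - bias) * failures);
      output;
    end;
  end;
  stop;                                  /* needed with POINT= to end the step */
run;

ods exclude all;                         /* suppress printed output for each replicate */
proc logistic data=sim;
  by rep;
  class type;
  model correctedFailures/n = temp type;
  ods output ParameterEstimates=pe;      /* collect the estimates from every fit */
run;
ods exclude none;

/* Spread of the type effect (default effect coding) across the bias draws */
proc means data=pe n mean std min max;
  where Variable = 'type';
  var Estimate StdErr;
run;
```

The standard deviation of `Estimate` across replicates reflects only the uncertainty in the bias. It would still need to be combined with the within-fit standard errors to get an overall measure of uncertainty, and the result depends on the assumed bias distribution.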