## Logistic regression: accounting for known variation in dependent variable measurements.

Occasional Contributor
Posts: 13

I work with logistic regression quite a bit, but I haven't come across anything in the SAS documentation that outlines how to do this. Here's a quick example data set. Let's say I'm measuring the failure rate of two pieces of equipment at different temperatures to see if an older model (type) has a significantly higher failure rate:

DATA equipmentfail;
INPUT type temp n failures;
DATALINES;
1 90 20 2
1 100 20 7
1 110 20 12
1 120 20 17
2 90 20 5
2 100 20 10
2 110 20 15
2 120 20 20
;

proc logistic data=equipmentfail plots=effect;
class type;
model failures/n = temp type;
run;

In this case, there is a significant difference in failure rates between equipment types. However, what I'm trying to figure out is what to do if the way you measure failure is biased. Let's say it is known that the method for measuring failure overpredicts failure by an average of 10%, plus or minus 5% (95% confidence interval). At this point, you can reduce the number of failure events by the known average bias of 10%, but that doesn't account for the variation around that average. Is there some way in SAS procs like PROC LOGISTIC to account for this variation, or is it something that likely has to be done by hand? Thanks.

Posts: 5,052

## Re: Logistic regression: accounting for known variation in dependent variable measurements.

The error process that you describe would result in a bias in the number of failures AND in extra variation in the observed number of failures. You can correct for the bias using your best estimate (10%) and check for extra variation with goodness-of-fit statistics. If the Deviance/DF ratio is noticeably greater than 1, you can account for overdispersion by adding the SCALE=Deviance option to your MODEL statement (definitely not the case in your example data):

DATA equipmentfail;
INPUT type temp n failures;
correctedFailures = round(0.9*failures);
DATALINES;
1 90 20 2
1 100 20 7
1 110 20 12
1 120 20 17
2 90 20 5
2 100 20 10
2 110 20 15
2 120 20 20
;

proc logistic data=equipmentfail plots=effect;
class type;
model correctedFailures/n = temp type / lackfit  /* scale=Deviance */ ;
run;
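
The fixed 10% correction above ignores the uncertainty around that figure. One way to explore it, offered only as a rough sketch and not something stated in the original posts, is to redraw the bias many times, refit the model for each draw, and look at how much the `type` estimate moves. The assumptions here are mine: the bias is treated as Normal with mean 0.10 and standard deviation 0.05/1.96 (reading "plus or minus 5%" as a 95% interval), one draw is shared by all rows of a replicate, and `nReps`, the seed, and the data set names `simfail` and `simest` are arbitrary. `PARAM=REF` is also my choice, so that the `type` estimate is a log odds ratio against the last level.

```sas
/* Sketch only: simulate the measurement bias instead of fixing it at 10%. */
%let nReps = 500;                      /* arbitrary number of replicates   */

data simfail;
   call streaminit(20240101);          /* arbitrary seed, for repeatability */
   do rep = 1 to &nReps;
      /* Assumed bias distribution: Normal(0.10, 0.05/1.96) */
      bias = rand('NORMAL', 0.10, 0.05/1.96);
      do i = 1 to nobs;
         set equipmentfail point=i nobs=nobs;
         /* Scale the observed count and keep it within 0..n */
         correctedFailures = min(n, max(0, round((1 - bias)*failures)));
         output;
      end;
   end;
   stop;                               /* required when reading with POINT= */
run;

ods exclude all;                       /* suppress printed output per fit  */
proc logistic data=simfail;
   by rep;
   class type / param=ref;
   model correctedFailures/n = temp type;
   ods output ParameterEstimates=simest;
run;
ods exclude none;

/* Spread of the type effect across the simulated bias values */
proc means data=simest mean std p5 p95;
   where upcase(Variable) = 'TYPE';
   var Estimate;
run;
```

The spread reported by PROC MEANS reflects only the uncertainty in the bias. It comes on top of the sampling error that each individual fit already reports, so the two would need to be combined before drawing conclusions about the difference between equipment types.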

PG

Regular Contributor
Posts: 152