hanson4022
Calcite | Level 5

I work with logistic regression quite a bit, but I haven't come across anything in the SAS documentation that outlines how to do this. Here's a quick example data set. Let's say I'm measuring the failure rate of two pieces of equipment at different temperatures to see if an older model (type) has a significantly higher failure rate:

DATA equipmentfail;
   INPUT type temp n failures;
   DATALINES;
1 90 20 2
1 100 20 7
1 110 20 12
1 120 20 17
2 90 20 5
2 100 20 10
2 110 20 15
2 120 20 20
;
RUN;

proc logistic data=equipmentfail plots=effect;
   class type;
   model failures/n = temp type;
run;

In this case, there is a significant difference in failure rates between equipment types. However, what I'm trying to figure out is what to do if the way you measure failure is biased. Let's say it is known that the method for measuring failure overpredicts failure by an average of 10%, plus or minus 5% (95% confidence interval). At this point, you can reduce the number of failure events by the known average bias of 10%, but that doesn't account for the variation around that average. Is there some way in SAS procedures like PROC LOGISTIC to account for this variation, or is it something that likely has to be done by hand? Thanks.

PGStats
Opal | Level 21

The error process that you describe would result in a bias in the number of failures AND in extra variation in the observed number of failures. You can correct for the bias using your best estimate (10%) and check for extra variation with goodness-of-fit statistics. If the Deviance/DF ratio is greater than 1, you can account for overdispersion by adding the SCALE=Deviance option to your MODEL statement. That is definitely not the case in your example data, so the option is commented out below:

DATA equipmentfail;
INPUT type temp n failures;
correctedFailures = round(0.9*failures);   
DATALINES;
1 90 20 2
1 100 20 7
1 110 20 12
1 120 20 17
2 90 20 5
2 100 20 10
2 110 20 15
2 120 20 20
;

proc logistic data=equipmentfail plots=effect;
class type;
model correctedFailures/n = temp type / lackfit  /* scale=Deviance */ ;
run;
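
To also carry the uncertainty around the 10% figure into the type effect, one possible approach is a small simulation: draw a bias value for each replicate, correct the counts with it, refit the model, and look at how much the estimate moves. The following is only a rough sketch, not a built-in PROC LOGISTIC feature. It assumes the "10% plus or minus 5%" interval can be treated as a normal distribution with standard deviation 0.05/1.96, and it applies the correction the same way as above (failures times one minus the bias). The names biasSim, rep, bias, simFailures, and simEst, the seed, and the 500 replicates are all illustrative choices.

/* Sketch: propagate uncertainty in the measurement bias by simulation. */
/* ASSUMPTION: bias ~ Normal(0.10, 0.05/1.96); one draw per replicate.  */
data biasSim;
   call streaminit(12345);                    /* arbitrary seed          */
   do rep = 1 to 500;                         /* arbitrary replicate count */
      bias = rand('NORMAL', 0.10, 0.05/1.96);
      do i = 1 to nobs;
         set equipmentfail point=i nobs=nobs; /* reread every data row   */
         /* same style of correction as above, kept within 0..n          */
         simFailures = min(n, max(0, round((1 - bias)*failures)));
         output;
      end;
   end;
   stop;                                      /* required with POINT=    */
run;

ods exclude all;                              /* suppress printed output */
proc logistic data=biasSim;
   by rep;
   class type;
   model simFailures/n = temp type;
   ods output ParameterEstimates=simEst;      /* one row per term per rep */
run;
ods exclude none;

/* Spread of the type coefficient across the simulated bias values */
proc means data=simEst mean std p5 p95;
   where upcase(Variable) = 'TYPE';
   var Estimate;
run;

The spread reported by PROC MEANS reflects only the uncertainty in the bias. The sampling error of each fit is still in the StdErr column of simEst, so the two sources would have to be combined if a single overall standard error is wanted.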

PG

1zmm
Quartz | Level 8

Read the following references:

Magder LS, Hughes JP. Logistic regression when the outcome is measured with uncertainty. American Journal of Epidemiology 1997;146(2):195-203.

Neuhaus JM. Bias and efficiency loss due to misclassified responses in binary regression. Biometrika 1999;86(4):843-855.
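
For a rough idea of what a misclassification-adjusted fit can look like in SAS, here is a minimal PROC NLMIXED sketch. It is a generic adjusted likelihood and not necessarily the estimator used in either reference. The observed failure probability is written as (1 - specificity) + (sensitivity + specificity - 1) times the true probability. The sensitivity and specificity values below are hypothetical placeholders, not numbers derived from the 10% figure in the question, and the parameter names b0, bTemp, and bType are illustrative. Temperature is centered and rescaled only to help the optimizer.

/* Sketch: logistic model with an outcome that can be misclassified.  */
/* ASSUMPTION: sens and spec are known constants (hypothetical here). */
proc nlmixed data=equipmentfail;
   parms b0=0 bTemp=0 bType=0;
   sens = 1.00;                        /* hypothetical: no missed failures   */
   spec = 0.95;                        /* hypothetical: 5% false alarms      */
   eta   = b0 + bTemp*(temp - 105)/10 + bType*(type = 1);
   pTrue = 1 / (1 + exp(-eta));        /* true failure probability           */
   pObs  = (1 - spec) + (sens + spec - 1)*pTrue;  /* observed probability    */
   model failures ~ binomial(n, pObs);
run;

Here bType compares type 1 with type 2 on the logit scale of the true failure probability, and bTemp is the change per 10 degrees.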
