I have a dummy data set of 500 observations and I am trying to fill in values for test result 1 (positive or negative), using test result 2 (also positive or negative) as the predictor. I tried two procedures:
1) Randomly assign 30% of the observations to testing and 70% to training, then use PROC HPLOGISTIC;
2) Set test result 1 to missing for that same 30% and use PROC MI to impute the missing results.
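The setup described in the two steps above could be built roughly as follows. This is only a sketch under assumptions: the input data set `Full`, the output names `Dat_split` and `Dat_mi`, the copy variable `Result_1_true`, and the seed are hypothetical names chosen for illustration. `ROLE`, `Result_1`, and the 'Test'/'Train' values follow the programs posted further down.

```sas
/* Sketch only: Full, Dat_split, Dat_mi, Result_1_true and the seed are hypothetical. */
data Dat_split Dat_mi;
   set Full;
   length ROLE $5;
   if _n_ = 1 then call streaminit(2024);   /* make the split reproducible */
   /* each row has a 30% chance of landing in the test partition */
   if rand('uniform') < 0.3 then ROLE = 'Test';
   else ROLE = 'Train';
   Result_1_true = Result_1;                /* keep the original value for later comparison */
   output Dat_split;                        /* complete data, for the PARTITION statement */
   if ROLE = 'Test' then call missing(Result_1);
   output Dat_mi;                           /* test rows blanked, for PROC MI */
run;
```

Because each row is assigned independently, the test share is only approximately 30%. Both output data sets carry the same `ROLE` assignment, so the two procedures are compared on the same rows.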
The results from the two procedures are very different: HPLOGISTIC has high sensitivity and low specificity, while PROC MI gives the reverse (low sensitivity and high specificity).
If I generate a contingency table between test result 2 (the predictor) and the truth (the complete original test result 1), the result from HPLOGISTIC makes sense and the result from PROC MI does not. I would really like to know why.
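A contingency table like the one described here might be produced with PROC FREQ. This is a minimal sketch: it assumes a data set (here called `Dat_split`, a hypothetical name) that holds the predictor `Result_2`, the partition variable `ROLE`, and an untouched copy of the original outcome (here called `Result_1_true`, also hypothetical).

```sas
/* Sketch only: Dat_split and Result_1_true are hypothetical names. */
proc freq data = Dat_split;
   where ROLE = 'Test';                                 /* held-out 30% only */
   tables Result_2 * Result_1_true / nocol nopercent;   /* keep counts and row percentages */
run;
```

The row percentages show, for each level of `Result_2`, what share of the held-out observations is truly positive or negative.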
Please post the SAS programs you are using. Also, are the missing values in the explanatory variables (X) or in the response variable (Y)?
Missing values are in Y (Result_1) and here are the two programs:
proc hplogistic data = Dat;
   class Result_2 / param = ref;
   partition role = ROLE(test = 'Test' train = 'Train');
   model Result_1 (event = 'Positive') = Result_2;
run;
proc mi data = Dat seed = 123456 nimpute = 100 out = Impute noprint;
   class Result_1 Result_2;
   fcs discrim(Result_1 / details classeffects = include);
   var Result_2 Result_1;
run;
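One way to compare the imputed values with the truth is to cross-tabulate them across all imputations. The sketch below assumes that `Dat` contained `ROLE` and an untouched copy of the original outcome, here given the hypothetical name `Result_1_true`, before PROC MI was run, so that both variables are carried into `Impute`.

```sas
/* Sketch only: Result_1_true is a hypothetical variable assumed to exist in Dat. */
proc freq data = Impute;
   where ROLE = 'Test';                                 /* rows whose Result_1 was imputed */
   tables Result_1_true * Result_1 / nocol nopercent;   /* pooled over all imputations */
run;
```

In this table the row percentage in the Positive/Positive cell is the sensitivity of the imputed values, and the row percentage in the Negative/Negative cell is the specificity. Adding a BY statement on `_Imputation_`, the index variable PROC MI writes to the output data set, would give one table per imputation instead.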