I have a dataset with about 10 independent variables and one dichotomous dependent variable. I have done most of the EDA on the dataset: removing extreme values, standardizing input variables, imputing missing values, testing for collinearity, etc. Regardless of how much I clean my data, my logit model keeps failing the Hosmer-Lemeshow (HL) goodness-of-fit test. The area under the ROC curve is good at 0.82, outliers were removed after I checked the leverage, displacement, and other diagnostic plots, and the other association statistics look pretty decent. I can't figure out why the HL result is so bad. I even sorted the input dataset several different ways to see if the grouping was the culprit, to no avail. Any ideas?