Dear community,
I am trying to figure out how to perform donor imputation by minimum distance within groups.
Unfortunately, it seems that PROC SURVEYIMPUTE does not do that (only random imputation).
Is there another direct option?
If not, can you suggest appropriate ways to do it?
One issue is that the data set is very large, with about 30 variables and 20-30 million records.
Any hint is greatly appreciated.
Thank you very much in advance.
I may not understand all of the constraints for minimum distance donor imputation, but it sounds a lot like what is referred to in PROC MI as fully conditional specification (FCS) predictive mean matching. I based this on the Details section on this method in the PROC MI documentation. It looks to me like a predicted mean for the missing value is estimated via regression, and then the K closest values are used as a basis set from which a value is randomly selected. Does that fit? I suppose you could find the minimum distance replacement value by setting K=1. The last two paragraphs in the Details point out the advantages/disadvantages of large and small K, and seem to imply that this method is more robust to the assumption of normality.
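To make the idea concrete, here is a minimal sketch of predictive mean matching with K=1, written in plain Python rather than SAS purely as an illustration. It assumes a single predictor, and it skips the step where the regression parameters are drawn at random, so it is a deterministic simplification and not what PROC MI actually runs. The function name `pmm_k1` and the toy data are hypothetical.

```python
def pmm_k1(x, y):
    """Deterministic predictive-mean-matching sketch: one predictor, K=1.

    x: list of predictor values (no missing values assumed).
    y: list of target values, with None marking a missing value.
    """
    # Indices of records where the target is observed (the potential donors).
    obs = [i for i, v in enumerate(y) if v is not None]
    n = len(obs)

    # Ordinary least squares fit of y on x using the observed records only.
    mean_x = sum(x[i] for i in obs) / n
    mean_y = sum(y[i] for i in obs) / n
    sxx = sum((x[i] - mean_x) ** 2 for i in obs)
    sxy = sum((x[i] - mean_x) * (y[i] - mean_y) for i in obs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Predicted mean for every record, observed or not.
    pred = [intercept + slope * xi for xi in x]

    # For each missing value, copy the observed y of the donor whose
    # predicted mean is closest (K=1, so there is no random draw).
    filled = list(y)
    for i, v in enumerate(y):
        if v is None:
            donor = min(obs, key=lambda j: abs(pred[j] - pred[i]))
            filled[i] = y[donor]
    return filled


x = [1.0, 2.0, 3.0, 4.0, 2.2]
y = [2.1, 3.9, 6.2, 8.0, None]
filled = pmm_k1(x, y)
```

Note that the donor is chosen by closeness of predicted means, not by distance on the original variables, and that the imputed value is always a value that was actually observed.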
SteveDenham
Thank you Steve for the quick reply.
I had a brief look at it. However, the method I am trying to apply requires choosing donors based on the actual values of the variables rather than on a synthetic measure such as the predictive mean. Moreover, I would like to have a data set with the ID of the chosen donor for each recipient.
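To show what I mean, here is a rough sketch of the logic in plain Python (not SAS, just to describe the method). The names `nearest_donor_impute`, `id`, `grp`, `x1`, `x2` and `y` are made up for the example, the distance is plain Euclidean on unscaled variables, and ties go to the first donor found; all of these are assumptions, not requirements.

```python
from collections import defaultdict
from math import sqrt


def nearest_donor_impute(records, group_key, match_vars, target):
    """Fill missing `target` values from the closest complete record in the same group.

    records: list of dicts, each with an "id" key; None marks a missing target.
    Returns (imputed_records, donor_map), where donor_map maps
    recipient id -> donor id.
    """
    # Donors are the records with an observed target, bucketed by group.
    donors_by_group = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors_by_group[rec[group_key]].append(rec)

    imputed, donor_map = [], {}
    for rec in records:
        new_rec = dict(rec)  # copy so the input is left unchanged
        if rec[target] is None:
            candidates = donors_by_group.get(rec[group_key], [])
            if candidates:  # a group with no donors stays missing
                # Euclidean distance on the actual matching variables.
                best = min(
                    candidates,
                    key=lambda d: sqrt(
                        sum((rec[v] - d[v]) ** 2 for v in match_vars)
                    ),
                )
                new_rec[target] = best[target]
                donor_map[rec["id"]] = best["id"]
        imputed.append(new_rec)
    return imputed, donor_map


records = [
    {"id": 1, "grp": "A", "x1": 1.0, "x2": 2.0, "y": 10.0},
    {"id": 2, "grp": "A", "x1": 5.0, "x2": 5.0, "y": 20.0},
    {"id": 3, "grp": "A", "x1": 1.5, "x2": 2.5, "y": None},
    {"id": 4, "grp": "B", "x1": 1.4, "x2": 2.4, "y": 99.0},
    {"id": 5, "grp": "B", "x1": 9.0, "x2": 9.0, "y": None},
]
imputed, donor_map = nearest_donor_impute(records, "grp", ["x1", "x2"], "y")
```

In this toy data, record 3 takes its value from record 1 even though record 4 is closer, because record 4 belongs to a different group. The search here compares every recipient with every donor in its group, which would probably be too slow for 20-30 million records unless the groups are small, so some kind of spatial index or blocking would likely be needed in practice.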