Background: I need to compute Mahalanobis distances for a dataset with about 200 observations and 17 variables. The goal is to identify the 10 closest observations for each row. The variables have very different scales and are of unequal 'importance', and there are no missing values.

As recommended in "30662 - Mahalanobis distance: from each observation to the mean, from each observation to a specific observation, between all possible pairs", I ran PROC PRINCOMP with the STD option and used prin1:prin17 to compute Euclidean distances. As expected, the first three components accounted for most of the variance. In particular, the first two were highly correlated with the key variables.

It seems that the weight option in the VAR statement could be used to assign greater importance to the first three components. However, I do not see a rational basis for picking reasonable weight values. Any feedback or suggestions will be greatly appreciated. Thanks. RT
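To make the approach concrete, here is a minimal sketch in Python/NumPy (not SAS) of the same idea: Euclidean distance on principal component scores that have been standardized to unit variance equals Mahalanobis distance on the original variables. The function name `mahalanobis_neighbors` and its `weights` argument are hypothetical and only for illustration. In particular, the way `weights` enters the distance here is an assumption and may not match how the SAS weight option is applied. Note that any non-uniform weights mean the result is no longer a Mahalanobis distance.

```python
import numpy as np

def mahalanobis_neighbors(X, k=10, weights=None):
    """Return (pairwise distances, indices of the k nearest rows per row).

    Hypothetical helper: with weights=None the distances are Mahalanobis
    distances; any other weights give a weighted variant instead.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each variable
    cov = np.cov(Xc, rowvar=False)               # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # principal axes of cov
    # Component scores rescaled to unit variance (the role played by the
    # STD option in the post). Assumes cov is full rank.
    scores = (Xc @ eigvecs) / np.sqrt(eigvals)
    if weights is not None:
        # Assumed weighting: d^2 = sum_j w_j * (difference in score j)^2.
        # eigh returns components in ascending order of variance, so the
        # last column is the largest-variance component.
        scores = scores * np.sqrt(np.asarray(weights, dtype=float))
    diff = scores[:, None, :] - scores[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # n x n distance matrix
    masked = dist.copy()
    np.fill_diagonal(masked, np.inf)             # exclude each row itself
    idx = np.argsort(masked, axis=1)[:, :k]      # k closest rows per row
    return dist, idx
```

With about 200 observations the full 200 x 200 distance matrix is small, so computing all pairs directly is reasonable. Because all 17 standardized components are needed to reproduce the Mahalanobis distance, up-weighting the first three is a modelling choice rather than something the method itself prescribes.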