My dataset has 10 variables and 2000 cases. All variables are continuous. I would like to "standardize" each variable (column), then average the 10 standardized columns for each case and compare the resulting averages across cases, say, by sorting from high to low.
I know that the data for several of the variables are bimodal rather than centered, with more observations occurring at the extremes.
I'm wondering what the best standardization method might be. SAS offers several: STD, MAD, IQR, ABW, and others.
STD is the common choice, converting each value to a z-score: (X1 - mean of X1) / standard deviation of X1. Some of the others are apparently more 'robust' with respect to outliers and, I suppose, certain other data anomalies.
I'm tentatively thinking of using one of the more esoteric 'robust' ones, such as IQR, based on an example given in SAS documentation.
http://support.sas.com/documentation/cdl/en/statug/68162/HTML/default/viewer.htm#statug_stdize_gettingstarted.htm
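For concreteness, here is a rough sketch of the procedure I have in mind, written in Python rather than SAS so that each step is visible. It is only an illustration under some assumptions: the function names and the toy data are made up, I am assuming the IQR method centers on the median and scales by the interquartile range (which is how I read the SAS documentation), and Python's quartile rule may not match the one SAS uses.

```python
import statistics


def standardize_std(values):
    """Z-score: (x - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]


def standardize_iqr(values):
    """Robust variant (assumed): (x - median) / interquartile range."""
    med = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    return [(x - med) / (q3 - q1) for x in values]


def rank_cases(columns, standardize):
    """Standardize each column, average per case, sort high to low.

    Returns a list of (case_index, average) pairs.
    """
    scaled = [standardize(col) for col in columns]
    n_cases = len(columns[0])
    averages = [
        statistics.mean([col[i] for col in scaled]) for i in range(n_cases)
    ]
    return sorted(enumerate(averages), key=lambda pair: pair[1], reverse=True)


# Made-up toy data: 2 variables, 5 cases; the first variable has an outlier.
columns = [
    [1.0, 2.0, 3.0, 4.0, 100.0],
    [10.0, 20.0, 30.0, 40.0, 50.0],
]

ranking_std = rank_cases(columns, standardize_std)
ranking_iqr = rank_cases(columns, standardize_iqr)
```

Comparing ranking_std with ranking_iqr on the real data would show how much the choice of method changes the ordering of cases, which is really what I am trying to decide.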
I'd greatly appreciate hearing your thoughts or suggestions on how best to proceed.
Nicholas Kormanik