03-30-2018 01:13 AM
Suppose we have a number of column vectors of data, each column representing a case. Suppose 10 such columns.
Rows represent characteristics. Suppose 50 such characteristics.
Some cells in the matrix are empty -- missing data for Case Xi on Characteristic Yj.
The question is: how can one describe the similarities among the given cases?
A parallel example, perhaps: images of people. Everyone has two eyes and a nose, so the cases are similar on those measures. But some have short hair and some long, so they are not similar on that measure.
So, what are the similarities among the cases?
All suggestions and thoughts appreciated.
03-30-2018 10:42 AM
Likely the first step would be to transpose the data, as most SAS procedures expect each row to represent an observation (a case) and each column a separate characteristic.
What types of similarities are you looking for? Between individual cases, or between groups of the characteristics?
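A minimal sketch of that transpose step, assuming the data arrive with one row per characteristic and one column per case. The data set names (wide, cases), the variable names (characteristic, case1-case3) and the values are made up for illustration only.

```sas
/* Hypothetical input: rows are characteristics, columns are cases. */
/* A period (.) marks a missing cell.                               */
data wide;
  input characteristic $ case1-case3;
  datalines;
height 150 160 .
weight 50 . 70
age 12 13 14
;
run;

/* Flip the layout so each row is a case and each column a characteristic. */
/* ID names the new columns from the values of characteristic;             */
/* NAME= names the variable that holds the former column names.            */
proc transpose data=wide out=cases name=case;
  id characteristic;
  var case1-case3;
run;

proc print data=cases;
run;
```

The result has one observation per case (case1, case2, case3) with the variables height, weight and age, and the missing cells carry over as missing values.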
Here is a brief example of comparing age and sex distributions using a supplied SAS data set that you should have available
(not claiming it is the best example, just one way):
proc freq data=sashelp.class;
  tables age*sex / chisq;
run;
The chi-square test checks whether the distribution of sex is the same across age groups. A large Prob value (p-value) would indicate no evidence of a difference, while a small one would indicate there is some difference in the distribution of sex by age.
It would be helpful if you showed some example input and what the result should look like. Approaches would vary depending on whether the characteristics are categorical (sex, for example) or continuous (height measurements), and on whether you are mixing them.
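If the goal is similarity between individual cases, one possible approach (not necessarily the best one) is a pairwise distance matrix from PROC DISTANCE in SAS/STAT. The sketch below assumes the data are already one row per case; the data set names (cases, dist), the variables and the values are made up for illustration. METHOD=GOWER is chosen here because it accepts a mix of continuous and categorical variables. How missing cells are treated depends on the method and options, so check the PROC DISTANCE documentation before relying on the results.

```sas
/* Hypothetical data: one row per case, mixed variable types, some missing. */
data cases;
  input case $ height weight sex $;
  datalines;
c1 150 50 F
c2 160 .  M
c3 .   70 F
c4 155 62 M
;
run;

/* Pairwise Gower dissimilarities between cases.                    */
/* Each variable is listed under its measurement level.             */
/* ID labels the rows and columns of the output matrix by case.     */
proc distance data=cases method=gower out=dist;
  var interval(height weight) nominal(sex);
  id case;
run;

proc print data=dist;
run;
```

Smaller values in dist mean more similar cases. The output data set could then be passed to a clustering procedure such as PROC CLUSTER if groups of similar cases are wanted.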