01-18-2013 11:30 AM
I am trying to analyze some ranked data, but I am not sure of the best way to go about it.
I have a variable with 34 possible choices. Each respondent selects their top 5 choices and ranks them in order of preference (1 is the highest preference, 2 is the second, etc.). My goal is to compare respondents and see whether any of them are similar to one another. I can easily tell whether a certain choice is chosen more often than others by running a frequency table in PROC FREQ, but I am not sure how to incorporate the ranks. Does anyone know of a statistical approach for this kind of analysis? I really have no idea where to even begin.
01-19-2013 04:34 PM
I don't have an answer to your problem, because it first requires that you specify what you mean by persons being similar to one another.
If you have 34 choices and select 5 of them to rank from 1 to 5, the total number of possible ranked selections is the number of permutations of 34 items taken 5 at a time: 34!/(34-5)! = 34*33*32*31*30 = 33,390,720. Because permutations preserve order, two persons could select the same set of five items but rank them differently, so that they would be identical in their choices but not in how they ranked those choices. Thus, you should specify how you determine that someone is "similar" to someone else, and which criterion matters most to you: similarity in the choices or similarity in the ranking of those choices. For example, persons A and B could have selected the same five choices but ranked them in exactly opposite order; person C could have selected four of the same five choices as person B and ranked those four identically to person B. Is person A more similar to person B than person C is, or vice versa?
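To make the A/B/C example concrete, here is a rough sketch in Python (not SAS) that checks the permutation count and scores the three persons on two separate criteria. Everything in it is an assumption for illustration: the item IDs, the helper names `set_overlap` and `order_agreement`, and the two measures themselves (Jaccard overlap of the chosen sets, and the fraction of shared item pairs ranked in the same relative order). They are just one possible way to define "similar", not a recommended method.

```python
from itertools import combinations
from math import perm

# Number of ordered top-5 selections from 34 choices: 34*33*32*31*30
n_rankings = perm(34, 5)

def set_overlap(r1, r2):
    """Jaccard similarity of the chosen items, ignoring rank."""
    a, b = set(r1), set(r2)
    return len(a & b) / len(a | b)

def order_agreement(r1, r2):
    """Fraction of shared item pairs that both persons rank in the
    same relative order. Returns None if fewer than 2 items are shared."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    shared = [item for item in r1 if item in pos2]
    pairs = list(combinations(shared, 2))
    if not pairs:
        return None
    same = sum((pos1[x] < pos1[y]) == (pos2[x] < pos2[y]) for x, y in pairs)
    return same / len(pairs)

# Hypothetical data: each list holds item IDs in rank order (first = rank 1).
person_b = [1, 2, 3, 4, 5]
person_a = [5, 4, 3, 2, 1]   # same five items as B, opposite order
person_c = [1, 2, 3, 4, 6]   # four of B's items, ranked as B ranked them

print(n_rankings)
print("A vs B:", set_overlap(person_a, person_b), order_agreement(person_a, person_b))
print("C vs B:", set_overlap(person_c, person_b), order_agreement(person_c, person_b))
```

Under these assumed measures, A matches B perfectly on choices but not at all on order, while C matches B on order for the shared items but only partly on choices. Which of the two should count as "more similar" is exactly the decision you have to make before picking a method.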