Re: Determine top five winners (Part 2) -- Breakdown of Pairs

NKormanik · Posted 08-01-2013 01:07 AM

Last time I asked the question with SINGLE individuals competing in each of several events, and how to determine the overall ranking of winners.

https://communities.sas.com/thread/44071

(thanks , , , , )

Now I return with things a little more complicated: Suppose we have PAIRS in each contestant group.

I would like to find a way of ranking the top INDIVIDUALS, even though they are taking part as a PAIR.

In this competition, every individual was paired up round-robin, with every other individual. This was repeated seven times.... over a period of one year. We have the list of winners for each round, seven rounds in all.

Unfortunately some pairs missed certain rounds. Thus, not a complete and even data set.

One might guess that the top PAIR would include the, say, #1 and #2 contestants overall. My hunch is that such need not be the case. Could be that a single contestant from each of Pair 1, Pair 2, Pair 3 might be the top rankers overall.

So, again, what we want is a ranking list of all INDIVIDUALS, based on the data set.

Any help and advice is greatly appreciated.

Thanks,

Nicholas Kormanik

=======

Edit by Nicholas:

I've added a new data file. In it I've separated the 'Pairs' into two columns each, so that one needn't do it in SAS.

'Round 1' becomes 'R1a' and 'R1b', etc.

I tried using Vince28's code and kept getting SAS errors. Bummer. Maybe by separating the 'Pairs' the task can be handled more easily.

Again, any help greatly appreciated.

=======

Data set follows (and additionally as an attachment):

Order	Round 1	Round 2	Round 3	Round 4	Round 5	Round 6	Round 7
1	DREW_EDGAR	HENRY_ISAAC	EDGAR_GRANT	CHUCK_GRANT	HENRY_ISAAC	EDGAR_ISAAC	AARON_FRANK
2	GRANT_ISAAC	CHUCK_HENRY	BLAKE_DREW	GRANT_HENRY	AARON_EDGAR	EDGAR_HENRY	DREW_EDGAR
3	DREW_HENRY	DREW_ISAAC	AARON_BLAKE	BLAKE_CHUCK	FRANK_HENRY	AARON_ISAAC	CHUCK_DREW
4	CHUCK_HENRY	AARON_EDGAR	AARON_CHUCK	DREW_GRANT	AARON_DREW	BLAKE_CHUCK	FRANK_GRANT
5	CHUCK_FRANK	FRANK_ISAAC	AARON_GRANT	AARON_GRANT	DREW_GRANT	FRANK_GRANT	AARON_EDGAR
6	AARON_EDGAR	DREW_FRANK	BLAKE_ISAAC	EDGAR_GRANT	BLAKE_FRANK	DREW_FRANK	EDGAR_ISAAC
7	BLAKE_GRANT	CHUCK_FRANK	DREW_GRANT	DREW_HENRY	BLAKE_HENRY	BLAKE_ISAAC	FRANK_HENRY
8	AARON_CHUCK	AARON_ISAAC	GRANT_ISAAC	EDGAR_ISAAC	BLAKE_GRANT	DREW_ISAAC	BLAKE_HENRY
9	EDGAR_GRANT	AARON_HENRY	BLAKE_CHUCK	AARON_BLAKE	AARON_ISAAC	FRANK_HENRY	AARON_BLAKE
10	CHUCK_ISAAC	FRANK_HENRY	AARON_FRANK	BLAKE_ISAAC	CHUCK_ISAAC	AARON_CHUCK	CHUCK_ISAAC
11	AARON_BLAKE	DREW_GRANT	BLAKE_HENRY	EDGAR_HENRY	CHUCK_FRANK	DREW_HENRY	CHUCK_HENRY
12	BLAKE_CHUCK	BLAKE_ISAAC	AARON_EDGAR	DREW_FRANK	DREW_ISAAC	AARON_GRANT	EDGAR_GRANT
13	FRANK_HENRY	BLAKE_EDGAR	FRANK_ISAAC	EDGAR_FRANK	FRANK_ISAAC	GRANT_HENRY	BLAKE_DREW
14	BLAKE_DREW	EDGAR_GRANT	DREW_HENRY	DREW_EDGAR	DREW_HENRY	EDGAR_GRANT	CHUCK_EDGAR
15	BLAKE_HENRY	AARON_DREW	DREW_ISAAC	AARON_FRANK	AARON_HENRY	BLAKE_EDGAR	AARON_ISAAC
16	CHUCK_DREW	CHUCK_GRANT	BLAKE_GRANT	CHUCK_EDGAR	BLAKE_CHUCK	BLAKE_DREW	DREW_GRANT
17	FRANK_GRANT	CHUCK_ISAAC	GRANT_HENRY	CHUCK_DREW	EDGAR_ISAAC	DREW_EDGAR	BLAKE_CHUCK
18	FRANK_ISAAC	AARON_CHUCK	EDGAR_FRANK	CHUCK_HENRY	EDGAR_GRANT	AARON_FRANK	GRANT_HENRY
19	GRANT_HENRY	BLAKE_HENRY	AARON_ISAAC	FRANK_GRANT	CHUCK_GRANT	BLAKE_HENRY	BLAKE_ISAAC
20	DREW_FRANK	AARON_GRANT	FRANK_HENRY	BLAKE_DREW	DREW_FRANK	CHUCK_GRANT	DREW_ISAAC
21	BLAKE_ISAAC	BLAKE_DREW	AARON_HENRY	CHUCK_FRANK	CHUCK_DREW	CHUCK_ISAAC	AARON_CHUCK
22	AARON_DREW	GRANT_ISAAC	CHUCK_EDGAR	HENRY_ISAAC	GRANT_HENRY	AARON_EDGAR	GRANT_ISAAC
23	AARON_GRANT	DREW_EDGAR	CHUCK_GRANT	GRANT_ISAAC	CHUCK_HENRY	CHUCK_DREW	EDGAR_HENRY
24	HENRY_ISAAC	EDGAR_HENRY	CHUCK_FRANK	CHUCK_ISAAC	CHUCK_EDGAR	HENRY_ISAAC	BLAKE_EDGAR
25	EDGAR_HENRY	FRANK_GRANT	DREW_FRANK	BLAKE_GRANT	AARON_FRANK	AARON_HENRY	HENRY_ISAAC
26	CHUCK_EDGAR	AARON_FRANK	BLAKE_FRANK	AARON_HENRY	GRANT_ISAAC	CHUCK_FRANK	BLAKE_GRANT
27	EDGAR_ISAAC	DREW_HENRY	EDGAR_HENRY	FRANK_ISAAC	AARON_GRANT	FRANK_ISAAC	BLAKE_FRANK
28	CHUCK_GRANT	AARON_BLAKE	CHUCK_HENRY	AARON_DREW	FRANK_GRANT	CHUCK_HENRY	FRANK_ISAAC
29	DREW_GRANT	CHUCK_EDGAR	HENRY_ISAAC	BLAKE_FRANK	BLAKE_EDGAR	AARON_DREW	AARON_DREW
30	DREW_ISAAC	CHUCK_DREW	CHUCK_ISAAC	AARON_CHUCK	BLAKE_ISAAC	DREW_GRANT	DREW_FRANK
31	AARON_HENRY	GRANT_HENRY	CHUCK_DREW	DREW_ISAAC	DREW_EDGAR	BLAKE_FRANK	CHUCK_GRANT
32	BLAKE_FRANK	EDGAR_ISAAC	BLAKE_EDGAR	AARON_ISAAC	AARON_BLAKE	CHUCK_EDGAR	AARON_HENRY
33	AARON_FRANK	BLAKE_FRANK	DREW_EDGAR	BLAKE_HENRY	EDGAR_HENRY	BLAKE_GRANT	DREW_HENRY
34	BLAKE_EDGAR	EDGAR_FRANK	EDGAR_ISAAC	FRANK_HENRY	AARON_CHUCK	AARON_BLAKE	AARON_GRANT
35	AARON_ISAAC	BLAKE_GRANT	AARON_DREW	*	BLAKE_DREW	GRANT_ISAAC	CHUCK_FRANK
36	*	BLAKE_CHUCK	FRANK_GRANT	*	EDGAR_FRANK	*	EDGAR_FRANK

Vince28_Statcan · Posted 08-01-2013 09:15 AM

Hi Nicholas,

I just posted a hash table solution that could accomodate your need, at least if I understand properly, in the following thread https://communities.sas.com/message/176069#176069

Obviously, some tweaks would need to be made.

data _null_;

if _n_=1 then do;

declare hash h();

h.defineKey('name');

h.defineData('name', 'points');

h.defineDone;

end;

set have;

array round {*} round_1-round_7; /* This effectively allows us to use loops on the 7 rounds rather than hard code them */

do i=1 to 7;

if trim(round{i}) NE "*" then do;

win1 = scan(round{i}, 1, "_");

win2 = scan(round{i}, 2, "_");

roundpoints = max(0, 10-order+1);

rc=h.find(name: win1);

if rc NE 0 then do;

name=win1;

points=roundpoints;

h.add();

end; else do;

points=points+roundpoints;

h.replace();

end;

rc=h.find(name: win2);

if rc NE 0 then do;

name=win2;

points=roundpoints;

h.add();

end; else do;

points=points+roundpoints;

h.replace();

end;

rc=h.output(dataset: "work.want");

drop rc win1 win2 points;

run;

Here, you are using a hash object to add a new member each time a new name is encountered in the dataset initiating his "wins" value to 1. Afterwards, each time the same name shows in the DS, the 'wins' value is incremented by 1. The resulting data is stored in the hash object hence, the "want" data is output via the hash output method. This will create a dataset with columns 'name' and 'wins'.

It is currently untested but the logic and syntax element is all there.

Vincent

*Edit*

Reading Astounding's comment and going through the post you've linked, I believe I had misinterpreted your data. I believe the biggest question which we can't answer for you is how do you measure performance in each round. For instance, you could consider the sum of all orders for a given individual to be the best indicator of his performances (ascending). Or, similar to, say, F1 standings, you can attribute a score value per position and set everything below a certain position to 0 and then take the total score per round or the best score per round per individual etc.

For the sake of providing a better code sample, I've updated the above code to output a dataset that "scores" each round as follow: Top 10 positions gives 10/9.../1 points resp. The sum of all points on all rounds gives the score of an individual. You can then simply sort the Want dataset by descending order to rank players from best to worst.

Astounding · Posted 08-01-2013 10:00 AM

Well, if you already have a solution for individuals instead of pairs, you might piggyback on that. For the pairs data, turn each observation into four observations: (A + B) vs. (C + D) becomes A vs. C, A vs. D, B vs. C, and B vs. D. Then apply your original solutions for individuals, but count each win as 0.5 instead of 1.

Good luck.

SteveDenham · Posted 08-01-2013 12:57 PM

This is a great idea, as it is immediately generalizable to n-tuples on the one hand, and to weighted rankings based on opponents' record on the other.

Steve Denham

Vince28_Statcan · Posted 08-01-2013 01:40 PM

I'm not sure to follow or rather, the way I understand the data in the original post, I don't think each records are presenting matches of a pair vs another pair. It reads to me more like a "race against the clock" by pair and then each pair is ranked based on the time they put up.

Maybe I really misinterpreted the data but given that there are 8 distinct names or competitors and thus sum_(i=1)^(n-1) i = n*(n+1)/2 = 36 distinct pairs and that for each round robin, there are 36 records minus missings, this truly reads to me as though each round results are ordered by best pair performance to worst pair performance similar to the race against the clock concept. I can't seem to think of any team v team based competition in a round robin setup of all distinct pairs matchup that could result in 36 single matchup winners.

If you considered a round robin of matches with a single winer on a multinom(8,2,2,2,2) scenario, you'd have to have 7*6*5*4*3=2520 matches or 1/3 of that if the records represented the winner of a round robin within a round robin to effectively give even chances to each competitor of facing every other possible team configuration within a given team.

Anyway I'll watch the thread closely I'm quite intrigued by the concept

Vincent

NKormanik · Posted 08-02-2013 12:18 AM

Thanks Vincent , , for the input.

Vincent asks, "How do we measure performance in each round?"

Assume only what's given -- the 'Order' the 'Pair' came in for each 'Round'. Seems reasonable to sum 'Order' across all 'Rounds' for each individual, and compare grand sums, ascending, for overall rank orderings.

It should be pointed out that the data set here was created just for this post. The actual data set is far larger -- more than 100 'Contestants' x 100 'Rounds'. Thus Astounding's suggestion may not be practicable, because the resulting numbers of combinations would be so large. I'm not sure how to code it anyway. If Astounding or Steve wants to take a crack at it..... Super.

A "race against the clock" is one good way of viewing the contest. Thanks Vincent. For pairs..., lowest time sum wins. Pairs, along this line, could be broken down as well. Lowest grand time sum wins. Of course, we lack the 'Time' data. Only have the 'Rounds' rankings.

https://communities.sas.com/people/Vince28%40Statcan

NKormanik · Posted 08-02-2013 12:23 AM

An additional question I'd like to ask regarding the data set is:

Is it possible to derive any additional information from the results, besides the primary objective of determining individual 'winners'?

Any procedure in SAS come to mind, to bring out a pattern, that is not so obvious, but that statistically is present?

NKormanik · Posted 08-02-2013 12:27 AM

Finally, as a motivator of sorts, let's imagine the data set representing professional basketball. And the top five contestants will be awarded multi-million dollar contracts.

Huge money awaits the grand winners.

All up to our statistical pronouncements.

SAS Innovate 2025: Call for Content

Classroom Training Available!