Help using Base SAS procedures

Determine top five winners (Part 2) -- Breakdown of Pairs

Reply
Regular Contributor
Posts: 212

Determine top five winners (Part 2) -- Breakdown of Pairs

Last time I asked the question with SINGLE individuals competing in each of several events, and how to determine the overall ranking of winners.

https://communities.sas.com/thread/44071

(thanks , , , , )

Now I return with things a little more complicated:  Suppose we have PAIRS in each contestant group.

I would like to find a way of ranking the top INDIVIDUALS, even though they are taking part as a PAIR.

In this competition, every individual was paired up round-robin, with every other individual.  This was repeated seven times.... over a period of one year.  We have the list of winners for each round, seven rounds in all.

Unfortunately some pairs missed certain rounds.  Thus, not a complete and even data set.

One might guess that the top PAIR would include the, say, #1 and #2 contestants overall.  My hunch is that such need not be the case.  Could be that a single contestant from each of Pair 1, Pair 2, Pair 3 might be the top rankers overall.

So, again, what we want is a ranking list of all INDIVIDUALS, based on the data set.

Any help and advice is greatly appreciated.

Thanks,

Nicholas Kormanik

=======

Edit by Nicholas:

I've added a new data file.  In it I've separated the 'Pairs' into two columns each, so that one needn't do it in SAS.

'Round 1' becomes 'R1a' and 'R1b', etc.

I tried using Vince28's code and kept getting SAS errors.  Bummer.  Maybe by separating the 'Pairs' the task can be handled more easily.

Again, any help greatly appreciated.

=======

Data set follows (and additionally as an attachment):

OrderRound 1Round 2Round 3Round 4Round 5Round 6Round 7
1DREW_EDGARHENRY_ISAACEDGAR_GRANTCHUCK_GRANTHENRY_ISAACEDGAR_ISAACAARON_FRANK
2GRANT_ISAACCHUCK_HENRYBLAKE_DREWGRANT_HENRYAARON_EDGAREDGAR_HENRYDREW_EDGAR
3DREW_HENRYDREW_ISAACAARON_BLAKEBLAKE_CHUCKFRANK_HENRYAARON_ISAACCHUCK_DREW
4CHUCK_HENRYAARON_EDGARAARON_CHUCKDREW_GRANTAARON_DREWBLAKE_CHUCKFRANK_GRANT
5CHUCK_FRANKFRANK_ISAACAARON_GRANTAARON_GRANTDREW_GRANTFRANK_GRANTAARON_EDGAR
6AARON_EDGARDREW_FRANKBLAKE_ISAACEDGAR_GRANTBLAKE_FRANKDREW_FRANKEDGAR_ISAAC
7BLAKE_GRANTCHUCK_FRANKDREW_GRANTDREW_HENRYBLAKE_HENRYBLAKE_ISAACFRANK_HENRY
8AARON_CHUCKAARON_ISAACGRANT_ISAACEDGAR_ISAACBLAKE_GRANTDREW_ISAACBLAKE_HENRY
9EDGAR_GRANTAARON_HENRYBLAKE_CHUCKAARON_BLAKEAARON_ISAACFRANK_HENRYAARON_BLAKE
10CHUCK_ISAACFRANK_HENRYAARON_FRANKBLAKE_ISAACCHUCK_ISAACAARON_CHUCKCHUCK_ISAAC
11AARON_BLAKEDREW_GRANTBLAKE_HENRYEDGAR_HENRYCHUCK_FRANKDREW_HENRYCHUCK_HENRY
12BLAKE_CHUCKBLAKE_ISAACAARON_EDGARDREW_FRANKDREW_ISAACAARON_GRANTEDGAR_GRANT
13FRANK_HENRYBLAKE_EDGARFRANK_ISAACEDGAR_FRANKFRANK_ISAACGRANT_HENRYBLAKE_DREW
14BLAKE_DREWEDGAR_GRANTDREW_HENRYDREW_EDGARDREW_HENRYEDGAR_GRANTCHUCK_EDGAR
15BLAKE_HENRYAARON_DREWDREW_ISAACAARON_FRANKAARON_HENRYBLAKE_EDGARAARON_ISAAC
16CHUCK_DREWCHUCK_GRANTBLAKE_GRANTCHUCK_EDGARBLAKE_CHUCKBLAKE_DREWDREW_GRANT
17FRANK_GRANTCHUCK_ISAACGRANT_HENRYCHUCK_DREWEDGAR_ISAACDREW_EDGARBLAKE_CHUCK
18FRANK_ISAACAARON_CHUCKEDGAR_FRANKCHUCK_HENRYEDGAR_GRANTAARON_FRANKGRANT_HENRY
19GRANT_HENRYBLAKE_HENRYAARON_ISAACFRANK_GRANTCHUCK_GRANTBLAKE_HENRYBLAKE_ISAAC
20DREW_FRANKAARON_GRANTFRANK_HENRYBLAKE_DREWDREW_FRANKCHUCK_GRANTDREW_ISAAC
21BLAKE_ISAACBLAKE_DREWAARON_HENRYCHUCK_FRANKCHUCK_DREWCHUCK_ISAACAARON_CHUCK
22AARON_DREWGRANT_ISAACCHUCK_EDGARHENRY_ISAACGRANT_HENRYAARON_EDGARGRANT_ISAAC
23AARON_GRANTDREW_EDGARCHUCK_GRANTGRANT_ISAACCHUCK_HENRYCHUCK_DREWEDGAR_HENRY
24HENRY_ISAACEDGAR_HENRYCHUCK_FRANKCHUCK_ISAACCHUCK_EDGARHENRY_ISAACBLAKE_EDGAR
25EDGAR_HENRYFRANK_GRANTDREW_FRANKBLAKE_GRANTAARON_FRANKAARON_HENRYHENRY_ISAAC
26CHUCK_EDGARAARON_FRANKBLAKE_FRANKAARON_HENRYGRANT_ISAACCHUCK_FRANKBLAKE_GRANT
27EDGAR_ISAACDREW_HENRYEDGAR_HENRYFRANK_ISAACAARON_GRANTFRANK_ISAACBLAKE_FRANK
28CHUCK_GRANTAARON_BLAKECHUCK_HENRYAARON_DREWFRANK_GRANTCHUCK_HENRYFRANK_ISAAC
29DREW_GRANTCHUCK_EDGARHENRY_ISAACBLAKE_FRANKBLAKE_EDGARAARON_DREWAARON_DREW
30DREW_ISAACCHUCK_DREWCHUCK_ISAACAARON_CHUCKBLAKE_ISAACDREW_GRANTDREW_FRANK
31AARON_HENRYGRANT_HENRYCHUCK_DREWDREW_ISAACDREW_EDGARBLAKE_FRANKCHUCK_GRANT
32BLAKE_FRANKEDGAR_ISAACBLAKE_EDGARAARON_ISAACAARON_BLAKECHUCK_EDGARAARON_HENRY
33AARON_FRANKBLAKE_FRANKDREW_EDGARBLAKE_HENRYEDGAR_HENRYBLAKE_GRANTDREW_HENRY
34BLAKE_EDGAREDGAR_FRANKEDGAR_ISAACFRANK_HENRYAARON_CHUCKAARON_BLAKEAARON_GRANT
35AARON_ISAACBLAKE_GRANTAARON_DREW*BLAKE_DREWGRANT_ISAACCHUCK_FRANK
36*BLAKE_CHUCKFRANK_GRANT*EDGAR_FRANK*EDGAR_FRANK
Super Contributor
Posts: 339

Re: Determine top five winners (Part 2) -- Breakdown of Pairs

Hi Nicholas,

I just posted a hash table solution that could accomodate your need, at least if I understand properly, in the following thread https://communities.sas.com/message/176069#176069

Obviously, some tweaks would need to be made.

data _null_;

     if _n_=1 then do;

     declare hash h();

     h.defineKey('name');

     h.defineData('name', 'points');

     h.defineDone;

     end;

     set have;

     array round {*} round_1-round_7; /* This effectively allows us to use loops on the 7 rounds rather than hard code them */

     do i=1 to 7;

          if trim(round{i}) NE "*" then do;

          win1 = scan(round{i}, 1, "_");

          win2 = scan(round{i}, 2, "_");

          roundpoints = max(0, 10-order+1);

          rc=h.find(name: win1);

          if rc NE 0 then do;

               name=win1;

               points=roundpoints;

               h.add();

          end; else do;

               points=points+roundpoints;

               h.replace();

          end;

          rc=h.find(name: win2);

          if rc NE 0 then do;

               name=win2;

               points=roundpoints;

               h.add();

          end; else do;

               points=points+roundpoints;

               h.replace();

          end;

          end;

     end;  

     rc=h.output(dataset: "work.want");

     drop rc win1 win2 points;

run;

Here, you are using a hash object to add a new member each time a new name is encountered in the dataset initiating his "wins" value to 1. Afterwards, each time the same name shows in the DS, the 'wins' value is incremented by 1. The resulting data is stored in the hash object hence, the "want" data is output via the hash output method. This will create a dataset with columns 'name' and 'wins'.

It is currently untested but the logic and syntax element is all there.

Vincent

*Edit*

Reading Astounding's comment and going through the post you've linked, I believe I had misinterpreted your data. I believe the biggest question which we can't answer for you is how do you measure performance in each round. For instance, you could consider the  sum of all orders for a given individual to be the best indicator of his performances (ascending). Or, similar to, say, F1 standings, you can attribute a score value per position and set everything below a certain position to 0 and then take the total score per round or the best score per round per individual etc.

For the sake of providing a better code sample, I've updated the above code to output a dataset that "scores" each round as follow: Top 10 positions gives 10/9.../1 points resp. The sum of all points on all rounds gives the score of an individual. You can then simply sort the Want dataset by descending order to rank players from best to worst.

Super User
Posts: 5,099

Re: Determine top five winners (Part 2) -- Breakdown of Pairs

Well, if you already have a solution for individuals instead of pairs, you might piggyback on that.  For the pairs data, turn each observation into four observations:  (A + B) vs. (C + D) becomes A vs. C, A vs. D, B vs. C, and B vs. D.  Then apply your original solutions for individuals, but count each win as 0.5 instead of 1.

Good luck.

Respected Advisor
Posts: 2,655

Re: Determine top five winners (Part 2) -- Breakdown of Pairs

This is a great idea, as it is immediately generalizable to n-tuples on the one hand, and to weighted rankings based on opponents' record on the other.

Steve Denham

Super Contributor
Posts: 339

Re: Determine top five winners (Part 2) -- Breakdown of Pairs

I'm not sure to follow or rather, the way I understand the data in the original post, I don't think each records are presenting matches of a pair vs another pair. It reads to me more like a "race against the clock" by pair and then each pair is ranked based on the time they put up.

Maybe I really misinterpreted the data but given that there are 8 distinct names or competitors and thus sum_(i=1)^(n-1) i = n*(n+1)/2 = 36 distinct pairs and that for each round robin, there are 36 records minus missings, this truly reads to me as though each round results are ordered by best pair performance to worst pair performance similar to the race against the clock concept. I can't seem to think of any team v team based competition in a round robin setup of all distinct pairs matchup that could result in 36 single matchup winners.

If you considered a round robin of matches with a single winer on a multinom(8,2,2,2,2) scenario, you'd have to have  7*6*5*4*3=2520 matches or 1/3 of that if the records represented the winner of a round robin within a round robin to effectively give even chances to each competitor of facing every other possible team configuration within a given team.

Anyway I'll watch the thread closely I'm quite intrigued by the concept

Vincent

Regular Contributor
Posts: 212

Re: Determine top five winners (Part 2) -- Breakdown of Pairs

Thanks Vincent , , for the input.

Vincent asks, "How do we measure performance in each round?"

Assume only what's given -- the 'Order' the 'Pair' came in for each 'Round'.  Seems reasonable to sum 'Order' across all 'Rounds' for each individual, and compare grand sums, ascending, for overall rank orderings.

It should be pointed out that the data set here was created just for this post.  The actual data set is far larger -- more than 100 'Contestants' x 100 'Rounds'.  Thus Astounding's suggestion may not be practicable, because the resulting numbers of combinations would be so large. I'm not sure how to code it anyway.  If Astounding or Steve wants to take a crack at it.....  Super.

A "race against the clock" is one good way of viewing the contest.  Thanks Vincent.  For pairs..., lowest time sum wins.  Pairs, along this line, could be broken down as well.  Lowest grand time sum wins.  Of course, we lack the 'Time' data.  Only have the 'Rounds' rankings.

https://communities.sas.com/people/Vince28%40Statcan

Regular Contributor
Posts: 212

Re: Determine top five winners (Part 2) -- Breakdown of Pairs

An additional question I'd like to ask regarding the data set is:

Is it possible to derive any additional information from the results, besides the primary objective of determining individual 'winners'?

Any procedure in SAS come to mind, to bring out a pattern, that is not so obvious, but that statistically is present?

Regular Contributor
Posts: 212

Re: Determine top five winners (Part 2) -- Breakdown of Pairs

Finally, as a motivator of sorts, let's imagine the data set representing professional basketball. And the top five contestants will be awarded multi-million dollar contracts.

Huge money awaits the grand winners.

All up to our statistical pronouncements.

Ask a Question
Discussion stats
  • 7 replies
  • 235 views
  • 0 likes
  • 4 in conversation