BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GoBrewers14
Calcite | Level 5

I have a file of several hundred backgammon matches and I want to use SAS to calculate a rating for

each player, prior to each match, based on their past performance.  I have the date, the names of each

player, the outcome for black (1=Win for Black, 0=Win for White) and the number of points in

each match.  The file looks like this…

==============================================================================

Date                       Black                     White    W/L        Match Length    Black_Rating      White_Rating

==============================================================================

1/1/2001              Frank                     John      1                              7              100                         100

1/2/2001              Bill                          Jeff        0                              5              100                         100

1/4/2001              Jeff                        Bill          1                              3              101                         99

1/5/2001              John                      Frank     0                              5              99                           101

1/6/2001              Bill                          John      0                              9              100                         98

Etc…..

I want to calculate a modified version of the Elo system and start everyone’s rating at 100.  

So Frank and John have never played so both rating would be 100 and the same for the next row containing Bill and

Jeff’s match.  For simplicity, let’s say if you win a match you add 1 to your score and subtract 1 for a loss; so, row 3

would read 101 and 99 respectively. 

My problem is I want SAS to recognize who the two opponents are, and put their current ratings in the corresponding

columns then based on the outcome of the match, calculate and remember the new rating. 

I’ve tried using RETAIN but my problem arises from the fact that any person can be in the black or white column.

Any ideas?

Thanks,

John

1 ACCEPTED SOLUTION

Accepted Solutions
Vince28_Statcan
Quartz | Level 8

If your pool of player is reasonably small to divide into cases, you can use a _temporary_ array to retain the latest rating result of each player and update it & your data step variables at the same time.

Alternatively, to avoid having to know what array index refers to what player, you can use hash tables to do the same and simply search the hash by player name with the last value.

data want;

     if _n_=1 then do;

     declare hash h();

     h.defineKey('name');

     h.defineData('rating');

     h.defineDone;

     end;

     set have;

     rc=h.find(name: black);

     if rc NE 0 then do;

          name=black;

          rating=100;

          black_rating=rating;

          h.add();

     end; else do;

          black_rating=rating;

          if win_loss=1 then rating=rating+1;

          else rating=rating-1;

          h.replace();

     end;

     rc=h.find(name: white);

     if rc NE 0 then do;

          name=white

          rating=100;

          white_rating=rating;

          h.add();

     end; else do;

          white_rating=rating;

          if win_loss=1 then rating=rating+1;

          else rating=rating-1;

          h.replace()

     end;

     drop rc rating;

run;

This is untested but if you want to provide a small sample dataset through a datalines statement in the forums with the appropriate variable names that you use, I can edit to achieve your desired result. If you are not familiar with hash tables, they're just lookup tables. The find() method returns 0 if the key was found in the table and some error code other wise hence if rc NE 0 I'm initiating their rating to 100. Since you wanted the current record to list their rating prior to the match, the black_rating=rating; statement comes before the update to the rating variable.

Unless defined for multidata, hash tables can only retain one record per key. Thus, you must use .add method to add a new key/data pair to the table yet must use replace() method to update the data associated with an existing key.

Vincent

P.S. This should also be more efficient in terms of processing speed since the hash keeps everything in memory but since you only have a few hundred games it is somewhat irrelevant

View solution in original post

5 REPLIES 5
Reeza
Super User

Change your data structure to be one person per line, so one match is on two lines

Date                   Black                 WhiteW/L    Match LengthBlack_Rating  White_Rating

to

Date Player Colour Win Rating Length

RichardinOz
Quartz | Level 8

Transpose will not work in this context.

You would need to sort by player and date.  It helps to have a 1 / -1 Score instead of 1 / 0 Win/Loss.  Here is my suggestion (based on @Reeza);

Data Players;

    Length Player $32 ;

   Set Have ;

   Retain Rating 0 ;    /* Place holder */

   Keep Player Date Colour Score Match_Length Rating ;

   Player = White ;

    Colour = 'White' ;

    Score = 2 * W_L -1 ;

    Output ;

   Player = Black ;

   Colour = 'Black' ;

   Score = - Score ;

   Output ;

Run ;


Proc Sort

          Data = Players ;

    By   Player Date ;

Run ;


Data Players ;

    Set Players ;

    By  Player Date ;

    Prev_Rating = LAG (Rating) ;

    If  First.Player Then Rating = 100 ;

    Else Rating = SUM (Prev_Rating, Score) ;   /* Do not use LAG() in this statement */

   Drop Prev_Rating ;

Run ;


To get back to  the original format (assuming only one game played on each date, otherwise you will have to introduce a Game sequence column for multiple games per date)


Proc Sort Data = Players ;

          By Date Colour ;

Run ;

Data Want ;

     Set Have (Obs = 0) ;

     Retain Black White W_L Match_Length Black_Rating White_Rating ;

     Set Players ;

     Black = Player ;

     Black_Rating = Rating ;

     W_L = (Score + 1) / 2 ;

     Set Players ;

      White = Player ;

     White_ rating = Rating ;

     Output ;

Run ;

Alert - untested code ;

Richard

RichardinOz
Quartz | Level 8

Alternately for the last step you could do a merge of the data on itself :

Proc SQL ;

     Create table Have as

          Select b.Date

               ,     b.player as Black_Player

               ,     w.player as White_Player

               ,     (b.Score + 1) /2 As W_L

               ,     b.Match_Length

               ,     b.rating as Black_Rating

               ,     w.Rating as White_Rating

          From   Players (where = (colour = 'Black')) b

               ,     Players (where = (Colour = 'White)) w

          Where  w.date = b.date

          Order by date

          ;

Quit ;

Whatever

Richard

Vince28_Statcan
Quartz | Level 8

If your pool of player is reasonably small to divide into cases, you can use a _temporary_ array to retain the latest rating result of each player and update it & your data step variables at the same time.

Alternatively, to avoid having to know what array index refers to what player, you can use hash tables to do the same and simply search the hash by player name with the last value.

data want;

     if _n_=1 then do;

     declare hash h();

     h.defineKey('name');

     h.defineData('rating');

     h.defineDone;

     end;

     set have;

     rc=h.find(name: black);

     if rc NE 0 then do;

          name=black;

          rating=100;

          black_rating=rating;

          h.add();

     end; else do;

          black_rating=rating;

          if win_loss=1 then rating=rating+1;

          else rating=rating-1;

          h.replace();

     end;

     rc=h.find(name: white);

     if rc NE 0 then do;

          name=white

          rating=100;

          white_rating=rating;

          h.add();

     end; else do;

          white_rating=rating;

          if win_loss=1 then rating=rating+1;

          else rating=rating-1;

          h.replace()

     end;

     drop rc rating;

run;

This is untested but if you want to provide a small sample dataset through a datalines statement in the forums with the appropriate variable names that you use, I can edit to achieve your desired result. If you are not familiar with hash tables, they're just lookup tables. The find() method returns 0 if the key was found in the table and some error code other wise hence if rc NE 0 I'm initiating their rating to 100. Since you wanted the current record to list their rating prior to the match, the black_rating=rating; statement comes before the update to the rating variable.

Unless defined for multidata, hash tables can only retain one record per key. Thus, you must use .add method to add a new key/data pair to the table yet must use replace() method to update the data associated with an existing key.

Vincent

P.S. This should also be more efficient in terms of processing speed since the hash keeps everything in memory but since you only have a few hundred games it is somewhat irrelevant

GoBrewers14
Calcite | Level 5

I read up on Hash tables.  Very interesting.  I modified your code slightly and it worked.  Thanks for your help.

-John

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 2122 views
  • 6 likes
  • 4 in conversation