Retaining Values

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 9
Accepted Solution

Retaining Values

I have a file of several hundred backgammon matches and I want to use SAS to calculate a rating for

each player, prior to each match, based on their past performance.  I have the date, the names of each

player, the outcome for black (1=Win for Black, 0=Win for White) and the number of points in

each match.  The file looks like this…

==============================================================================

Date                       Black                     White    W/L        Match Length    Black_Rating      White_Rating

==============================================================================

1/1/2001              Frank                     John      1                              7              100                         100

1/2/2001              Bill                          Jeff        0                              5              100                         100

1/4/2001              Jeff                        Bill          1                              3              101                         99

1/5/2001              John                      Frank     0                              5              99                           101

1/6/2001              Bill                          John      0                              9              100                         98

Etc…..

I want to calculate a modified version of the Elo system and start everyone’s rating at 100.  

So Frank and John have never played so both rating would be 100 and the same for the next row containing Bill and

Jeff’s match.  For simplicity, let’s say if you win a match you add 1 to your score and subtract 1 for a loss; so, row 3

would read 101 and 99 respectively. 

My problem is I want SAS to recognize who the two opponents are, and put their current ratings in the corresponding

columns then based on the outcome of the match, calculate and remember the new rating. 

I’ve tried using RETAIN but my problem arises from the fact that any person can be in the black or white column.

Any ideas?

Thanks,

John


Accepted Solutions
Solution
‎08-01-2013 08:17 AM
Super Contributor
Posts: 339

Re: Retaining Values

Posted in reply to GoBrewers14

If your pool of player is reasonably small to divide into cases, you can use a _temporary_ array to retain the latest rating result of each player and update it & your data step variables at the same time.

Alternatively, to avoid having to know what array index refers to what player, you can use hash tables to do the same and simply search the hash by player name with the last value.

data want;

     if _n_=1 then do;

     declare hash h();

     h.defineKey('name');

     h.defineData('rating');

     h.defineDone;

     end;

     set have;

     rc=h.find(name: black);

     if rc NE 0 then do;

          name=black;

          rating=100;

          black_rating=rating;

          h.add();

     end; else do;

          black_rating=rating;

          if win_loss=1 then rating=rating+1;

          else rating=rating-1;

          h.replace();

     end;

     rc=h.find(name: white);

     if rc NE 0 then do;

          name=white

          rating=100;

          white_rating=rating;

          h.add();

     end; else do;

          white_rating=rating;

          if win_loss=1 then rating=rating+1;

          else rating=rating-1;

          h.replace()

     end;

     drop rc rating;

run;

This is untested but if you want to provide a small sample dataset through a datalines statement in the forums with the appropriate variable names that you use, I can edit to achieve your desired result. If you are not familiar with hash tables, they're just lookup tables. The find() method returns 0 if the key was found in the table and some error code other wise hence if rc NE 0 I'm initiating their rating to 100. Since you wanted the current record to list their rating prior to the match, the black_rating=rating; statement comes before the update to the rating variable.

Unless defined for multidata, hash tables can only retain one record per key. Thus, you must use .add method to add a new key/data pair to the table yet must use replace() method to update the data associated with an existing key.

Vincent

P.S. This should also be more efficient in terms of processing speed since the hash keeps everything in memory but since you only have a few hundred games it is somewhat irrelevant

View solution in original post


All Replies
Super User
Posts: 19,772

Re: Retaining Values

Posted in reply to GoBrewers14

Change your data structure to be one person per line, so one match is on two lines

Date                   Black                 WhiteW/L    Match LengthBlack_Rating  White_Rating

to

Date Player Colour Win Rating Length

Super Contributor
Posts: 644

Re: Retaining Values

Posted in reply to GoBrewers14

Transpose will not work in this context.

You would need to sort by player and date.  It helps to have a 1 / -1 Score instead of 1 / 0 Win/Loss.  Here is my suggestion (based on @Reeza);

Data Players;

    Length Player $32 ;

   Set Have ;

   Retain Rating 0 ;    /* Place holder */

   Keep Player Date Colour Score Match_Length Rating ;

   Player = White ;

    Colour = 'White' ;

    Score = 2 * W_L -1 ;

    Output ;

   Player = Black ;

   Colour = 'Black' ;

   Score = - Score ;

   Output ;

Run ;


Proc Sort

          Data = Players ;

    By   Player Date ;

Run ;


Data Players ;

    Set Players ;

    By  Player Date ;

    Prev_Rating = LAG (Rating) ;

    If  First.Player Then Rating = 100 ;

    Else Rating = SUM (Prev_Rating, Score) ;   /* Do not use LAG() in this statement */

   Drop Prev_Rating ;

Run ;


To get back to  the original format (assuming only one game played on each date, otherwise you will have to introduce a Game sequence column for multiple games per date)


Proc Sort Data = Players ;

          By Date Colour ;

Run ;

Data Want ;

     Set Have (Obs = 0) ;

     Retain Black White W_L Match_Length Black_Rating White_Rating ;

     Set Players ;

     Black = Player ;

     Black_Rating = Rating ;

     W_L = (Score + 1) / 2 ;

     Set Players ;

      White = Player ;

     White_ rating = Rating ;

     Output ;

Run ;

Alert - untested code ;

Richard

Super Contributor
Posts: 644

Re: Retaining Values

Posted in reply to RichardinOz

Alternately for the last step you could do a merge of the data on itself :

Proc SQL ;

     Create table Have as

          Select b.Date

               ,     b.player as Black_Player

               ,     w.player as White_Player

               ,     (b.Score + 1) /2 As W_L

               ,     b.Match_Length

               ,     b.rating as Black_Rating

               ,     w.Rating as White_Rating

          From   Players (where = (colour = 'Black')) b

               ,     Players (where = (Colour = 'White)) w

          Where  w.date = b.date

          Order by date

          ;

Quit ;

Whatever

Richard

Solution
‎08-01-2013 08:17 AM
Super Contributor
Posts: 339

Re: Retaining Values

Posted in reply to GoBrewers14

If your pool of player is reasonably small to divide into cases, you can use a _temporary_ array to retain the latest rating result of each player and update it & your data step variables at the same time.

Alternatively, to avoid having to know what array index refers to what player, you can use hash tables to do the same and simply search the hash by player name with the last value.

data want;

     if _n_=1 then do;

     declare hash h();

     h.defineKey('name');

     h.defineData('rating');

     h.defineDone;

     end;

     set have;

     rc=h.find(name: black);

     if rc NE 0 then do;

          name=black;

          rating=100;

          black_rating=rating;

          h.add();

     end; else do;

          black_rating=rating;

          if win_loss=1 then rating=rating+1;

          else rating=rating-1;

          h.replace();

     end;

     rc=h.find(name: white);

     if rc NE 0 then do;

          name=white

          rating=100;

          white_rating=rating;

          h.add();

     end; else do;

          white_rating=rating;

          if win_loss=1 then rating=rating+1;

          else rating=rating-1;

          h.replace()

     end;

     drop rc rating;

run;

This is untested but if you want to provide a small sample dataset through a datalines statement in the forums with the appropriate variable names that you use, I can edit to achieve your desired result. If you are not familiar with hash tables, they're just lookup tables. The find() method returns 0 if the key was found in the table and some error code other wise hence if rc NE 0 I'm initiating their rating to 100. Since you wanted the current record to list their rating prior to the match, the black_rating=rating; statement comes before the update to the rating variable.

Unless defined for multidata, hash tables can only retain one record per key. Thus, you must use .add method to add a new key/data pair to the table yet must use replace() method to update the data associated with an existing key.

Vincent

P.S. This should also be more efficient in terms of processing speed since the hash keeps everything in memory but since you only have a few hundred games it is somewhat irrelevant

Occasional Contributor
Posts: 9

Re: Retaining Values

Posted in reply to Vince28_Statcan

I read up on Hash tables.  Very interesting.  I modified your code slightly and it worked.  Thanks for your help.

-John

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 1545 views
  • 6 likes
  • 4 in conversation