BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mahler_ji
Obsidian | Level 7

Hey All,

 

I am looking for some help, working with a large set of sports data.  I have data that takes the following form:

 

Date     Team     Venue     Score

XXX       A           Home     80

XXX       B           Away       90

XXX       C          Home       78

XXX       D          Away         65

YYY       A          Home        100

YYY       D         Away          87

 

 

And I want to create a column or something that looks like the following:

 

Date     Team     Venue     Score       didWin

XXX       A           Home     80              0

XXX       B           Away       90             1

XXX       C          Home       78             1

XXX       D          Away         65            0

YYY       A          Home        100          1

YYY       D         Away          87            0

 

There are no variables currently in the dataset in which I could do a transpose so that I could compare the home score and the away score.  The games are always in pairs of home/away teams.  I just need a way to compare the scores for each pair of teams on a given date, and then assign either a 1 or 0 for a win during that game.

 

AS A BONUS:  I am also going to be tallying a total wins and total losses column at each point in the season, going back to the beginning of the season, so the data would look like this:

 

Date     Team     Venue     Score       didWin        totalWins       totalLosses

XXX       A           Home     80              0                  0                       0

XXX       B           Away       90             1                  0                       0

XXX       C          Home       78             1                  0                       0

XXX       D          Away         65            0                  0                      0

YYY       A          Home        100          1                  0                       1 

YYY       D         Away          87            0                  0                       1

 

 

So coming into date YYY, I would know that the two teams that are playing each other both have zero wins and one loss up to that point in the season!

 

I attached a sample dataset to work with!

 

 

Thanks! 

1 ACCEPTED SOLUTION

Accepted Solutions
ToddB
SAS Employee

This is one way to do it but requires going through the data three times. I am sure someone has a more efficient solution.

/* create a gamenum variable as key for each game */
data scores(keep=date teams venue f gamenum);
 set [library].teamdatasample;
 /* games are in pairs */
 gamenum=ceil(_n_/2);
 label f = "Final Score";
run;

/* sort scores by gamenum and final score descending to assign winner and loser */
proc sort data=scores;
 by gamenum descending f;
run;

/* assign a winner or loser */
data scores;
 set scores;
 by gamenum descending f;
 if first.gamenum then
 	didwin = 1;
 else
 	didwin = 0;
run;

/* now sort scores data by team and date to get win lose record over time */
proc sort data=scores;
 by teams date;
run;

/* retain wins and losses */
data scores;
 set scores;
 by teams date;
 retain totalwin totalloss 0;
 if first.teams then
 do;
 	totalwin = 0;
	totalloss = 0;
 end;
 if didwin = 1 then
	totalwin = totalwin + 1;
 else
	totalloss = totalloss + 1;
run;
 
/* resort by game number to order as original data */
proc sort data=scores;
 by gamenum descending didwin;
run;

View solution in original post

4 REPLIES 4
Reeza
Super User
Zip the file before uploading.
mahler_ji
Obsidian | Level 7

Awesome!

 

Post updated with data sample.

ToddB
SAS Employee

This is one way to do it but requires going through the data three times. I am sure someone has a more efficient solution.

/* create a gamenum variable as key for each game */
data scores(keep=date teams venue f gamenum);
 set [library].teamdatasample;
 /* games are in pairs */
 gamenum=ceil(_n_/2);
 label f = "Final Score";
run;

/* sort scores by gamenum and final score descending to assign winner and loser */
proc sort data=scores;
 by gamenum descending f;
run;

/* assign a winner or loser */
data scores;
 set scores;
 by gamenum descending f;
 if first.gamenum then
 	didwin = 1;
 else
 	didwin = 0;
run;

/* now sort scores data by team and date to get win lose record over time */
proc sort data=scores;
 by teams date;
run;

/* retain wins and losses */
data scores;
 set scores;
 by teams date;
 retain totalwin totalloss 0;
 if first.teams then
 do;
 	totalwin = 0;
	totalloss = 0;
 end;
 if didwin = 1 then
	totalwin = totalwin + 1;
 else
	totalloss = totalloss + 1;
run;
 
/* resort by game number to order as original data */
proc sort data=scores;
 by gamenum descending didwin;
run;
mahler_ji
Obsidian | Level 7

This looks really great! I was actually working on doing it in a similar way!  Thanks for finishing the thoughts for me!


John

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 2209 views
  • 0 likes
  • 3 in conversation