DATA Step, Macro, Functions and more

Does Anyone Like Sports?!

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 101
Accepted Solution

Does Anyone Like Sports?!

[ Edited ]

Hey All,

 

I am looking for some help, working with a large set of sports data.  I have data that takes the following form:

 

Date     Team     Venue     Score

XXX       A           Home     80

XXX       B           Away       90

XXX       C          Home       78

XXX       D          Away         65

YYY       A          Home        100

YYY       D         Away          87

 

 

And I want to create a column or something that looks like the following:

 

Date     Team     Venue     Score       didWin

XXX       A           Home     80              0

XXX       B           Away       90             1

XXX       C          Home       78             1

XXX       D          Away         65            0

YYY       A          Home        100          1

YYY       D         Away          87            0

 

There are no variables currently in the dataset in which I could do a transpose so that I could compare the home score and the away score.  The games are always in pairs of home/away teams.  I just need a way to compare the scores for each pair of teams on a given date, and then assign either a 1 or 0 for a win during that game.

 

AS A BONUS:  I am also going to be tallying a total wins and total losses column at each point in the season, going back to the beginning of the season, so the data would look like this:

 

Date     Team     Venue     Score       didWin        totalWins       totalLosses

XXX       A           Home     80              0                  0                       0

XXX       B           Away       90             1                  0                       0

XXX       C          Home       78             1                  0                       0

XXX       D          Away         65            0                  0                      0

YYY       A          Home        100          1                  0                       1 

YYY       D         Away          87            0                  0                       1

 

 

So coming into date YYY, I would know that the two teams that are playing each other both have zero wins and one loss up to that point in the season!

 

I attached a sample dataset to work with!

 

 

Thanks! 

Attachment

Accepted Solutions
Solution
‎09-25-2015 06:23 AM
SAS Employee
Posts: 7

Re: Does Anyone Like Sports?!

Posted in reply to mahler_ji

This is one way to do it but requires going through the data three times. I am sure someone has a more efficient solution.

/* create a gamenum variable as key for each game */
data scores(keep=date teams venue f gamenum);
 set [library].teamdatasample;
 /* games are in pairs */
 gamenum=ceil(_n_/2);
 label f = "Final Score";
run;

/* sort scores by gamenum and final score descending to assign winner and loser */
proc sort data=scores;
 by gamenum descending f;
run;

/* assign a winner or loser */
data scores;
 set scores;
 by gamenum descending f;
 if first.gamenum then
 	didwin = 1;
 else
 	didwin = 0;
run;

/* now sort scores data by team and date to get win lose record over time */
proc sort data=scores;
 by teams date;
run;

/* retain wins and losses */
data scores;
 set scores;
 by teams date;
 retain totalwin totalloss 0;
 if first.teams then
 do;
 	totalwin = 0;
	totalloss = 0;
 end;
 if didwin = 1 then
	totalwin = totalwin + 1;
 else
	totalloss = totalloss + 1;
run;
 
/* resort by game number to order as original data */
proc sort data=scores;
 by gamenum descending didwin;
run;

View solution in original post


All Replies
Super User
Posts: 19,873

Re: Does Anyone Like Sports?!

Posted in reply to mahler_ji
Zip the file before uploading.
Frequent Contributor
Posts: 101

Re: Does Anyone Like Sports?!

Awesome!

 

Post updated with data sample.

Solution
‎09-25-2015 06:23 AM
SAS Employee
Posts: 7

Re: Does Anyone Like Sports?!

Posted in reply to mahler_ji

This is one way to do it but requires going through the data three times. I am sure someone has a more efficient solution.

/* create a gamenum variable as key for each game */
data scores(keep=date teams venue f gamenum);
 set [library].teamdatasample;
 /* games are in pairs */
 gamenum=ceil(_n_/2);
 label f = "Final Score";
run;

/* sort scores by gamenum and final score descending to assign winner and loser */
proc sort data=scores;
 by gamenum descending f;
run;

/* assign a winner or loser */
data scores;
 set scores;
 by gamenum descending f;
 if first.gamenum then
 	didwin = 1;
 else
 	didwin = 0;
run;

/* now sort scores data by team and date to get win lose record over time */
proc sort data=scores;
 by teams date;
run;

/* retain wins and losses */
data scores;
 set scores;
 by teams date;
 retain totalwin totalloss 0;
 if first.teams then
 do;
 	totalwin = 0;
	totalloss = 0;
 end;
 if didwin = 1 then
	totalwin = totalwin + 1;
 else
	totalloss = totalloss + 1;
run;
 
/* resort by game number to order as original data */
proc sort data=scores;
 by gamenum descending didwin;
run;
Frequent Contributor
Posts: 101

Re: Does Anyone Like Sports?!

This looks really great! I was actually working on doing it in a similar way!  Thanks for finishing the thoughts for me!


John

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 306 views
  • 0 likes
  • 3 in conversation