BookmarkSubscribeRSS Feed
mysasusername
Calcite | Level 5

*EDIT* I probably should have added that I have to use a data step to complete this. 

So I have a problem where I have the data of Rose Bowl football games including the winning college and the margin of victory (mov). I have to get a table with proc print that produces three columns: The college, how many times they've won, and their average mov for all of the winning games. 

I can get my data sorted into groups by the school with their total number of wins, but I don't know how to get the average of their margins of victory without it being a running total of all schools rather than school by school. This is my current code that ends up with the avgmov value being a running total of all schools, rather than the school by school which I need. 

 

proc sort data=rosebowl;
by winteam mov;
run;

data WinsBySchool;
set rosebowl;
count + 1;
by winteam;
retain totmov 0;
if first.winteam then count = 1;
totmov+mov;
if Last.winteam=1;
avgmov=totmov/count;
run;

3 REPLIES 3
ballardw
Super User

Sounds more like a task for Proc MEANS

 

maybe

 

Proc means data = rosebowl ;

   by winteam;

   var MOV;

   output out = winsbyschool n= mean= /autoname ;

run;

PaigeMiller
Diamond | Level 26

This would be a whole lot easier in PROC SUMMARY/PROC MEANS, since PROC SUMMARY/PROC MEANS has built-in the code to compute statistics within BY groups.

 

Something like this (UNTESTED CODE)

 

proc summary data=rosebowl nway;
    class winteam;
    var mov;
    output out=_stats_ mean=avgmov n=count;
run;

I add that computing means in a DATA step is usually inefficient, and takes longer to program that using PROC SUMMARY/PROC MEANS. Also, the way you are doing things by computing numerators which are sums, and then dividing by the count, might work fine for Rose Bowl data, but in the real world, if you have missing values, then your method gives the wrong answers, while PROC SUMMARY/PROC MEANS gives the right answers. So I recommend that you don't compute means (or other widely available statistics) by writing your own DATA step code.

--
Paige Miller
Reeza
Super User
proc sort data=rosebowl;
    by winteam mov;
run;

data WinsBySchool;
     set rosebowl;
      by winteam;
       retain totmov4Wins 0 numWins;

*reset everything at the beginning of each team;
if first.winteam then do;
numWins=0;
totmov4wins=0;
end;
*if they win increment values;
if win=1 then do:
totmov+mov;
numWins+1;
end;

*if last record for team produce count and average;
if last.winteam then do;
         avgmov=totmov/count;
         output;
end;

*keep only variables of interest;
keep winTeam numWIns avgMov;
run;

I don't see any logic about checking if a team has won? I added that in, but it obviously depends on how your data is structured which I have no idea of. 

I would highly recommend you add comments to your code.


@mysasusername wrote:

*EDIT* I probably should have added that I have to use a data step to complete this. 

So I have a problem where I have the data of Rose Bowl football games including the winning college and the margin of victory (mov). I have to get a table with proc print that produces three columns: The college, how many times they've won, and their average mov for all of the winning games. 

I can get my data sorted into groups by the school with their total number of wins, but I don't know how to get the average of their margins of victory without it being a running total of all schools rather than school by school. This is my current code that ends up with the avgmov value being a running total of all schools, rather than the school by school which I need. 

 

proc sort data=rosebowl;
by winteam mov;
run;

data WinsBySchool;
set rosebowl;
count + 1;
by winteam;
retain totmov 0;
if first.winteam then count = 1;
totmov+mov;
if Last.winteam=1;
avgmov=totmov/count;
run;


 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 520 views
  • 0 likes
  • 4 in conversation