DATA Step, Macro, Functions and more

Help combining and totaling observations

Reply
Occasional Contributor
Posts: 6

Help combining and totaling observations

Hello guys,

I'm having an awful time trying to come up with how a particular code is to be written.

Below is the basis of what I have now:

View image: img1

When that code is run, it looks like this:

View image: img2

As you can see, basically it's a layout of all 2014 world cup games (The image is cropped short, there are actually 64 observations).

What I need to do within a DATA step is combine all distinct team observations (whether they be listed as Team1_Name or Team2_Name) and total their score in a new variable called 'GF'.

For example, in image two 'Spain' is listed in observations 3 and 33. In observation 3, Spain scored 1 goal, in observation 33, Spain scored 3 goals. If those were the only two observations in which Spain appeared, it's output would look like:

Team_Name, GF

Spain, 4

Here is an image I put together using data lines to try and help imagine it (the observations are limited and the goal numbers made up):

View image: img3

The code probably has to be some complicated combination of IF, THEN, DO, SELECT, WHERE... I have no idea, and I've been banging my head against the wall.

Any suggestions would be greatly appreciated.

Occasional Contributor
Posts: 6

Re: Help combining and totaling observations

Anyone? Even a suggestion to get the ball rolling would be helpful.

Contributor
Posts: 44

Re: Help combining and totaling observations

You didn't include data, so I just picked the result from A and B from wiki, but this should work with also with whole data:

attrib date informat=date9. format=date9.;

input Date  Team1_name $ 11-22 Team1_Score 24 Team2_Name $26-36 Team2_Score 38 Venue $ 40-55;

cards;

12Jun2014 Brazil       3 Croatia     1 Sao Paulo

13Jun2014 Mexico       1 Cameroon    0 Natal

17Jun2014 Brazil       0 Mexico         0 Fortaleza

18Jun2014 Cameroon     0 Croatia     4 Manaus

23Jun2014 Cameroon     1 Brazil      4 Brasilia

23Jun2014 Croatia      1 Mexico         3 Recife

13Jun2014 Spain        1 Netherlands 5 Salvador

13Jun2014 Chile        3 Australia   1 Cuiaba

18Jun2014 Australia    2 Netherlands 3 Porto Alegre

18Jun2014 Spain        0 Chile       2 Rio de Janeiro

23Jun2014 Australia    0 Spain       3 Curitiba

23Jun2014 Netherlands  2 Chile       0 Sao Paulo

;

proc sql;

  create table countries as

  select distinct Team1_name

  from WC2014A;

quit;

proc sql;

  select distinct Team1_name into :Country_list separated by ' '

  from WC2014A;

quit;

data _null_;

  set countries;

  call execute ('proc sql; create table '||Team1_name||' as select Team_name, Sum(GF) as GF from (select Team1_Name as Team_name, Team1_score as GF from WC2014A where Team1_name ="'||strip(Team1_name)||'" union select Team2_Name as Team_name, Team2_score as GF from WC2014A where Team2_name ="'||strip(Team1_name)||'") tmp group by Team_name;quit;');

run;

data W2014B;

  set &Country_list;

run;

PROC Star
Posts: 1,561

Re: Help combining and totaling observations

All in one data step:

data HAVE(index=(Team1_name Team2_name));

input Date $9 Team1_name $ 11-22 Team1_Score 24 Team2_Name $26-36 Team2_Score 38 Venue $ 40-55;

cards;

12Jun2014 Brazil 3 Croatia     1 Sao Paulo

13Jun2014 Mexico 1 Cameroon    0 Natal

17Jun2014 Brazil 0 Mexico      0 Fortaleza

18Jun2014 Cameroon 0 Croatia     4 Manaus

23Jun2014 Cameroon 1 Brazil      4 Brasilia

23Jun2014 Croatia 1 Mexico      3 Recife

13Jun2014 Spain 1 Netherlands 5 Salvador

13Jun2014 Chile        3 Australia   1 Cuiaba

18Jun2014 Australia 2 Netherlands 3 Porto Alegre

18Jun2014 Spain        0 Chile       2 Rio de Janeiro

23Jun2014 Australia 0 Spain       3 Curitiba

23Jun2014 Netherlands 2 Chile       0 Sao Paulo

run;

data ALL;

  set HAVE(keep= Team1_name Team1_Score rename=(Team1_name=TEAM Team1_Score=SCORE))

     HAVE(keep= Team2_name Team2_Score rename=(Team2_name=TEAM Team2_Score=SCORE)) ;

  by TEAM;

  if first.TEAM then TOTAL=0;

  TOTAL+SCORE;

  if last.TEAM then output;

  keep TEAM TOTAL;

run;

Ask a Question
Discussion stats
  • 3 replies
  • 247 views
  • 0 likes
  • 3 in conversation