Hello guys,
I'm having an awful time trying to come up with how a particular code is to be written.
Below is the basis of what I have now:
When that code is run, it looks like this:
As you can see, basically it's a layout of all 2014 world cup games (The image is cropped short, there are actually 64 observations).
What I need to do within a DATA step is combine all distinct team observations (whether they be listed as Team1_Name or Team2_Name) and total their score in a new variable called 'GF'.
For example, in image two 'Spain' is listed in observations 3 and 33. In observation 3, Spain scored 1 goal, in observation 33, Spain scored 3 goals. If those were the only two observations in which Spain appeared, it's output would look like:
Team_Name, GF
Spain, 4
Here is an image I put together using data lines to try and help imagine it (the observations are limited and the goal numbers made up):
The code probably has to be some complicated combination of IF, THEN, DO, SELECT, WHERE... I have no idea, and I've been banging my head against the wall.
Any suggestions would be greatly appreciated.
Anyone? Even a suggestion to get the ball rolling would be helpful.
You didn't include data, so I just picked the result from A and B from wiki, but this should work with also with whole data:
attrib date informat=date9. format=date9.;
input Date Team1_name $ 11-22 Team1_Score 24 Team2_Name $26-36 Team2_Score 38 Venue $ 40-55;
cards;
12Jun2014 Brazil 3 Croatia 1 Sao Paulo
13Jun2014 Mexico 1 Cameroon 0 Natal
17Jun2014 Brazil 0 Mexico 0 Fortaleza
18Jun2014 Cameroon 0 Croatia 4 Manaus
23Jun2014 Cameroon 1 Brazil 4 Brasilia
23Jun2014 Croatia 1 Mexico 3 Recife
13Jun2014 Spain 1 Netherlands 5 Salvador
13Jun2014 Chile 3 Australia 1 Cuiaba
18Jun2014 Australia 2 Netherlands 3 Porto Alegre
18Jun2014 Spain 0 Chile 2 Rio de Janeiro
23Jun2014 Australia 0 Spain 3 Curitiba
23Jun2014 Netherlands 2 Chile 0 Sao Paulo
;
proc sql;
create table countries as
select distinct Team1_name
from WC2014A;
quit;
proc sql;
select distinct Team1_name into :Country_list separated by ' '
from WC2014A;
quit;
data _null_;
set countries;
call execute ('proc sql; create table '||Team1_name||' as select Team_name, Sum(GF) as GF from (select Team1_Name as Team_name, Team1_score as GF from WC2014A where Team1_name ="'||strip(Team1_name)||'" union select Team2_Name as Team_name, Team2_score as GF from WC2014A where Team2_name ="'||strip(Team1_name)||'") tmp group by Team_name;quit;');
run;
data W2014B;
set &Country_list;
run;
All in one data step:
data HAVE(index=(Team1_name Team2_name));
input Date $9 Team1_name $ 11-22 Team1_Score 24 Team2_Name $26-36 Team2_Score 38 Venue $ 40-55;
cards;
12Jun2014 Brazil 3 Croatia 1 Sao Paulo
13Jun2014 Mexico 1 Cameroon 0 Natal
17Jun2014 Brazil 0 Mexico 0 Fortaleza
18Jun2014 Cameroon 0 Croatia 4 Manaus
23Jun2014 Cameroon 1 Brazil 4 Brasilia
23Jun2014 Croatia 1 Mexico 3 Recife
13Jun2014 Spain 1 Netherlands 5 Salvador
13Jun2014 Chile 3 Australia 1 Cuiaba
18Jun2014 Australia 2 Netherlands 3 Porto Alegre
18Jun2014 Spain 0 Chile 2 Rio de Janeiro
23Jun2014 Australia 0 Spain 3 Curitiba
23Jun2014 Netherlands 2 Chile 0 Sao Paulo
run;
data ALL;
set HAVE(keep= Team1_name Team1_Score rename=(Team1_name=TEAM Team1_Score=SCORE))
HAVE(keep= Team2_name Team2_Score rename=(Team2_name=TEAM Team2_Score=SCORE)) ;
by TEAM;
if first.TEAM then TOTAL=0;
TOTAL+SCORE;
if last.TEAM then output;
keep TEAM TOTAL;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.