BookmarkSubscribeRSS Feed
Petergao1
Fluorite | Level 6
Dear SAS users,
Below is the sample dataset and a program that will produce a one line output that include team name and color and each team's multiple game result. It works well.

My question is suppose there are multiple entities (show up as a new id column such as schools id) and each entity will have the same set of groups, this is normal because schools may have the same set of groups and they will come to compete in some activities and got various result at different round of contests.

The new dataset (call it team_new, the new column is sch_id) will be like this:
sch_id color group game1 game2 game 3
sch1 Green Crickets 10 7 8 sch1 Blue Sea Otters 10 6 7 sch1 Yellow Stingers 9 10 9 sch1 Red Hot Ants 8 9 9 sch1 Purple Cats 9 9 9 sch2 Green Crickets 10 9 7 sch2 Blue Sea Otters 8 7 9 sch2 Yellow Stingers 7 8 10 sch2 Red Hot Ants 9 8 10 sch2 Purple Cats 8 6 9
sch3 Green Crickets 5 7 9 sch3 Blue Sea Otters 6 8 10 sch3 Yellow Stingers 7 9 8 sch3 Red Hot Ants 6 9 10 sch3 Purple Cats 8 10 9
;

The desired output will be like this:

sch_id greencricketstotal blueseaottertotal yellowstingerstotal redhotantstotl purplecatstotal
sch1 25 23 28 26 27
sch2 xx xx xx xx xx
sch3 xx xx xx xx xx
*note above XX will be the sum of three competition's result, not run yet.

data teams; *This is the original sample dataset's name; input color $15. @16 team_name $15. @32 game1 game2 game3; datalines; Green Crickets 10 7 8 Blue Sea Otters 10 6 7 Yellow Stingers 9 10 9 Red Hot Ants 8 9 9 Purple Cats 9 9 9 ; %macro newvars(dsn); data _null_; set &dsn end=end; count+1; call symputx('macvar'||left(count),compress(color)||compress(team_name)||"Total"); if end then call symputx('max',count); run; data teamscores; set &dsn end=end; %do i = 1 %to &max; if _n_=&i then do; &&macvar&i=sum(of game1-game3); retain &&macvar&i; keep &&macvar&i; end; %end; if end then output; %mend newvars; %newvars(teams) proc print noobs; title "League Team Game Totals"; run;

 The above program runs well. However, when I created the new dataset team_new with one more new column sch_id, and I tried to modify the above program, I can have it run, but only produce one line of output, which is the last color+group's record. I know I need to have more codes, but just cannot get it out the way I need. Any idea what I should do? By the way I would like to use the macro, with some modification, and do not want to do a transpose step. Thank you.

7 REPLIES 7
ballardw
Super User

What is the purpose of the output data set? I does not appear to be well structured for most analysis tasks.

 

If the use of the data is just to make a report that people read then you likely want a report procedure.

Maybe something like:

data teams; 
   input sch_id $ color $15. @16 team_name $15. @32 game1 game2 game3; 
   gametotal= sum(game1, game2, game3);
datalines;
sch1 Green          Crickets        10 7 8
sch1 Blue           Sea Otters      10 6 7
sch1 Yellow         Stingers        9 10 9
sch1 Red            Hot Ants        8 9 9
sch1 Purple         Cats            9 9 9
sch2 Green          Crickets        10 9 7
sch2 Blue           Sea Otters      8 7 9
sch2 Yellow         Stingers        7 8 10
sch2 Red            Hot Ants        9 8 10
sch2 Purple         Cats            8 6 9
sch3 Green          Crickets        5 7 9
sch3 Blue           Sea Otters      6 8 10
sch3 Yellow         Stingers        7 9 8
sch3 Red            Hot Ants        6 9 10
sch3 Purple         Cats            8 10 9
;


proc tabulate data=teams;
   class sch_id color team_name;
   var gametotal;
   table sch_id=' '*gametotal=' '* sum=' ',
         color=' '*team_name=' '
         / row=float
   ;
run;

          

And you original data would be better off with one record per game plus a variable to identify which game the score reflects. Then a report can be written that automatically adjusts to the number of games.

Consider:

data newteams;
   input sch_id $ color :$15. @16 team_name $15. @32 gamenumber score; 
datalines;
sch1   Green   Crickets        1 10 
sch1   Green   Crickets        2 7  
sch1   Green   Crickets        3 8  
sch1   Green   Crickets        4 6  
sch1   Blue    Sea Otters      1 10 
sch1   Blue    Sea Otters      2 6  
sch1   Blue    Sea Otters      3 7  
sch1   Blue    Sea Otters      4 3  
sch1   Yellow  Stingers        1 9  
sch1   Yellow  Stingers        2 10 
sch1   Yellow  Stingers        3 9  
sch1   Yellow  Stingers        4 7  
sch1   Red     Hot Ants        1 8  
sch1   Red     Hot Ants        2 9  
sch1   Red     Hot Ants        3 9  
sch1   Purple  Cats            1 9  
sch1   Purple  Cats            2 9  
sch1   Purple  Cats            3 9  
sch2   Green   Crickets        1 10 
sch2   Green   Crickets        2 9  
sch2   Green   Crickets        3 7  
sch2   Blue    Sea Otters      1 8  
sch2   Blue    Sea Otters      2 7  
sch2   Blue    Sea Otters      3 9  
sch2   Yellow  Stingers        1 7  
sch2   Yellow  Stingers        2 8  
sch2   Yellow  Stingers        3 10 
sch2   Red     Hot Ants        1 9  
sch2   Red     Hot Ants        2 8  
sch2   Red     Hot Ants        3 10 
sch2   Purple  Cats            1 8  
sch2   Purple  Cats            2 6  
sch2   Purple  Cats            3 9  
sch3   Green   Crickets        1 5  
sch3   Green   Crickets        2 7  
sch3   Green   Crickets        3 9  
sch3   Blue    Sea Otters      1 6  
sch3   Blue    Sea Otters      2 8  
sch3   Blue    Sea Otters      3 10 
sch3   Yellow  Stingers        1 7  
sch3   Yellow  Stingers        2 9  
sch3   Yellow  Stingers        3 8  
sch3   Red     Hot Ants        1 6  
sch3   Red     Hot Ants        2 9  
sch3   Red     Hot Ants        3 10 
sch3   Purple  Cats            1 8  
sch3   Purple  Cats            2 10 
sch3   Purple  Cats            3 9  

 


;


proc tabulate data=newteams;
   class sch_id color team_name;
   var score;
   table sch_id=' '*score=' '* sum=' ',
         color=' '*team_name=' '
         / row=float
   ;
run;

If you look at the data you will see that several of the teams have different numbers of games but we still get an overall total without having to do anything different for different numbers of scores. And a minor change you can get different measures

proc tabulate data=newteams;
   class sch_id color team_name;
   var score;
   table sch_id=' '*score=' '* (sum='Total score ' mean='Average score'),
         color=' '*team_name=' '
         / row=float
   ;
run;
Petergao1
Fluorite | Level 6

Hi, Ballardw,

 

The purpose of this request is to enhance the efficiency.

 

I have datasets similar to this structure, that is sch_id, subgroup (has 10 rows) and annual performance data (say PI1, PI2, PI3).

 

I m looking for a solution to quickly process the data to have each school has its data of group_year_PI columns by each school_ID. The Macro should save me a lot of other data steps.

 

Hope this help.

mklangley
Lapis Lazuli | Level 10

Since you said, "I would like to use the macro, with some modification, and do not want to do a transpose step," here is a solution that just requires a small edit to your macro. Instead of just outputting at the END of the dataset, it will output at the end of each distinct sch_id group.

%macro newvars(dsn); 
    data _null_;
        set &dsn end=end;
        count+1;
        call symputx('macvar'||left(count),compress(color)||compress(team_name)||"Total");
        if end then call symputx('max',count);
    run;

    data teamscores;
        set &dsn;
        by sch_id;
        %do i = 1 %to &max;
            if _n_=&i then do;
                &&macvar&i=sum(of game1-game3);
                retain &&macvar&i;
                keep sch_id &&macvar&i;
            end;
        %end;
        if last.sch_id then output;
    run;
%mend newvars;
%newvars(teams)
Petergao1
Fluorite | Level 6

HI mklangley,

Thank you for taking your time reading my questions and making out a test program. I run your program. It worked. However, it still produce just one line, which is the very last record's sch_id, instead of three lines for three sch_id. (previously I did some trials and always have a output of one line of the last record in the input data set. 

 

I was wondering it has to do with the sch_id column, somehow, I cannot get it done properly.

 

Thank you. I will keep trying in finding a way out...

smantha
Lapis Lazuli | Level 10
data teams; 
   input sch_id $ color $15. @16 team_name $15. @32 game1 game2 game3; 
   gametotal= sum(game1, game2, game3);
datalines;
sch1 Green          Crickets        10 7 8
sch1 Blue           Sea Otters      10 6 7
sch1 Yellow         Stingers        9 10 9
sch1 Red            Hot Ants        8 9 9
sch1 Purple         Cats            9 9 9
sch2 Green          Crickets        10 9 7
sch2 Blue           Sea Otters      8 7 9
sch2 Yellow         Stingers        7 8 10
sch2 Red            Hot Ants        9 8 10
sch2 Purple         Cats            8 6 9
sch3 Green          Crickets        5 7 9
sch3 Blue           Sea Otters      6 8 10
sch3 Yellow         Stingers        7 9 8
sch3 Red            Hot Ants        6 9 10
sch3 Purple         Cats            8 10 9
;
run;

proc means data=teams nway missing noprint;
class sch_id color team_name;
var game:;
output out=sum1 sum=;
run;

data sum1;
length colorteam $32.;
set sum1;
colorteam = cats(strip(compress(Color)),strip(compress(team_name)));
run;

proc transpose data=sum1 out=sum2;
by sch_id;
id colorteam;
var gametotal;
run;

Check if this works for you

mklangley
Lapis Lazuli | Level 10

@Petergao1 Actually, if you run the code below, you should see it produce three rows in the output dataset--one for each sch_id. Are you getting a different result? If so, please post the full code/log that you're are running.

data teams;
    input sch_id $ color $ team_name & $10. game1 game2 game3;
    datalines;
sch1 Green          Crickets        10 7 8
sch1 Blue           Sea Otters      10 6 7
sch1 Yellow         Stingers        9 10 9
sch1 Red            Hot Ants        8 9 9
sch1 Purple         Cats            9 9 9
sch2 Green          Crickets        10 9 7
sch2 Blue           Sea Otters      8 7 9
sch2 Yellow         Stingers        7 8 10
sch2 Red            Hot Ants        9 8 10
sch2 Purple         Cats            8 6 9
sch3 Green          Crickets        5 7 9
sch3 Blue           Sea Otters      6 8 10
sch3 Yellow         Stingers        7 9 8
sch3 Red            Hot Ants        6 9 10
sch3 Purple         Cats            8 10 9
;
run;

%macro newvars(dsn); %macro _; %mend _;
    data _null_;
        set &dsn end=end;
        count+1;
        call symputx('macvar'||left(count),compress(color)||compress(team_name)||"Total");
        if end then call symputx('max',count);
    run;

    data teamscores;
        set &dsn;
        by sch_id;
        %do i = 1 %to &max;
            if _n_=&i then do;
                &&macvar&i=sum(of game1-game3);
                retain &&macvar&i;
                keep sch_id &&macvar&i;
            end;
        %end;
        if last.sch_id then output;
    run;
%mend newvars;
%newvars(teams);

proc print; run;

My output:

mklangley_0-1593464380361.png

 

Petergao1
Fluorite | Level 6

Dear mklangley,

 

Thanks for your continued attention and help. Today I pasted your codes into my SAS EG, it worked! I got output of four lines, each row is a unique sch_id. The new columns are group_color combined for the sum, This is exactly what I am looking for.

Thank you so much.

 

Next I will modify my codes according to my real world data structure and apply them to my dataset to see if I can get them out as desired.

 

Thank you everyone who read and gave your help in many ways.

Best wishes, see you around.

Peter

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 882 views
  • 1 like
  • 4 in conversation