Concatenating strings from multiple rows into single row in table

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 13
Accepted Solution

Concatenating strings from multiple rows into single row in table

 

 I have the following dataset called Results_Before

 

Observation      Fruit ID        Fruit            Color             Shape

       1                     1             Banana          Yellow             Odd

       2                     2               Apple            Red              Circle_1

       3                     2               Apple           Green            Circle_2

       4                     3              Orange         Orange           Circle

       5                     4          Watermelon      Green              Oval

       6                     5               Berry            Blue               Odd_1

       7                     5               Berry            Red                Odd_2

       8                     5               Berry           Purple             Odd_3

       9                     6              Peach          Peach              Circle

 

How would you code to concatenate the columns using Data Step so that I get the output with 2 new columns below:

As you can see, when there are multiple same fruit numbers, the first color_joined and shape_joined remains the same but each row after adds the previous color and shape to color_joined and shape_joined.

 

Results_After

 

Observation    Fruit_ID        Fruit             Color             Shape           Color_Joined             Shape_Joined

       1                     1             Banana          Yellow             Odd                   Yellow                             Odd

       2                     2               Apple            Red              Circle_1                 Red                            Circle_1

       3                     2               Apple           Green            Circle_2            Red, Green               Circle_1, Circle_2

       4                     3              Orange         Orange           Circle                  Orange                          Circle

       5                     4          Watermelon      Green              Oval                   Green                             Oval

       6                     5               Berry            Blue               Odd_1                  Blue                             Odd_1

       7                     5               Berry            Red                Odd_2               Blue, Red                   Odd_1, Odd_2

       8                     5               Berry           Purple             Odd_3         Blue, Red, Purple      Odd_1, Odd_2, Odd_3

       9                     6              Peach          Peach              Circle                   Peach                           Circle

 

I tried only for Color to begin with, with the hope of also including Shape into the code.

The code I have so far doesn't work but here it is:

Also, not sure if Lag() function would help here?

 

Data Results_After;
     Set Results_Before;
           By Observation;
     If First.Fruit_ID then Color_Joined = Color;
Else Do; Color_Joined = catx(',', Color);
End; Run;
Proc Print;
Run;

 


Accepted Solutions
Solution
‎07-31-2017 02:47 PM
Frequent Contributor
Posts: 89

Re: Concatenating strings from multiple rows into single row in table

data fruit_info;
  length fruit $ 10;
  input observation fruit_id fruit $ color $ shape $;
  datalines;
       1                     1             Banana          Yellow             Odd
       2                     2               Apple            Red              Circle_1
       3                     2               Apple           Green            Circle_2
       4                     3              Orange         Orange           Circle
       5                     4          Watermelon      Green              Oval
       6                     5               Berry            Blue               Odd_1
       7                     5               Berry            Red                Odd_2
       8                     5               Berry           Purple             Odd_3
       9                     6              Peach          Peach              Circle
       ;
run;

proc print data=fruit_info noobs;
  title 'Input data set';
run;

proc sort data=fruit_info;
  by fruit_id;
run;

data fruit_info_retained;
  length color_joined shape_joined $ 50;
  retain color_joined shape_joined;
  set fruit_info;
  by fruit_id;
  if first.fruit_id then
    do;
      color_joined = color;
      shape_joined = shape;
    end;
  else 
    do;
      color_joined = catx(', ', color_joined, color);
      shape_joined = catx(', ', shape_joined, shape);
    end;
run;
  
proc sort data=fruit_info_retained;
  by observation;
run;

proc print data=fruit_info_retained noobs;
  title 'Output data set';
  var observation fruit_id fruit color shape color_joined shape_joined;
run;

View solution in original post


All Replies
Solution
‎07-31-2017 02:47 PM
Frequent Contributor
Posts: 89

Re: Concatenating strings from multiple rows into single row in table

data fruit_info;
  length fruit $ 10;
  input observation fruit_id fruit $ color $ shape $;
  datalines;
       1                     1             Banana          Yellow             Odd
       2                     2               Apple            Red              Circle_1
       3                     2               Apple           Green            Circle_2
       4                     3              Orange         Orange           Circle
       5                     4          Watermelon      Green              Oval
       6                     5               Berry            Blue               Odd_1
       7                     5               Berry            Red                Odd_2
       8                     5               Berry           Purple             Odd_3
       9                     6              Peach          Peach              Circle
       ;
run;

proc print data=fruit_info noobs;
  title 'Input data set';
run;

proc sort data=fruit_info;
  by fruit_id;
run;

data fruit_info_retained;
  length color_joined shape_joined $ 50;
  retain color_joined shape_joined;
  set fruit_info;
  by fruit_id;
  if first.fruit_id then
    do;
      color_joined = color;
      shape_joined = shape;
    end;
  else 
    do;
      color_joined = catx(', ', color_joined, color);
      shape_joined = catx(', ', shape_joined, shape);
    end;
run;
  
proc sort data=fruit_info_retained;
  by observation;
run;

proc print data=fruit_info_retained noobs;
  title 'Output data set';
  var observation fruit_id fruit color shape color_joined shape_joined;
run;
Super Contributor
Posts: 345

Re: Concatenating strings from multiple rows into single row in table

Just add a retain-statement after by.

retain Color_Joined Shape_Joined;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 123 views
  • 0 likes
  • 3 in conversation