BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jos283
Fluorite | Level 6

 

 I have the following dataset called Results_Before

 

Observation      Fruit ID        Fruit            Color             Shape

       1                     1             Banana          Yellow             Odd

       2                     2               Apple            Red              Circle_1

       3                     2               Apple           Green            Circle_2

       4                     3              Orange         Orange           Circle

       5                     4          Watermelon      Green              Oval

       6                     5               Berry            Blue               Odd_1

       7                     5               Berry            Red                Odd_2

       8                     5               Berry           Purple             Odd_3

       9                     6              Peach          Peach              Circle

 

How would you code to concatenate the columns using Data Step so that I get the output with 2 new columns below:

As you can see, when there are multiple same fruit numbers, the first color_joined and shape_joined remains the same but each row after adds the previous color and shape to color_joined and shape_joined.

 

Results_After

 

Observation    Fruit_ID        Fruit             Color             Shape           Color_Joined             Shape_Joined

       1                     1             Banana          Yellow             Odd                   Yellow                             Odd

       2                     2               Apple            Red              Circle_1                 Red                            Circle_1

       3                     2               Apple           Green            Circle_2            Red, Green               Circle_1, Circle_2

       4                     3              Orange         Orange           Circle                  Orange                          Circle

       5                     4          Watermelon      Green              Oval                   Green                             Oval

       6                     5               Berry            Blue               Odd_1                  Blue                             Odd_1

       7                     5               Berry            Red                Odd_2               Blue, Red                   Odd_1, Odd_2

       8                     5               Berry           Purple             Odd_3         Blue, Red, Purple      Odd_1, Odd_2, Odd_3

       9                     6              Peach          Peach              Circle                   Peach                           Circle

 

I tried only for Color to begin with, with the hope of also including Shape into the code.

The code I have so far doesn't work but here it is:

Also, not sure if Lag() function would help here?

 

Data Results_After;
     Set Results_Before;
           By Observation;
     If First.Fruit_ID then Color_Joined = Color;
Else Do; Color_Joined = catx(',', Color);
End; Run;
Proc Print;
Run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
SuzanneDorinski
Lapis Lazuli | Level 10
data fruit_info;
  length fruit $ 10;
  input observation fruit_id fruit $ color $ shape $;
  datalines;
       1                     1             Banana          Yellow             Odd
       2                     2               Apple            Red              Circle_1
       3                     2               Apple           Green            Circle_2
       4                     3              Orange         Orange           Circle
       5                     4          Watermelon      Green              Oval
       6                     5               Berry            Blue               Odd_1
       7                     5               Berry            Red                Odd_2
       8                     5               Berry           Purple             Odd_3
       9                     6              Peach          Peach              Circle
       ;
run;

proc print data=fruit_info noobs;
  title 'Input data set';
run;

proc sort data=fruit_info;
  by fruit_id;
run;

data fruit_info_retained;
  length color_joined shape_joined $ 50;
  retain color_joined shape_joined;
  set fruit_info;
  by fruit_id;
  if first.fruit_id then
    do;
      color_joined = color;
      shape_joined = shape;
    end;
  else 
    do;
      color_joined = catx(', ', color_joined, color);
      shape_joined = catx(', ', shape_joined, shape);
    end;
run;
  
proc sort data=fruit_info_retained;
  by observation;
run;

proc print data=fruit_info_retained noobs;
  title 'Output data set';
  var observation fruit_id fruit color shape color_joined shape_joined;
run;

View solution in original post

4 REPLIES 4
SuzanneDorinski
Lapis Lazuli | Level 10
data fruit_info;
  length fruit $ 10;
  input observation fruit_id fruit $ color $ shape $;
  datalines;
       1                     1             Banana          Yellow             Odd
       2                     2               Apple            Red              Circle_1
       3                     2               Apple           Green            Circle_2
       4                     3              Orange         Orange           Circle
       5                     4          Watermelon      Green              Oval
       6                     5               Berry            Blue               Odd_1
       7                     5               Berry            Red                Odd_2
       8                     5               Berry           Purple             Odd_3
       9                     6              Peach          Peach              Circle
       ;
run;

proc print data=fruit_info noobs;
  title 'Input data set';
run;

proc sort data=fruit_info;
  by fruit_id;
run;

data fruit_info_retained;
  length color_joined shape_joined $ 50;
  retain color_joined shape_joined;
  set fruit_info;
  by fruit_id;
  if first.fruit_id then
    do;
      color_joined = color;
      shape_joined = shape;
    end;
  else 
    do;
      color_joined = catx(', ', color_joined, color);
      shape_joined = catx(', ', shape_joined, shape);
    end;
run;
  
proc sort data=fruit_info_retained;
  by observation;
run;

proc print data=fruit_info_retained noobs;
  title 'Output data set';
  var observation fruit_id fruit color shape color_joined shape_joined;
run;
benhaz
Calcite | Level 5

Hi,

 

I am trying to concatenate all the rows from each group. I have written a code from which I am getting my desire result in last line(using last.) but problem is , it is only keeping up to 32767 characters. I am looking for a solution, if the character limit reaches 32767 then in the next variable (column) the remaining character should add. I am using the below code but it is not spiting into different variable if limit reaches. I have attached the result. Any help will be appreciated. Many thanks in advance.

 

data part4 (keep=DOC_NUMBER original_variable count);
set part3;
BY DOC_NUMBER;
if FIRST.DOC_NUMBER then
Count = 0;
Count + 1;
run;

 

data part5;
length concatenated_field $ 32767;
retain concatenated_field;
set part4;
by DOC_NUMBER;
if first.DOC_NUMBER then
do;
concatenated_field = original_variable;
end;
else
do;
concatenated_field = catx(', ', concatenated_field, original_variable);
end;
run;

andreas_lds
Jade | Level 19

Just add a retain-statement after by.

retain Color_Joined Shape_Joined;
Ronein
Meteorite | Level 14

Hello

Should retain statement be after by or before????

 

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 5980 views
  • 3 likes
  • 5 in conversation