DATA Step, Macro, Functions and more

Group rows and creata a new column with values in comma

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 138
Accepted Solution

Group rows and creata a new column with values in comma

Hello friends,

I want to ask a question please.

I have a data set which includes 4 rows with 2 columns.

X   Y

1   1

2   2

2   3

2   4

I want to create following data set from my data set

X     Y

1     1

2      2,3,4

Can anyone help how to do it?

 

Thanks

Ronein

 

 


Accepted Solutions
Solution
‎03-06-2018 04:19 AM
Valued Guide
Posts: 564

Re: Group rows and creata a new column with values in comma

This is simple by-group-processing, explained in numerous papers & posts.

 

Untested, assuming that work.want is sorted by x:

data work.want;
   set work.have;
   by x;

   length Y_List $ 100;
   retain Y_List;

   if first.x then Y_List = ' ';
   Y_List = catx(',', Y_List, Y);
   if last.x then output; 

   drop y;
run;

View solution in original post


All Replies
Solution
‎03-06-2018 04:19 AM
Valued Guide
Posts: 564

Re: Group rows and creata a new column with values in comma

This is simple by-group-processing, explained in numerous papers & posts.

 

Untested, assuming that work.want is sorted by x:

data work.want;
   set work.have;
   by x;

   length Y_List $ 100;
   retain Y_List;

   if first.x then Y_List = ' ';
   Y_List = catx(',', Y_List, Y);
   if last.x then output; 

   drop y;
run;
Frequent Contributor
Posts: 138

Re: Group rows and creata a new column with values in comma

Posted in reply to andreas_lds

It is a perfect and very clever solution.

May you please explain the code and steps of process.

I am trying to understand it but couldn't understand.

Thank you so much!

 

Super User
Posts: 13,508

Re: Group rows and creata a new column with values in comma

data work.want;
   set work.have;
   by x;  /* assumes data is sorted by this variable
             automatic variables for variables on the 
             by statement are available to indicate the 
             first or last of a group represented by 
             the variable(s)
           */

   length Y_List $ 100; /* specify a length of the target variable
                           long enough to hold the longest expected
                           result
                        */
   retain Y_List;       /* keep the value of the variable between iterations
                           of the data step
                        */

   if first.x then Y_List = ' ';  /* set the target variable to empty at 
                                     the start of each group of X values
                                  */
   Y_List = catx(',', Y_List, Y); /* catx function combines the previous version of the
                                     Y_list variable with the current inserting a comma
                                     note the comma only appears BETWEEN values, not after
                                     the last
                                  */
   if last.x then output;       /* the IF in this form only sends data to the output 
                                   set when the condition is true. The condition
                                   here is that the current X value is the last 
                                   of this group.
                                 */

   drop y;                      /* remove the current value of Y when the data is output*/
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 3 replies
  • 119 views
  • 1 like
  • 3 in conversation