## Group rows and creata a new column with values in comma

Solved
Frequent Contributor
Posts: 138

# Group rows and creata a new column with values in comma

Hello friends,

I want to ask a question please.

I have a data set which includes 4 rows with 2 columns.

X   Y

1   1

2   2

2   3

2   4

I want to create following data set from my data set

X     Y

1     1

2      2,3,4

Can anyone help how to do it?

Thanks

Ronein

Accepted Solutions
Solution
‎03-06-2018 04:19 AM
Valued Guide
Posts: 564

## Re: Group rows and creata a new column with values in comma

This is simple by-group-processing, explained in numerous papers & posts.

Untested, assuming that work.want is sorted by x:

```data work.want;
set work.have;
by x;

length Y_List \$ 100;
retain Y_List;

if first.x then Y_List = ' ';
Y_List = catx(',', Y_List, Y);
if last.x then output;

drop y;
run;```

All Replies
Solution
‎03-06-2018 04:19 AM
Valued Guide
Posts: 564

## Re: Group rows and creata a new column with values in comma

This is simple by-group-processing, explained in numerous papers & posts.

Untested, assuming that work.want is sorted by x:

```data work.want;
set work.have;
by x;

length Y_List \$ 100;
retain Y_List;

if first.x then Y_List = ' ';
Y_List = catx(',', Y_List, Y);
if last.x then output;

drop y;
run;```
Frequent Contributor
Posts: 138

## Re: Group rows and creata a new column with values in comma

Posted in reply to andreas_lds

It is a perfect and very clever solution.

May you please explain the code and steps of process.

I am trying to understand it but couldn't understand.

Thank you so much!

Super User
Posts: 13,508

## Re: Group rows and creata a new column with values in comma

```data work.want;
set work.have;
by x;  /* assumes data is sorted by this variable
automatic variables for variables on the
by statement are available to indicate the
first or last of a group represented by
the variable(s)
*/

length Y_List \$ 100; /* specify a length of the target variable
long enough to hold the longest expected
result
*/
retain Y_List;       /* keep the value of the variable between iterations
of the data step
*/

if first.x then Y_List = ' ';  /* set the target variable to empty at
the start of each group of X values
*/
Y_List = catx(',', Y_List, Y); /* catx function combines the previous version of the
Y_list variable with the current inserting a comma
note the comma only appears BETWEEN values, not after
the last
*/
if last.x then output;       /* the IF in this form only sends data to the output
set when the condition is true. The condition
here is that the current X value is the last
of this group.
*/

drop y;                      /* remove the current value of Y when the data is output*/
run;```
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
• 3 replies
• 119 views
• 1 like
• 3 in conversation