- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Good morning.
I am attempting to concatenate rows of data from multiple line onto one line. I can't see to get this work. I've deconstructed it to see if I can tell what's going on but I can't see to locate the error. What I end up with is a single line per ID_NUMBER with only the first comment. I don't get the subsequent concatenations for the same ID_NUMBER.
I'm sure there must be a better way to do this but I'm trying to understand how such a data step works.
HAVE | |||
ID_NUMBER | COMMENT_NUMBER | COMMENT_SEQUENCE_NUMBER | INCOMING_COMMENT |
1 | 10 | 1 | This person has four lines |
1 | 10 | 2 | on their comment. I would like all four |
1 | 10 | 3 | on the same line. |
1 | 10 | 4 | This is the final line for this comment. |
1 | 15 | 1 | Jack and Jill went up the |
1 | 15 | 2 | hill to fetch a pail |
1 | 15 | 3 | of water. |
2 | 16 | 1 | I wish the sky was purple |
2 | 17 | 1 | If the sky was purple can you |
2 | 17 | 2 | imagine the sunsets? |
3 | 22 | 1 | Sand paintings are the best. |
3 | 22 | 2 | You can always get a different |
3 | 22 | 3 | view by changing the orientation. |
3 | 22 | 4 | Just tip the picture and BAM! |
3 | 22 | 5 | You get a brand new landscape |
WANT | |||
ID_NUMBER | COMMENT_NUMBER | INCOMING_COMMENT | |
1 | 10 | This person has four lines on their comment. I would like all four on the same line. This is the final line for this comment. | |
1 | 15 | Jack and Jill went up the hill to fetch a pail of water of water. | |
2 | 16 | I wish the sky was purple | |
2 | 17 | If the sky was purple can you imagine the sunsets? | |
2 | 22 | Sand paintings are the best. You can always get a different view by changing the orientation. Just tip the picture and BAM! You get a brand new landscape |
Thanks very much if someone has the time to help out.
data want (keep = id_number w_comment);
set have (rename=(id_number = in_id_number comment_sequence_number = in_comment_sequence_number comment_line = in_comment))
end = eof_input;
length w_comment $ 1000;
retain id_number comment_sequence_number comment_line w_comment;
/*Initialize the output (retained) columns and return to top of data step.*/
if _n_ = 1 then do;
id_number = in_id_number;
comment_sequence_number = in_comment_sequence_number;
comment_line = in_comment;
w_comment = in_comment;
return;
end;
if id_number = in_id_number and comment_sequence_number = in_comment_sequence_number then do;
w_comment = catx(", ", w_comment, in_comment);
end;
if in_id_number > id_number then do;
output;
id_number = in_id_number;
comment_sequence_number = in_comment_sequence_number;
comment_line = in_comment;
w_comment = in_comment;
return;
end;
/*At end of file on input file write out the last retained record of data.*/
if eof_input then output;
run;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
There's two common methods of doing this, one is using a data step and the second is to transpose.
You have two BY variables, so just make sure to have both in your BY statement.
*create sample data for demonstration;
data have;
infile cards dlm='09'x;
input OrgID Product $ States $;
cards;
1 football DC
1 football VA
1 football MD
2 football CA
3 football NV
3 football CA
;
run;
*Sort - required for both options;
proc sort data=have;
by orgID;
run;
**********************************************************************;
*Use RETAIN and BY group processing to combine the information;
**********************************************************************;
data want_option1;
set have;
by orgID;
length combined $100.;
retain combined;
if first.orgID then
combined=states;
else
combined=catx(', ', combined, states);
if last.orgID then
output;
run;
**********************************************************************;
*Transpose it to a wide format and then combine into a single field;
**********************************************************************;
proc transpose data=have out=wide prefix=state_;
by orgID;
var states;
run;
data want_option2;
set wide;
length combined $100.;
combined=catx(', ', of state_:);
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
like this?
data HAVE;
infile cards4 dlm="|";
input ID_NUMBER COMMENT_NUMBER COMMENT_SEQUENCE_NUMBER INCOMING_COMMENT : $ 100.;
cards4;
1|10|1|This person has four lines
1|10|2|on their comment. I would like all four
1|10|3|on the same line.
1|10|4|This is the final line for this comment.
1|15|1|Jack and Jill went up the
1|15|2|hill to fetch a pail
1|15|3|of water.
2|16|1|I wish the sky was purple
2|17|1|If the sky was purple can you
2|17|2|imagine the sunsets?
3|22|1|Sand paintings are the best.
3|22|2|You can always get a different
3|22|3|view by changing the orientation.
3|22|4|Just tip the picture and BAM!
3|22|5|You get a brand new landscape
;;;;
run;
proc print;
run;
data want(rename=(IC = INCOMING_COMMENT));
set HAVE;
by ID_NUMBER COMMENT_NUMBER;
retain IC;
length IC $ 32767;
if first.COMMENT_NUMBER then IC = " ";
ic = catx(" ", ic, INCOMING_COMMENT);
if last.COMMENT_NUMBER then output;
drop INCOMING_COMMENT COMMENT_SEQUENCE_NUMBER;
run;
proc print;
run;
Bart
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug
"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings
SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
There's two common methods of doing this, one is using a data step and the second is to transpose.
You have two BY variables, so just make sure to have both in your BY statement.
*create sample data for demonstration;
data have;
infile cards dlm='09'x;
input OrgID Product $ States $;
cards;
1 football DC
1 football VA
1 football MD
2 football CA
3 football NV
3 football CA
;
run;
*Sort - required for both options;
proc sort data=have;
by orgID;
run;
**********************************************************************;
*Use RETAIN and BY group processing to combine the information;
**********************************************************************;
data want_option1;
set have;
by orgID;
length combined $100.;
retain combined;
if first.orgID then
combined=states;
else
combined=catx(', ', combined, states);
if last.orgID then
output;
run;
**********************************************************************;
*Transpose it to a wide format and then combine into a single field;
**********************************************************************;
proc transpose data=have out=wide prefix=state_;
by orgID;
var states;
run;
data want_option2;
set wide;
length combined $100.;
combined=catx(', ', of state_:);
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you Reeza.
I never considered using proc transpose in that way. Very simple and efficient suggestion. Thank you.