BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Jeff_DOC
Pyrite | Level 9

Good morning. 

 

I am attempting to concatenate rows of data from multiple line onto one line. I can't see to get this work. I've deconstructed it to see if I can tell what's going on but I can't see to locate the error. What I end up with is a single line per ID_NUMBER with only the first comment. I don't get the subsequent concatenations for the same ID_NUMBER. 

 

I'm sure there must be a better way to do this but I'm trying to understand how such a data step works.

 

HAVE      
ID_NUMBER COMMENT_NUMBER COMMENT_SEQUENCE_NUMBER INCOMING_COMMENT
1 10 1 This person has four lines
1 10 2 on their comment. I would like all four
1 10 3 on the same line.
1 10 4 This is the final line for this comment.
1 15 1 Jack and Jill went up the
1 15 2 hill to fetch a pail
1 15 3 of water.
2 16 1 I wish the sky was purple
2 17 1 If the sky was purple can you
2 17 2 imagine the sunsets?
3 22 1 Sand paintings are the best.
3 22 2 You can always get a different
3 22 3 view by changing the orientation.
3 22 4 Just tip the picture and BAM!
3 22 5 You get a brand new landscape
       
WANT      
ID_NUMBER COMMENT_NUMBER   INCOMING_COMMENT
1 10   This person has four lines on their comment. I would like all four on the same line. This is the final line for this comment.
1 15   Jack and Jill went up the hill to fetch a pail of water of water.
2 16   I wish the sky was purple
2 17   If the sky was purple can you imagine the sunsets?
2 22   Sand paintings are the best. You can always get a different view by changing the orientation. Just tip the picture and BAM! You get a brand new landscape

Thanks very much if someone has the time to help out.

data want (keep = id_number w_comment);
	set have (rename=(id_number = in_id_number comment_sequence_number = in_comment_sequence_number comment_line = in_comment))
		end = eof_input;
	length w_comment $ 1000;
	retain id_number comment_sequence_number comment_line w_comment;
	/*Initialize the output (retained) columns and return to top of data step.*/
	if _n_ = 1 then do;
		id_number = in_id_number;
		comment_sequence_number = in_comment_sequence_number;
		comment_line = in_comment;
		w_comment = in_comment;
		return;
	end;
	if id_number = in_id_number and comment_sequence_number = in_comment_sequence_number then do;
		w_comment = catx(", ", w_comment, in_comment);
	end;
	if in_id_number > id_number then do;
		output;
		id_number = in_id_number;
		comment_sequence_number = in_comment_sequence_number;
		comment_line = in_comment;
		w_comment = in_comment;
		return;
	end;
	/*At end of file on input file write out the last retained record of data.*/
	if eof_input then output;
run;
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

There's two common methods of doing this, one is using a data step and the second is to transpose.

 

You have two BY variables, so just make sure to have both in your BY statement. 

 

*create sample data for demonstration;
data have;
    infile cards dlm='09'x;
    input OrgID Product $   States $;
    cards;
1   football    DC
1   football    VA
1   football    MD
2   football    CA
3   football    NV
3   football    CA
;
run;

*Sort - required for both options;
proc sort data=have;
    by orgID;
run;

**********************************************************************;
*Use RETAIN and BY group processing to combine the information;
**********************************************************************;
data want_option1;
    set have;
    by orgID;
    length combined $100.;
    retain combined;

    if first.orgID then
        combined=states;
    else
        combined=catx(', ', combined, states);

    if last.orgID then
        output;
run;

**********************************************************************;
*Transpose it to a wide format and then combine into a single field;
**********************************************************************;
proc transpose data=have out=wide prefix=state_;
    by orgID;
    var states;
run;

data want_option2;
    set wide;
    length combined $100.;
    combined=catx(', ', of state_:);
run;

View solution in original post

3 REPLIES 3
yabwon
Onyx | Level 15

like this?

data HAVE;
infile cards4 dlm="|";
input ID_NUMBER COMMENT_NUMBER COMMENT_SEQUENCE_NUMBER INCOMING_COMMENT : $ 100.;
cards4;
1|10|1|This person has four lines
1|10|2|on their comment. I would like all four
1|10|3|on the same line.
1|10|4|This is the final line for this comment.
1|15|1|Jack and Jill went up the
1|15|2|hill to fetch a pail
1|15|3|of water.
2|16|1|I wish the sky was purple
2|17|1|If the sky was purple can you
2|17|2|imagine the sunsets?
3|22|1|Sand paintings are the best.
3|22|2|You can always get a different
3|22|3|view by changing the orientation.
3|22|4|Just tip the picture and BAM!
3|22|5|You get a brand new landscape
;;;;
run;
proc print;
run;

data want(rename=(IC = INCOMING_COMMENT));
  set HAVE;
  by ID_NUMBER COMMENT_NUMBER;
  retain IC;
  length IC $ 32767;

  if first.COMMENT_NUMBER then IC = " ";

  ic = catx(" ", ic, INCOMING_COMMENT);

  if last.COMMENT_NUMBER then output;

  drop INCOMING_COMMENT COMMENT_SEQUENCE_NUMBER;
run;
proc print;
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Reeza
Super User

There's two common methods of doing this, one is using a data step and the second is to transpose.

 

You have two BY variables, so just make sure to have both in your BY statement. 

 

*create sample data for demonstration;
data have;
    infile cards dlm='09'x;
    input OrgID Product $   States $;
    cards;
1   football    DC
1   football    VA
1   football    MD
2   football    CA
3   football    NV
3   football    CA
;
run;

*Sort - required for both options;
proc sort data=have;
    by orgID;
run;

**********************************************************************;
*Use RETAIN and BY group processing to combine the information;
**********************************************************************;
data want_option1;
    set have;
    by orgID;
    length combined $100.;
    retain combined;

    if first.orgID then
        combined=states;
    else
        combined=catx(', ', combined, states);

    if last.orgID then
        output;
run;

**********************************************************************;
*Transpose it to a wide format and then combine into a single field;
**********************************************************************;
proc transpose data=have out=wide prefix=state_;
    by orgID;
    var states;
run;

data want_option2;
    set wide;
    length combined $100.;
    combined=catx(', ', of state_:);
run;
Jeff_DOC
Pyrite | Level 9

Thank you Reeza.

 

I never considered using proc transpose in that way. Very simple and efficient suggestion. Thank you.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 2472 views
  • 2 likes
  • 3 in conversation