BookmarkSubscribeRSS Feed
jerry898969
Pyrite | Level 9
Hello to all,
I've run into an issue that I need a bit of help with.

I have a table that has a ID, ONum, notes columns.
What i'm trying to do concatenate every notes obs into one obs based on similer
ID & ONum with a space inbetween every concatenation.

ID ONum notes
-- --------- ----------
11 10001 "This is a Test"
11 10001 " of SAS"
11 10002 "Today the weather was good"
11 10002 "Busy Shopping"
44 19999 "Fishing"
55 77777 "Working on car"
55 77777 "Hunting"
55 77777 "Swimming"

After data step I need the data to look like this

ID ONum notes
-- --------- ----------
11 10001 "This is a Test of SAS"
11 10002 "Today the weather was good Busy Shopping"
44 19999 "Fishing"
55 77777 "Working on car Hunting Swimming"


Thank you to anyone who can help
10 REPLIES 10
deleted_user
Not applicable
One approach:


proc sort data=test out=sorted;
by id Onum;
run;

data concat;
length total_notes $ 200 /* need to be the max */;
set sorted;
by ID ONum;

retain total_notes;

if first.Onum then total_notes = notes;
else
total_notes=trim(total_notes) || ' ' || trim(notes);


if last.ONum then output;
run;
jerry898969
Pyrite | Level 9
kmg,
Thank you so much I think that did it.

Do you know if there is a way to put a return between the notes?

Currently the field is defined as length notes $32767. If the field is defined as that big is there anything bigger? I have no control over the field size i'm just afraid that if they have 4 rows that have almost that many characters what will happen when it tries to concatenate them?

Thanks again for all your help
ballardw
Super User
> kmg,
>
> Do you know if there is a way to put a return between
> the notes?

From the first response modify:

if last.ONum then output;
run;

To:
if last.ONum then do;
output;
total_notes= "";
output;/* this puts a blank value for the note field, insert appropriate character code if you want an actual return character */
end;
run;

>
> Currently the field is defined as length notes
> $32767. If the field is defined as that big is there
> anything bigger? I have no control over the field
> size i'm just afraid that if they have 4 rows that
> have almost that many characters what will happen
> when it tries to concatenate them?
>

No clue but some create use of a FILE statement and PUT with @ and @@ may create a text file without that size limitation.
barheat
Fluorite | Level 6
Jerry:

Assume that your data is in a table called temp. First step is to transpose the data.

proc transpose data=temp out=trans;
var notes;
by id onum;
run;

Second step is to use the catx function to concatenate the columns.

data trans;
set trans;
new_notes=catx(' ',col1,col2,col3);
keep id onum new_notes;
run;

The issue for a program is to determine the maximum frequency for the id onum combination. This could be done in a macro to specify the number of col arguments for CATX
jerry898969
Pyrite | Level 9
Barheat,
Thank you so much for your help. I'm going to try what you suggested.

Thank You
deleted_user
Not applicable
Note that the Proc Transpose approach as given assumes that you can specify what the maximum number of strings will be (in your example, it was 3, but it may vary from data set to data set). It would be better to switch to using an array to avoid this limitation.
jerry898969
Pyrite | Level 9
kmg,
I was just going to post a question about that.
I'm going to stick with you example for now. It seems to be working good for now.

Thank you so much to both of you.
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
While using the PROC TRANSPOSE approach, precede the PROC execution with a DATA step that counts and increments a variable every "n" times, then you have flexibility controlling the transposed observation range.

Scott Barry
SBBWorks, Inc.
data_null__
Jade | Level 19
The idea to PROC TRANSPOSE and CATX is good because you don't need to know HOW MANY. Also CATX will print warning if the concatenation will not fit in the variable, see example below.

[pre]
data test;
input ID:$2. ONum:$5. notes &$quote64.;
cards;
11 10001 "This is a Test"
11 10001 " of SAS"
11 10002 "Today the weather was good"
11 10002 "Busy Shopping"
44 19999 "Fishing"
55 77777 "Working on car"
55 77777 "Hunting"
55 77777 "Swimming"
;;;;
run;
proc print;
run;
proc transpose out=wide(drop=_name_);
by id onum;
var notes;
run;
data wide;
length note $35; *To test overrun;
set wide;
note = catx(' ',of col:);
drop col:;
run;
proc print;
run;
[/pre]
ravi_mandal
Calcite | Level 5
Data task;
Input ID ONum notes & $ 30.;
cards;
11 10001 This is a Test
11 10001 of SAS
11 10002 Today the weather was good
11 10002 Busy Shopping
44 19999 Fishing
55 77777 Working on car
55 77777 Hunting
55 77777 Swimming
;
run;


Data task_final;
retain id onum ;
length new $ 50;
do until (last.onum);
set task;
by onum;
new=catx(" ",new,notes);
end;
Keep ID ONum new;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 14896 views
  • 0 likes
  • 7 in conversation