Concatenate

Reply
N/A
Posts: 0

Concatenate

Hi all
i have problem with concatenating rows
here is the sample dataset
pt var1
a101 aaaa
a101 bbbb
a102 cccc
a103 dddd
a103 eeee
a103 fffff
i have to get result like below

pt var1
a101 aaaa, bbbb
a101 cccc
a103 dddd, eeee, ffff
can any one tell me how to solve this
thanks
Frequent Contributor
Posts: 95

Re: Concatenate

The solution I came up with uses SQL and macro variables:

data test;
input var1 $4. var2 $6.;
cards;
a101 aaaa
a101 bbbb
a102 cccc
a103 dddd
a103 eeee
a103 fffff
;
run;

proc sql noprint;;
select count (distinct var1) into :loopcnt
from test;
%let loopcnt = &loopcnt;
select distinct var1 into :V1 - :V&loopcnt
from test;
quit;

%macro loop;
%do i = 1 %to &loopcnt;
proc sql noprint;
select var1, var2 into :Val1, :Val2 separated by ","
from test
where var1 = "&&V&i";
quit;

data tmp;
var1 = "&Val1";
length String $256;
String = "&Val2";
run;

proc datasets library=work nolist;
%if &i = 1 %then %do; delete out; %end;
append base=out data=tmp;
run;
quit;

%end;
%mend loop;
%loop;
SAS Super FREQ
Posts: 8,641

Re: Concatenate

My approach would have been different from a macro approach. I stuck with DATA step for the solution, assuming the data was stored in a file called PTINFO, with the variables VAR1 (patient id) and VAR2:
[pre]

proc sort data=ptinfo out=ptinfo;
by var1;
run;

data newpt(keep=var1 newvar2);
set ptinfo; by var1;
length newvar2 $200 sep $2;
retain sep ', ' newvar2;
if first.var1 then newvar2 = ' ';
newvar2 = catx(sep,newvar2,var2);
if last.var1 then output;
run;

proc print data=newpt;
title 'New PT file';
run;

[/pre]
Yields this result from the proc print:
[pre]
New PT file

Obs var1 newvar2

1 a101 aaaa, bbbb
2 a102 cccc
3 a103 dddd, eeee, fffff

[/pre]
The ability to use BY group processing and FIRST.VAR1 and LAST.VAR1 made it possible to read through every PT observation and build a new variable called NEWVAR2 by concatenating the single value for VAR2 onto the value of NEWVAR2 for every row. The only other things needed to make this work are the reinitialization of NEWVAR1 for the first record for a patient and then to control the output using an explicit OUTPUT statement on the last record for a patient.

cynthia
Post a Question
Discussion Stats
  • 2 replies
  • 284 views
  • 0 likes
  • 3 in conversation