Hello I am having trouble grouping/concentrating multiple rows into one in DataFlux using group column.
My input data looks like this :
Data INPUT group value 1 abc 1 def 2 ghi 2 jkl 2 mno 3 pqr
and I need to get result like this:
Data OUTPUT group value 1 abc, def 2 ghi, jkl, mno 3 pqr
I found many solutions for this situations for other SAS product, but I can't find one for SAS DataFlux.
I was looking for group by and transpose + concentrate options, but i was unable to find the right one.
Any suggestions how to resolve this ?
Thank you
data have;
input group value $;
datalines;
1 abc
1 def
2 ghi
2 jkl
2 mno
3 pqr
;
data want(drop=value);
do until (last.group);
set have;
length v $ 200;
by group;
v=catx(', ', v, value);
end;
run;
Result:
group v 1 abc, def 2 ghi, jkl, mno 3 pqr
This is how I would do it in SAS code:
data have;
input group value :$3.;
datalines;
1 abc
1 def
2 ghi
2 jkl
2 mno
3 pqr
;
/* determine the length needed for the concatenation */
proc sql noprint;
select max(count) * 4 - 1 into :length_needed
from (select count(*) as count length=3 from have group by group);
quit;
data want;
set have (rename=(value=_value));
by group;
length value $&length_needed.;
retain value;
if first.group
then value = _value;
else value = catx(',',value,_value);
if last.group;
drop _value;
run;
Another possibility:
/*Retrieve the max number of elements to be concatenated in a macrovariable &nbcol*/
proc sql;
select max(frequency)
into: nbcol
from (select count(value) as frequency
from have
group by group);
quit;
data want;
set have;
/*Transpose all elements in as many columns as needed using an array*/
array value_(&nbcol) $;
by group;
retain value_;
if first.group then do;
count=0;
call missing(of value_(*));
end;
count+1;
do i=1 to dim(value_);
value_(count) = value;
end;
/*Concatenate elements in the same variable*/
if last.group then do;
value_cat = catx(', ',of value_(*));
output;
end;
keep group value_cat;
run;
proc print data=want;
id group;
run;
Best,
Thank You guys, but unfortunately I can't execute SAS code in SAS DataFlux Data Management Studio (at least i don't know how yet).
I'm not a DI User, but isn't there a User Written Transformation?
There is so called 'Expression node' but it uses its custom language (Expression Engine Language EEL).
Documentation of EEL is here:https://support.sas.com/documentation/onlinedoc/dfdmstudio/2.3/dfU_ELRG.pdf
data have;
input group value :$3.;
datalines;
1 abc
1 def
2 ghi
2 jkl
2 mno
3 pqr
;
proc transpose data=have out=temp(drop=_name_);
by group;
var value;
run;
data want;
set temp;
length value $100;
value=catx(', ',of col:);
drop col:;
run;
You also might get an alternate approach if you can describe what advantages come from having multiple values in a single variable.
There are also some details that may have been left out such as what is the result for :
Data INPUT group value 1 abc 1 def 1 abc
or
Data INPUT group value 1 abc 1 def 1 2 ghi 2 jkl 2 mno 3 pqr
And should there be any imposed order on the result such that if you have data like
Data INPUT group value 1 abc 1 def 2 def 2 abc 2 mno 3 pqr
Should the result for group 1 and group 2 be the same?
Does the number of potential values vary for each group as you use this process? What might happen when a new value is associated with group 1? Will any of the following processing need to change?
If it is convenience for a person to read then there may be a report solution that provides such without actually transforming data.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.