Hello I am having trouble grouping/concentrating multiple rows into one in DataFlux using group column.
My input data looks like this :
Data INPUT group value 1 abc 1 def 2 ghi 2 jkl 2 mno 3 pqr
and I need to get result like this:
Data OUTPUT group value 1 abc, def 2 ghi, jkl, mno 3 pqr
I found many solutions for this situations for other SAS product, but I can't find one for SAS DataFlux.
I was looking for group by and transpose + concentrate options, but i was unable to find the right one.
Any suggestions how to resolve this ?
Thank you
data have;
input group value $;
datalines;
1 abc
1 def
2 ghi
2 jkl
2 mno
3 pqr
;
data want(drop=value);
do until (last.group);
set have;
length v $ 200;
by group;
v=catx(', ', v, value);
end;
run;
Result:
group v 1 abc, def 2 ghi, jkl, mno 3 pqr
This is how I would do it in SAS code:
data have;
input group value :$3.;
datalines;
1 abc
1 def
2 ghi
2 jkl
2 mno
3 pqr
;
/* determine the length needed for the concatenation */
proc sql noprint;
select max(count) * 4 - 1 into :length_needed
from (select count(*) as count length=3 from have group by group);
quit;
data want;
set have (rename=(value=_value));
by group;
length value $&length_needed.;
retain value;
if first.group
then value = _value;
else value = catx(',',value,_value);
if last.group;
drop _value;
run;
Another possibility:
/*Retrieve the max number of elements to be concatenated in a macrovariable &nbcol*/
proc sql;
select max(frequency)
into: nbcol
from (select count(value) as frequency
from have
group by group);
quit;
data want;
set have;
/*Transpose all elements in as many columns as needed using an array*/
array value_(&nbcol) $;
by group;
retain value_;
if first.group then do;
count=0;
call missing(of value_(*));
end;
count+1;
do i=1 to dim(value_);
value_(count) = value;
end;
/*Concatenate elements in the same variable*/
if last.group then do;
value_cat = catx(', ',of value_(*));
output;
end;
keep group value_cat;
run;
proc print data=want;
id group;
run;
Best,
Thank You guys, but unfortunately I can't execute SAS code in SAS DataFlux Data Management Studio (at least i don't know how yet).
I'm not a DI User, but isn't there a User Written Transformation?
There is so called 'Expression node' but it uses its custom language (Expression Engine Language EEL).
Documentation of EEL is here:https://support.sas.com/documentation/onlinedoc/dfdmstudio/2.3/dfU_ELRG.pdf
data have;
input group value :$3.;
datalines;
1 abc
1 def
2 ghi
2 jkl
2 mno
3 pqr
;
proc transpose data=have out=temp(drop=_name_);
by group;
var value;
run;
data want;
set temp;
length value $100;
value=catx(', ',of col:);
drop col:;
run;
You also might get an alternate approach if you can describe what advantages come from having multiple values in a single variable.
There are also some details that may have been left out such as what is the result for :
Data INPUT group value 1 abc 1 def 1 abc
or
Data INPUT group value 1 abc 1 def 1 2 ghi 2 jkl 2 mno 3 pqr
And should there be any imposed order on the result such that if you have data like
Data INPUT group value 1 abc 1 def 2 def 2 abc 2 mno 3 pqr
Should the result for group 1 and group 2 be the same?
Does the number of potential values vary for each group as you use this process? What might happen when a new value is associated with group 1? Will any of the following processing need to change?
If it is convenience for a person to read then there may be a report solution that provides such without actually transforming data.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.