Hello
I have to following issue.
I have data that includes few rows for each customer.
data aaa;
infile cards;
input CustomerID group $;
cards;
1 a
1 c
2 b
2 a
2 c
3 a
;
run;
I want to create a new data set that contain 1 row for each customer with a calculated string field called: CustomerGroups that concatenate all groups that a customer belong to
CustomerID CustomerGroups
1 a_c
2 a_b_c
3 a
Please see that the concatenate of the groups should be from low to high.
For example customer 2 belongs to groups a b c so the new field should be a_b_c
like this
data aaa;
infile cards;
input CustomerID group $;
cards;
1 a
1 c
2 b
2 a
2 c
3 a
;
run;
data want(keep=CustomerID CustomerGroups);
set aaa;
by CustomerID;
if first.CustomerID then CustomerGroups=group;
else CustomerGroups=catx('_', CustomerGroups, group);
if last.CustomerID then output;
retain CustomerGroups;
run;
Hi,
Look at proc transpose. You could also loop over the data in a data step.
data aaa;
infile cards;
input CustomerID group $;
cards;
1 a
1 c
2 b
2 a
2 c
3 a
;
run;
proc sort data=aaa;
by CustomerID;
run;
proc transpose data=aaa out=aaatp;
by CustomerID;
var group;
run;
data aaa_concat;
set aaatp;
array col col1-col3;
attrib CustomrGroups length=$6;
CustomerGroups = catx(',', of col[*]);
drop _: col:;
run;
HTH,
Cameron.
Because of the requirement to put the items in alphabetical order, it takes a little more programming. For example:
proc transpose data=aaa out=wide (keep=CustomerID col: ) ;
var group;
by CustomerID;
run;
data want;
set wide;
array groups {*} col: ;
call sortc(of col: ) ;
length CustomerGroups $ 30;
do k=1 to dim(groups);
if groups{k} > ' ' then CustomerGroups = catx('_', CustomerGroups, groups{k});
end;
keep CustomerID CustomerGroups;
run;
It's untested code, so might need a small tweak. But it should contain all the proper steps.
data want;
array temp(100) $20 ;
call missing(of temp(*));
do _n_=1 by 1 until( last.CustomerID);
set aaa;
by CustomerID ;
temp(_n_)= Group;
if last.CustomerID then do;call sortc(of temp(*));
customergroups=trim(catx('_', of temp:)) ;
end;
end;
drop temp:;
run;
Until SAS implements an aggregate function to do this in SAS/SQL (hint, hint ) I always do this:
proc sort data=aaa; by customerId group; run;
data want;
length customerGroups $24;
do until(last.customerid);
set aaa; by customerid;
customerGroups = catx("_", customerGroups, group);
end;
drop group;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.