Hello
I have to following issue.
I have data that includes few rows for each customer.
data aaa;
infile cards;
input CustomerID group $;
cards;
1 a
1 c
2 b
2 a
2 c
3 a
;
run;
I want to create a new data set that contain 1 row for each customer with a calculated string field called: CustomerGroups that concatenate all groups that a customer belong to
CustomerID CustomerGroups
1 a_c
2 a_b_c
3 a
Please see that the concatenate of the groups should be from low to high.
For example customer 2 belongs to groups a b c so the new field should be a_b_c
like this
data aaa;
infile cards;
input CustomerID group $;
cards;
1 a
1 c
2 b
2 a
2 c
3 a
;
run;
data want(keep=CustomerID CustomerGroups);
set aaa;
by CustomerID;
if first.CustomerID then CustomerGroups=group;
else CustomerGroups=catx('_', CustomerGroups, group);
if last.CustomerID then output;
retain CustomerGroups;
run;
Hi,
Look at proc transpose. You could also loop over the data in a data step.
data aaa;
infile cards;
input CustomerID group $;
cards;
1 a
1 c
2 b
2 a
2 c
3 a
;
run;
proc sort data=aaa;
by CustomerID;
run;
proc transpose data=aaa out=aaatp;
by CustomerID;
var group;
run;
data aaa_concat;
set aaatp;
array col col1-col3;
attrib CustomrGroups length=$6;
CustomerGroups = catx(',', of col[*]);
drop _: col:;
run;
HTH,
Cameron.
Because of the requirement to put the items in alphabetical order, it takes a little more programming. For example:
proc transpose data=aaa out=wide (keep=CustomerID col: ) ;
var group;
by CustomerID;
run;
data want;
set wide;
array groups {*} col: ;
call sortc(of col: ) ;
length CustomerGroups $ 30;
do k=1 to dim(groups);
if groups{k} > ' ' then CustomerGroups = catx('_', CustomerGroups, groups{k});
end;
keep CustomerID CustomerGroups;
run;
It's untested code, so might need a small tweak. But it should contain all the proper steps.
data want;
array temp(100) $20 ;
call missing(of temp(*));
do _n_=1 by 1 until( last.CustomerID);
set aaa;
by CustomerID ;
temp(_n_)= Group;
if last.CustomerID then do;call sortc(of temp(*));
customergroups=trim(catx('_', of temp:)) ;
end;
end;
drop temp:;
run;
Until SAS implements an aggregate function to do this in SAS/SQL (hint, hint ) I always do this:
proc sort data=aaa; by customerId group; run;
data want;
length customerGroups $24;
do until(last.customerid);
set aaa; by customerid;
customerGroups = catx("_", customerGroups, group);
end;
drop group;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.