I need help with an issue I am struggling with. I have a table with two variables. The ID variable is a unique identifier and the Cust_ID field is the customer IDs for the unique ID values. An ID can have one or multiple cust_ids. I want to create a third variable multi_cust_id which lists all cust_id values for each IDs. As you can see in my table below for ID '1' there are two cust_ids so the new multi_cust_id has both cust_ids listed in both lines where ID = 1. They all need a single quote surrounding them and be separated by a comma.
So, how do create the multi_cust_id variable?
Below is my made up table. Thanks a lot.
ID | Cust_id | multi_cust_id |
1 | 12345 | 12345', '56789' |
1 | 56789 | 12345', '56789' |
2 | 23456 | 23456', '67890', '34567', '78901' |
2 | 67890 | 23456', '67890', '34567', '78901' |
2 | 34567 | 23456', '67890', '34567', '78901' |
2 | 78901 | 23456', '67890', '34567', '78901' |
3 | 35678 | 35678' |
4 | 54321 | 54321', '23459', '34557', '54390' |
4 | 23459 | 54321', '23459', '34557', '54390' |
4 | 34557 | 54321', '23459', '34557', '54390' |
4 | 54390 | 54321', '23459', '34557', '54390' |
5 | 34889 | 34889' |
Thanks Reeza
I changed the ID values to texts. All three variables are text/string variables. Below is slightly revised table.
ID | Cust_id | multi_cust_id |
ABC | 12345 | 12345', '56789' |
ABC | 56789 | 12345', '56789' |
DEF | 23456 | 23456', '67890', '34567', '78901' |
DEF | 67890 | 23456', '67890', '34567', '78901' |
DEF | 34567 | 23456', '67890', '34567', '78901' |
DEF | 78901 | 23456', '67890', '34567', '78901' |
HIJ | 35678 | 35678' |
KLM | 54321 | 54321', '23459', '34557', '54390' |
KLM | 23459 | 54321', '23459', '34557', '54390' |
KLM | 34557 | 54321', '23459', '34557', '54390' |
KM | 54390 | 54321', '23459', '34557', '54390' |
NOP | 34889 | 34889' |
The linked solutions didn't work for you? It should generate your multi_cust_id and you can then merge that output with your original data to get what you want. You can do it in a single step with a DoW loop but that's advanced coding that's more complicated to modify and update.
The types of your ID variable shouldn't matter to the code.
I'm really curious as to how this format '123456','234567' is really useful. It seems to me adding in quotes and commas just makes it harder to deal with. But anyway ...
data have;
input ID $ Cust_id $12.;
length cust_id1 $ 14;
cust_id1=quote(trim(cust_id),"'");
cards;
ABC 12345
ABC 56789
DEF 23456
DEF 67890
DEF 34567
DEF 78901
HIJ 35678
KLM 54321
KLM 23459
KLM 34557
KM 54390
NOP 34889
;
proc transpose data=have out=have_t;
by id;
var cust_id1;
run;
data want;
set have_t;
multi_cust_id=catx(',',of col:);
run;
data want2;
merge have want(keep=id multi_cust_id);
by id;
drop cust_id1;
run;
I think your explanation is clear, but the need to create mult_cust_id with commas and quotes seems off-target.
If I read your explanation literally, the need for quotes and commas seems to disappear. You get the results with one line per ID, even if mult_cust_id is created without the quotes and without the commas. And this saves a little bit of programming to get this output, and maybe even saves more programming down the road.
You need to make a new variable. It needs to be long enough to hold the longest string you will generate for any of the groups.
data have ;
input ID $ Cust_id $;
cards;
ABC 12345
ABC 56789
DEF 23456
DEF 67890
DEF 34567
DEF 78901
HIJ 35678
KLM 54321
KLM 23459
KLM 34557
KM 54390
NOP 34889
;
data want;
do until (last.id);
set have;
by id;
length multi_cust_id $200;
multi_cust_id = catx(' ',multi_cust_id,cust_id);
end;
do until (last.id);
set have;
by id;
output;
end;
run;
If you really want the quotes, commas and extra spaces then change this line.
multi_cust_id = catx(', ',multi_cust_id,quote(trim(cust_id),"'"));
Make sure to include the extra three characters generated per id when calculating how long the new variable needs to be.
Thanks you all for your reply. You saved me a lot of manual work in Excel.
I ended up using Tom's solution. It gave me exactly what I needed.
Thank you!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.