BookmarkSubscribeRSS Feed
SASMom2
Fluorite | Level 6

I need help with an issue I am struggling with. I have a table with two variables. The ID variable is a unique identifier and the Cust_ID field is the customer IDs for the unique ID values. An ID can have one or multiple cust_ids. I want to create a third variable multi_cust_id which lists all cust_id values for each IDs. As you can see in my table below for ID '1' there are two cust_ids so the new multi_cust_id has both cust_ids listed in both lines where ID = 1. They all need a single quote surrounding them and be separated by a comma. 

 

So, how do create the multi_cust_id variable?

 

Below is my made up table. Thanks a lot.

IDCust_idmulti_cust_id
11234512345', '56789'
15678912345', '56789'
22345623456', '67890', '34567', '78901'
26789023456', '67890', '34567', '78901'
23456723456', '67890', '34567', '78901'
27890123456', '67890', '34567', '78901'
33567835678'
45432154321', '23459', '34557', '54390'
42345954321', '23459', '34557', '54390'
43455754321', '23459', '34557', '54390'
45439054321', '23459', '34557', '54390'
53488934889'

 

9 REPLIES 9
Reeza
Super User
What does your original data look like?

Here's an example of two different ways to do this:
https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a
SASMom2
Fluorite | Level 6

Thanks Reeza

I changed the ID values to texts. All three variables are text/string variables. Below is slightly revised table.

 

IDCust_idmulti_cust_id
ABC1234512345', '56789'
ABC5678912345', '56789'
DEF2345623456', '67890', '34567', '78901'
DEF6789023456', '67890', '34567', '78901'
DEF3456723456', '67890', '34567', '78901'
DEF7890123456', '67890', '34567', '78901'
HIJ3567835678'
KLM5432154321', '23459', '34557', '54390'
KLM2345954321', '23459', '34557', '54390'
KLM3455754321', '23459', '34557', '54390'
KM5439054321', '23459', '34557', '54390'
NOP3488934889'
Reeza
Super User

The linked solutions didn't work for you? It should generate your multi_cust_id and you can then merge that output with your original data to get what you want. You can do it in a single step with a DoW loop but that's advanced coding that's more complicated to modify and update.

The types of your ID variable shouldn't matter to the code.

SASMom2
Fluorite | Level 6
I think it would work. I ran it and I think I will use the option 2. I just have to try it out using my table.
Thanks again for your help. I will post the update.
PaigeMiller
Diamond | Level 26

I'm really curious as to how this format '123456','234567' is really useful. It seems to me adding in quotes and commas just makes it harder to deal with. But anyway ...

 

data have;
input ID $	Cust_id	$12.;
length cust_id1 $ 14;
cust_id1=quote(trim(cust_id),"'");
cards;
ABC	12345	
ABC	56789	
DEF	23456	
DEF	67890	
DEF	34567	
DEF	78901	
HIJ	35678	
KLM	54321	
KLM	23459	
KLM	34557	
KM	54390	
NOP	34889
;
proc transpose data=have out=have_t;
    by id;
    var cust_id1;
run;
data want;
    set have_t;
    multi_cust_id=catx(',',of col:);
run;
data want2;
    merge have want(keep=id multi_cust_id);
    by id;
    drop cust_id1;
run;
--
Paige Miller
SASMom2
Fluorite | Level 6
Basically, the muti_cust_id field is for reporting purpose only. At the end of the code when it creates an excel report, it lists cust_id fo each id. So, if we use cust_id field, for the ids that have mutiple cust_ids, there would be multiple lines. But if we use multi_cust_id field, there would be only one line per ID regardless on cust_ids linked to each ID.

Hope that makes sense.
PaigeMiller
Diamond | Level 26

I think your explanation is clear, but the need to create mult_cust_id with commas and quotes seems off-target. 

 

If I read your explanation literally, the need for quotes and commas seems to disappear. You get the results with one line per ID, even if mult_cust_id is created without the quotes and without the commas. And this saves a little bit of programming to get this output, and maybe even saves more programming down the road.

--
Paige Miller
Tom
Super User Tom
Super User

You need to make a new variable.  It needs to be long enough to hold the longest string you will generate for any of the groups.

data have ;
  input ID $ Cust_id $;
cards;
ABC 12345
ABC 56789
DEF 23456
DEF 67890
DEF 34567
DEF 78901
HIJ 35678
KLM 54321
KLM 23459
KLM 34557
KM  54390
NOP 34889
;

data want;
  do until (last.id);
    set have;
    by id;
    length multi_cust_id $200;
    multi_cust_id = catx(' ',multi_cust_id,cust_id);
  end;
  do until (last.id);
    set have;
    by id;
    output;
  end;
run;

If you really want the quotes, commas and extra spaces then change this line.

    multi_cust_id = catx(', ',multi_cust_id,quote(trim(cust_id),"'"));

Make sure to include the extra three characters generated per id when calculating how long the new variable needs to be.

SASMom2
Fluorite | Level 6

Thanks you all for your reply. You saved me a lot of manual work in Excel.

I ended up using Tom's solution. It gave me exactly what I needed.

Thank you!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 508 views
  • 2 likes
  • 4 in conversation