Solved: Consolidated categories

newsas007 · Posted 08-26-2021 04:52 PM

Hi all,

I have a list of different medications that each ID took. I need to consolidate the medications for each ID with a comma and in alphabetical order. There might be repeat/duplicates but they need to be used only once. For example ID#1 has ax1 listed twice but it needs to be used only once i.e. ID #1 was on ax1 and bx2 medications.

Data Have:

data have;
   infile datalines dlm=':';
   input ID Meds :$10.;
 datalines;
1:	ax1
1:	ax1
1:	bx2
2:	cx3
2:	dx4
2:	dx4
3:	ax1
3:	ax1
4:	ax1
4:	bx2
4:	cx3
4:	dx4
5:	cx3
6:	ax1
6:	bx2
7:	ax1
7:	bx2
7:	cx3
7:	dx4
7:	ex5
;
Run;

ID	Meds
1	ax1
1	ax1
1	bx2
2	cx3
2	dx4
2	dx4
3	ax1
3	ax1
4	ax1
4	bx2
4	cx3
4	dx4
5	cx3
6	ax1
6	bx2
7	ax1
7	bx2
7	cx3
7	dx4
7	ex5

Data Want:

ID	All_Meds
1	ax1, bx2
2	cx3, dx4
3	ax1
4	ax1, bx2, cx3, dx4
5	cx3
6	ax1, bx2
7	ax1, bx2, cx3, dx4, ex5

novinosrin · Posted 08-26-2021 07:22 PM

Hi @newsas007 My share of fun-

Please note, The following is not necessarily the most convenient nor the best for your question. However, i fancied some fun as I haven't been active much here lately. By all means, feel free to wait for a much simpler and convenient one. Have fun!


data have;
   infile datalines dlm=':';
   input ID 1. Meds :$10.;
 datalines;
1:	ax1
1:	ax1
1:	bx2
2:	cx3
2:	dx4
2:	dx4
3:	ax1
3:	ax1
4:	ax1
4:	bx2
4:	cx3
4:	dx4
5:	cx3
6:	ax1
6:	bx2
7:	ax1
7:	bx2
7:	cx3
7:	dx4
7:	ex5
;
Run;

data want;
  if _n_=1 then do;
    dcl hash h();
    h.definekey('meds');
	h.definedone();
  end;
  do until(last.id);
    set have;
	by id;
	if h.check()=0 then continue;
	length all_meds $100;
	All_Meds=catx(', ',All_Meds,meds);
	h.add();
  end;
  h.clear();
  drop meds;
run;

proc print noobs;run;

ID	all_meds
1	ax1, bx2
2	cx3, dx4
3	ax1
4	ax1, bx2, cx3, dx4
5	cx3
6	ax1, bx2
7	ax1, bx2, cx3, dx4, ex5

View solution in original post

ballardw · Posted 08-26-2021 05:40 PM

@newsas007 wrote:

Hi all,

I have a list of different medications that each ID took. I need to consolidate the medications for each ID with a comma and in alphabetical order. There might be repeat/duplicates but they need to be counted only once.

Data Have:
data have;
   infile datalines dlm=':';
   input ID Meds :$10.;
 datalines;
1:	ax1
1:	ax1
1:	bx2
2:	cx3
2:	dx4
2:	dx4
3:	ax1
3:	ax1
4:	ax1
4:	bx2
4:	cx3
4:	dx4
5:	cx3
6:	ax1
6:	bx2
7:	ax1
7:	bx2
7:	cx3
7:	dx4
7:	ex5
;
Run;
ID Meds

1 ax1

1 ax1

1 bx2

2 cx3

2 dx4

2 dx4

3 ax1

3 ax1

4 ax1

4 bx2

4 cx3

4 dx4

5 cx3

6 ax1

6 bx2

7 ax1

7 bx2

7 cx3

7 dx4

7 ex5

Data Want:

ID All_Meds

1 ax1, bx2

2 cx3, dx4

3 ax1

4 ax1, bx2, cx3, dx4

5 cx3

6 ax1, bx2

7 ax1, bx2, cx3, dx4, ex5

I don't see any "counting".

Just what are you going to use that compound variable for? It is almost useless for anything except reporting and reporting could provide the same information easier.

newsas007 · Posted 08-26-2021 07:01 PM

@ballardw : I have edited my post. Hope it makes more sense. All, i am trying to do is to create a dataset with combination of all the medications by ID. thanks

ballardw · Posted 08-26-2021 07:29 PM

@newsas007 wrote:

@ballardw : I have edited my post. Hope it makes more sense. All, i am trying to do is to create a dataset with combination of all the medications by ID. thanks

I ask why that particular form because 1) we seem to get one or more of these requests every week and 2) almost no one will tell me where they they think that particular form of data is particularly useful.

SASKiwi · Posted 08-26-2021 07:46 PM

Here is another approach:

data have;
   infile datalines dlm=':';
   input ID Meds :$10.;
 datalines;
1:	ax1
1:	ax1
1:	bx2
2:	cx3
2:	dx4
2:	dx4
3:	ax1
3:	ax1
4:	ax1
4:	bx2
4:	cx3
4:	dx4
5:	cx3
6:	ax1
6:	bx2
7:	ax1
7:	bx2
7:	cx3
7:	dx4
7:	ex5
;
run;

proc sort data = have;
  by ID Meds;
run;

data want;
  set have;
  by ID Meds;
  retain All_Meds;
  length All_Meds $ 50; 
  if first.ID then All_Meds = '';
  if first.Meds then All_Meds = catx(',',All_Meds, Meds);
  if last.ID;
run;

novinosrin · Posted 08-26-2021 07:22 PM

Hi @newsas007 My share of fun-

Please note, The following is not necessarily the most convenient nor the best for your question. However, i fancied some fun as I haven't been active much here lately. By all means, feel free to wait for a much simpler and convenient one. Have fun!


data have;
   infile datalines dlm=':';
   input ID 1. Meds :$10.;
 datalines;
1:	ax1
1:	ax1
1:	bx2
2:	cx3
2:	dx4
2:	dx4
3:	ax1
3:	ax1
4:	ax1
4:	bx2
4:	cx3
4:	dx4
5:	cx3
6:	ax1
6:	bx2
7:	ax1
7:	bx2
7:	cx3
7:	dx4
7:	ex5
;
Run;

data want;
  if _n_=1 then do;
    dcl hash h();
    h.definekey('meds');
	h.definedone();
  end;
  do until(last.id);
    set have;
	by id;
	if h.check()=0 then continue;
	length all_meds $100;
	All_Meds=catx(', ',All_Meds,meds);
	h.add();
  end;
  h.clear();
  drop meds;
run;

proc print noobs;run;

ID	all_meds
1	ax1, bx2
2	cx3, dx4
3	ax1
4	ax1, bx2, cx3, dx4
5	cx3
6	ax1, bx2
7	ax1, bx2, cx3, dx4, ex5

ballardw · Posted 08-26-2021 07:35 PM

@novinosrin

I might suggest sorting that data set by Id and Meds before the hash step.

Otherwise the (IMHO limited) usefulness of that data structure is compromised because the order of the values means the combinations are not the same.

For example switch the order of the ID=6 in the first step and the result is not the same as for Id=1 with the same values.

novinosrin · Posted 08-26-2021 07:50 PM

Agree.

newsas007 · Posted 08-26-2021 07:56 PM

@ballardw @novinosrin

This works great. Just wanted to check how to modify if i add another category. By ID and LOT this time.

data have;
   infile datalines dlm=':';
   input ID 1. Meds :$10. LOT: $4.;
 datalines;
1:	ax1: 1
1:	ax1: 2
1:	bx2: 2
2:	cx3: 1
2:	dx4: 2
2:	dx4: 2
3:	ax1: 1
3:	ax1: 2
4:	ax1: 1
4:	bx2: 1
4:	cx3: 2
4:	dx4: 2
5:	cx3: 1
6:	ax1: 2
6:	bx2: 2
7:	ax1: 1
7:	bx2: 1
7:	cx3: 2
7:	dx4: 3
7:	ex5: 3
;
Run;

Data Have:

ID	Meds	LOT
1	ax1	1
1	ax1	2
1	bx2	2
2	cx3	1
2	dx4	2
2	dx4	2
3	ax1	1
3	ax1	2
4	ax1	1
4	bx2	1
4	cx3	2
4	dx4	2
5	cx3	1
6	ax1	2
6	bx2	2
7	ax1	1
7	bx2	1
7	cx3	2
7	dx4	3
7	ex5	3

Data Want:

ID	LOT	All_Meds
1	1	ax1
1	2	ax1, bx2
2	1	cx3
2	2	dx4
3	1	ax1
3	1	ax1
4	1	ax1, bx2
4	2	cx3, dx4
5	1	cx3
6	2	ax1, bx2
7	1	ax1, bx2
7	2	cx3
7	3	dx4, ex5

novinosrin · Posted 08-26-2021 08:10 PM

@newsas007 Please try sorting the data first

   by id lot;

And then change the code to-

  do until(last.lot);
    set have;
	by id lot;

Consolidated categories

Re: Consolidated categories

Re: Consolidated categories

Re: Consolidated categories

Re: Consolidated categories

Re: Consolidated categories

Re: Consolidated categories

Re: Consolidated categories

Re: Consolidated categories

Re: Consolidated categories

Re: Consolidated categories

SAS Innovate 2025: Register Now

SAS Training: Just a Click Away