Hi all,
I have a list of different medications that each ID took. I need to consolidate the medications for each ID with a comma and in alphabetical order. There might be repeat/duplicates but they need to be used only once. For example ID#1 has ax1 listed twice but it needs to be used only once i.e. ID #1 was on ax1 and bx2 medications.
Data Have:
data have;
infile datalines dlm=':';
input ID Meds :$10.;
datalines;
1: ax1
1: ax1
1: bx2
2: cx3
2: dx4
2: dx4
3: ax1
3: ax1
4: ax1
4: bx2
4: cx3
4: dx4
5: cx3
6: ax1
6: bx2
7: ax1
7: bx2
7: cx3
7: dx4
7: ex5
;
Run;
ID | Meds |
1 | ax1 |
1 | ax1 |
1 | bx2 |
2 | cx3 |
2 | dx4 |
2 | dx4 |
3 | ax1 |
3 | ax1 |
4 | ax1 |
4 | bx2 |
4 | cx3 |
4 | dx4 |
5 | cx3 |
6 | ax1 |
6 | bx2 |
7 | ax1 |
7 | bx2 |
7 | cx3 |
7 | dx4 |
7 | ex5 |
Data Want:
ID | All_Meds |
1 | ax1, bx2 |
2 | cx3, dx4 |
3 | ax1 |
4 | ax1, bx2, cx3, dx4 |
5 | cx3 |
6 | ax1, bx2 |
7 | ax1, bx2, cx3, dx4, ex5 |
Hi @newsas007 My share of fun-
Please note, The following is not necessarily the most convenient nor the best for your question. However, i fancied some fun as I haven't been active much here lately. By all means, feel free to wait for a much simpler and convenient one. Have fun!
data have;
infile datalines dlm=':';
input ID 1. Meds :$10.;
datalines;
1: ax1
1: ax1
1: bx2
2: cx3
2: dx4
2: dx4
3: ax1
3: ax1
4: ax1
4: bx2
4: cx3
4: dx4
5: cx3
6: ax1
6: bx2
7: ax1
7: bx2
7: cx3
7: dx4
7: ex5
;
Run;
data want;
if _n_=1 then do;
dcl hash h();
h.definekey('meds');
h.definedone();
end;
do until(last.id);
set have;
by id;
if h.check()=0 then continue;
length all_meds $100;
All_Meds=catx(', ',All_Meds,meds);
h.add();
end;
h.clear();
drop meds;
run;
proc print noobs;run;
ID | all_meds |
---|---|
1 | ax1, bx2 |
2 | cx3, dx4 |
3 | ax1 |
4 | ax1, bx2, cx3, dx4 |
5 | cx3 |
6 | ax1, bx2 |
7 | ax1, bx2, cx3, dx4, ex5 |
@newsas007 wrote:
Hi all,
I have a list of different medications that each ID took. I need to consolidate the medications for each ID with a comma and in alphabetical order. There might be repeat/duplicates but they need to be counted only once.
Data Have:
data have; infile datalines dlm=':'; input ID Meds :$10.; datalines; 1: ax1 1: ax1 1: bx2 2: cx3 2: dx4 2: dx4 3: ax1 3: ax1 4: ax1 4: bx2 4: cx3 4: dx4 5: cx3 6: ax1 6: bx2 7: ax1 7: bx2 7: cx3 7: dx4 7: ex5 ; Run;
ID Meds 1 ax1 1 ax1 1 bx2 2 cx3 2 dx4 2 dx4 3 ax1 3 ax1 4 ax1 4 bx2 4 cx3 4 dx4 5 cx3 6 ax1 6 bx2 7 ax1 7 bx2 7 cx3 7 dx4 7 ex5
Data Want:
ID All_Meds 1 ax1, bx2 2 cx3, dx4 3 ax1 4 ax1, bx2, cx3, dx4 5 cx3 6 ax1, bx2 7 ax1, bx2, cx3, dx4, ex5
I don't see any "counting".
Just what are you going to use that compound variable for? It is almost useless for anything except reporting and reporting could provide the same information easier.
@ballardw : I have edited my post. Hope it makes more sense. All, i am trying to do is to create a dataset with combination of all the medications by ID. thanks
@newsas007 wrote:
@ballardw : I have edited my post. Hope it makes more sense. All, i am trying to do is to create a dataset with combination of all the medications by ID. thanks
I ask why that particular form because 1) we seem to get one or more of these requests every week and 2) almost no one will tell me where they they think that particular form of data is particularly useful.
Here is another approach:
data have;
infile datalines dlm=':';
input ID Meds :$10.;
datalines;
1: ax1
1: ax1
1: bx2
2: cx3
2: dx4
2: dx4
3: ax1
3: ax1
4: ax1
4: bx2
4: cx3
4: dx4
5: cx3
6: ax1
6: bx2
7: ax1
7: bx2
7: cx3
7: dx4
7: ex5
;
run;
proc sort data = have;
by ID Meds;
run;
data want;
set have;
by ID Meds;
retain All_Meds;
length All_Meds $ 50;
if first.ID then All_Meds = '';
if first.Meds then All_Meds = catx(',',All_Meds, Meds);
if last.ID;
run;
Hi @newsas007 My share of fun-
Please note, The following is not necessarily the most convenient nor the best for your question. However, i fancied some fun as I haven't been active much here lately. By all means, feel free to wait for a much simpler and convenient one. Have fun!
data have;
infile datalines dlm=':';
input ID 1. Meds :$10.;
datalines;
1: ax1
1: ax1
1: bx2
2: cx3
2: dx4
2: dx4
3: ax1
3: ax1
4: ax1
4: bx2
4: cx3
4: dx4
5: cx3
6: ax1
6: bx2
7: ax1
7: bx2
7: cx3
7: dx4
7: ex5
;
Run;
data want;
if _n_=1 then do;
dcl hash h();
h.definekey('meds');
h.definedone();
end;
do until(last.id);
set have;
by id;
if h.check()=0 then continue;
length all_meds $100;
All_Meds=catx(', ',All_Meds,meds);
h.add();
end;
h.clear();
drop meds;
run;
proc print noobs;run;
ID | all_meds |
---|---|
1 | ax1, bx2 |
2 | cx3, dx4 |
3 | ax1 |
4 | ax1, bx2, cx3, dx4 |
5 | cx3 |
6 | ax1, bx2 |
7 | ax1, bx2, cx3, dx4, ex5 |
I might suggest sorting that data set by Id and Meds before the hash step.
Otherwise the (IMHO limited) usefulness of that data structure is compromised because the order of the values means the combinations are not the same.
For example switch the order of the ID=6 in the first step and the result is not the same as for Id=1 with the same values.
Agree.
This works great. Just wanted to check how to modify if i add another category. By ID and LOT this time.
data have;
infile datalines dlm=':';
input ID 1. Meds :$10. LOT: $4.;
datalines;
1: ax1: 1
1: ax1: 2
1: bx2: 2
2: cx3: 1
2: dx4: 2
2: dx4: 2
3: ax1: 1
3: ax1: 2
4: ax1: 1
4: bx2: 1
4: cx3: 2
4: dx4: 2
5: cx3: 1
6: ax1: 2
6: bx2: 2
7: ax1: 1
7: bx2: 1
7: cx3: 2
7: dx4: 3
7: ex5: 3
;
Run;
Data Have:
ID | Meds | LOT |
1 | ax1 | 1 |
1 | ax1 | 2 |
1 | bx2 | 2 |
2 | cx3 | 1 |
2 | dx4 | 2 |
2 | dx4 | 2 |
3 | ax1 | 1 |
3 | ax1 | 2 |
4 | ax1 | 1 |
4 | bx2 | 1 |
4 | cx3 | 2 |
4 | dx4 | 2 |
5 | cx3 | 1 |
6 | ax1 | 2 |
6 | bx2 | 2 |
7 | ax1 | 1 |
7 | bx2 | 1 |
7 | cx3 | 2 |
7 | dx4 | 3 |
7 | ex5 | 3 |
Data Want:
ID | LOT | All_Meds |
1 | 1 | ax1 |
1 | 2 | ax1, bx2 |
2 | 1 | cx3 |
2 | 2 | dx4 |
3 | 1 | ax1 |
3 | 1 | ax1 |
4 | 1 | ax1, bx2 |
4 | 2 | cx3, dx4 |
5 | 1 | cx3 |
6 | 2 | ax1, bx2 |
7 | 1 | ax1, bx2 |
7 | 2 | cx3 |
7 | 3 | dx4, ex5 |
@newsas007 Please try sorting the data first
by id lot;
And then change the code to-
do until(last.lot);
set have;
by id lot;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.