Solved: Re: Summarize overlapping medications

lillymaginta · Posted 10-12-2016 11:08 AM

Hi All, I have the following dataset

id rx
1  a
1  b
1  d
2  a 
2  b
3  a
3 c
4  b
5  a
5  d
6  a
6  c
7  a
7  b

I am trying to summarize these combination as aggregate, here is the output I want

a-b-d 1

a-b 2

a-c 2

b 1

a-d 1

ballardw · Posted 10-12-2016 01:37 PM

For data in that form this works:

data have;
   input patient_id rx $;
datalines;
1              a
1              b
1              c
2              a
2              b
3              a
3              b
;
run;

proc sort data=have; 
  by patient_id rx;
run;

proc transpose data=have
   out=trans;
by patient_id;
var rx;
run;

data want;
   set trans;
   array c col: ;
   length rx_list $ 200;
   rx_list= catx('-',of c(*));
   drop col: _name_ ;
run;

But this will only work if there is one RX per record and the RX doesn't repeat, unless you want a-a-b. If you don't want the repeats then add NODUPKEY to the Proc sort.

The sort also is used to keep from getting a-b and b-a as different results.

I am concerned about the patient 4 in your original post though. If you have multiple RX in a single record then you should consider making sure the structure is one RX per patient record which will involve some additional code.

View solution in original post

ballardw · Posted 10-12-2016 11:14 AM

How do you get "a-c 2" and "a-d 1" from that data? You seem to be missing a detail or two on what you are counting and how to count it. At least I think those are supposed to be counts.

How many values may be in RX for a single row?

lillymaginta · Posted 10-12-2016 11:26 AM

For some reason, when I edit the dataset i get it correct but it is not when posted. Here are the correct entries:

id rx
1  a
1  b
1  d
2  a 
2  b
3  a
3 c
4  b
5  a
5  d
6  a
6  c
7  a
7  b

for patient 4 (b), patient 5 (a, d), patient 6 (a, c) and patients 7 (a,b) so you get 2 for a-c because patient 3 had a-c and patient 6 has a-c too. You are right for a-d, the triple combination can be counted under different category so for patient 1, it could be a-b, a-d and b-d and a-b-d but for smipliciy, I would count this only once as a-b-c so patient can be counted only one time in the aggregate measure.

ballardw · Posted 10-12-2016 11:48 AM

Patient? Your example data does not cleary identify "Patient". If the ID is Patient then the example data is incomplete for the summary result shown.

lillymaginta · Posted 10-12-2016 11:58 AM

Apologize if I am not clear, I will simplify my example.

patient_id rx
1              a
1              b
1              c
2              a
2              b
3              a
3              b

The output would be

a-b-c 1

a-b 2

ballardw · Posted 10-12-2016 01:37 PM

For data in that form this works:

data have;
   input patient_id rx $;
datalines;
1              a
1              b
1              c
2              a
2              b
3              a
3              b
;
run;

proc sort data=have; 
  by patient_id rx;
run;

proc transpose data=have
   out=trans;
by patient_id;
var rx;
run;

data want;
   set trans;
   array c col: ;
   length rx_list $ 200;
   rx_list= catx('-',of c(*));
   drop col: _name_ ;
run;

But this will only work if there is one RX per record and the RX doesn't repeat, unless you want a-a-b. If you don't want the repeats then add NODUPKEY to the Proc sort.

The sort also is used to keep from getting a-b and b-a as different results.

I am concerned about the patient 4 in your original post though. If you have multiple RX in a single record then you should consider making sure the structure is one RX per patient record which will involve some additional code.

SAS Innovate 2025: Call for Content

Classroom Training Available!