05-03-2018 01:01 AM
I have a dataset that looks like the following.
I want to count the number of days on which one or more drugs are used. I use the following code:
do day=from_date to to_date;
create table counted as
select *, count(distinct drug) as num_drug
group by id, day;
The problem with this code is that it does not tell me which combination each person is on, and it sometimes undercounts or overcounts the number of drugs used. In the above example, I would want the combination of A, B and C_D to be counted as four (C_D is a combination product that contains two drugs), but I get a count of three. To account for this, I created another dataset using the following code:
create table new_counted as
select *, case when drug in ("A_B", "C_D") then num_drug+1
else num_drug end as new_count
group by id, day,new_count desc;
This solves one problem but creates another. In the above example for id 112, it would give me a count of 3, which is wrong (A and A_B together should count as two drugs). I am running in circles here. How can I get both the number of days of total drug use and the type of combination? Thank you!
05-03-2018 04:13 AM
Post test data in the form of a data step; I am not here to type in test data or guess formats. I would say the steps are:
1) Expand your current data, creating a row for each drug, i.e. C_D would be split into two rows.
2) Sort by drug.
3) In a data step, use RETAIN to keep a count of days within each BY group of drug.
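The steps above could be sketched roughly as follows. This is only a sketch under assumptions: no test data was posted, so the dataset name `have`, the date formats and the sample rows are invented, and the variables `id`, `drug`, `from_date` and `to_date` are taken from the code in the question. It also varies the steps slightly: it sorts by id, day and component drug rather than by drug alone, so that the combination used on each day can be built as well.

```sas
/* Hypothetical test data: variable names follow the post, values are invented */
data have;
  input id drug :$8. from_date :date9. to_date :date9.;
  format from_date to_date date9.;
  datalines;
111 A   01JAN2018 03JAN2018
111 B   02JAN2018 03JAN2018
111 C_D 03JAN2018 04JAN2018
112 A   01JAN2018 02JAN2018
112 A_B 02JAN2018 02JAN2018
;
run;

/* Step 1: one row per id, day and component drug (C_D becomes C and D) */
data expanded;
  set have;
  length component $8;
  do day = from_date to to_date;
    do i = 1 to countw(drug, '_');
      component = scan(drug, i, '_');
      output;
    end;
  end;
  format day date9.;
  keep id day component;
run;

/* Step 2: sort and drop duplicate components on the same day
   (A and A_B on the same day contribute A only once) */
proc sort data=expanded out=daily nodupkey;
  by id day component;
run;

/* Step 3: per id and day, count components and build the combination string.
   combo is retained explicitly; the sum statement retains num_drug. */
data per_day;
  set daily;
  by id day;
  length combo $40;   /* assumed long enough for the longest combination */
  retain combo;
  if first.day then do;
    combo = '';
    num_drug = 0;
  end;
  combo = catx('+', combo, component);
  num_drug + 1;
  if last.day then output;
  keep id day combo num_drug;
run;

/* Number of days each id spent on each combination */
proc sql;
  create table days_per_combo as
  select id, combo, num_drug, count(*) as num_days
  from per_day
  group by id, combo, num_drug;
quit;
```

With the invented rows above, id 111 on 03JAN2018 has components A, B, C and D, so `num_drug` is 4, and id 112 on 02JAN2018 has A and B only, so `num_drug` is 2 with `combo` equal to `A+B`. Splitting on the underscore assumes that every combination product is named by joining its components with `_`, as in the question.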