Hello all!
I have a dataset in which people were on any number of medications during their follow-up. I would like to count the number of meds that people were on.
Below is what I have:
Id start_date end_date med
1 1/10/15 7/09/15 A
1 1/05/16 8/01/16 A
1 3/10/15 3/01/16 B
1 4/10/15 12/15/15 C
This is what I would like:
Id start_date end_date count
1 1/10/15 3/10/15 1
1 3/10/15 4/10/15 2
1 4/10/15 7/09/15 3
1 7/09/15 12/15/15 2
1 12/15/15 1/05/16 1
1 1/05/16 3/01/16 2
1 3/01/16 8/01/16 1
Below I have created a diagram that shows what the first dataset has.
Ultimately, any date that a person was either added on to a new medication or had a medication stopped, I would like to update the med count accordingly. That is why the dataset that I want has more rows than the first dataset.
Thank you for any advice!
First, I'd expand the dataset so that for every medication, you get a strict sequence of days. Next sort by id and date, and do a count per day. Then read through that and start a new sequence anytime the count changes.
data int1 (keep=id date med);
set have;
do date = start_date to end_date;
output;
end;
run;
proc sort data=int1;
by id date;
run;
data int2 (keep=id date count);
set int1;
by id date;
if first.date
then count = 1;
else count + 1;
if last.date then output;
run;
data want (keep=id start_date date count rename=(date=end_date));
set int2;
by id count notsorted;
retain start_date;
if first.count then start_date = date;
if last.count then output;
run;
untested, as I'm on my tablet at the airport to Cologne 🙂
First, I'd expand the dataset so that for every medication, you get a strict sequence of days. Next sort by id and date, and do a count per day. Then read through that and start a new sequence anytime the count changes.
data int1 (keep=id date med);
set have;
do date = start_date to end_date;
output;
end;
run;
proc sort data=int1;
by id date;
run;
data int2 (keep=id date count);
set int1;
by id date;
if first.date
then count = 1;
else count + 1;
if last.date then output;
run;
data want (keep=id start_date date count rename=(date=end_date));
set int2;
by id count notsorted;
retain start_date;
if first.count then start_date = date;
if last.count then output;
run;
untested, as I'm on my tablet at the airport to Cologne 🙂
Wow, I adapted your code to my dataset, and it really worked! Thank you so much, KurtBremser!
I'll just note, for anyone else who tries this code, when I first tried it, it gave me an error stating that the "do" function was not working. I realized that I had some rows where the start_date and end_date were missing (for people not on any meds). Once I deleted those rows (in temporary dataset), the code worked perfectly. Then I just had to change the count to 0 for those people deleted from the temporary dataset that were not on meds.
Thank you again!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.