I'm working on dispensed medication data. The variables in this data set are sold date, daily supply, daily dose, calculated medication end date, and cluster means the dispense was continuous without gap being >=15 days. If the last medication ending date was > 15 days apart from another opioids medication dispense date, the new cluster started.
solddt | rxsup | dose | med_end_dt | cluster |
09/10/2009 | 16 | 200 | 9/26/2009 | 1 |
10/02/2009 | 2 | 45 | 10/4/2009 | 1 |
10/06/2009 | 66 | 90.91 | 12/11/2009 | 1 |
10/12/2009 | 16 | 200 | 10/28/2009 | 1 |
10/21/2009 | 12 | 62.5 | 11/2/2009 | 1 |
11/03/2009 | 33 | 181.82 | 12/6/2009 | 1 |
11/12/2009 | 2 | 45 | 11/14/2009 | 1 |
11/12/2009 | 16 | 200 | 11/28/2009 | 1 |
12/04/2009 | 33 | 181.82 | 1/6/2010 | 1 |
12/14/2009 | 25 | 128 | 1/8/2010 | 1 |
12/17/2009 | 2 | 45 | 12/19/2009 | 1 |
01/15/2010 | 5 | 45 | 1/20/2010 | 2 |
02//18/2010 | 25 | 90 | 3/15/2010 | 3 |
I'd like to separate the overlapping records into new observations to get a data set that looks like
t0 | dose | t1 | New daily dose | Cluster |
10-Sep-09 | 200 | 9/26/2009 | 200 | 1 |
2-Oct-09 | 45 | 10/4/2009 | 45 | 1 |
6-Oct-09 | 90.91 | 10/11/2009 | 90.91 | 1 |
12-Oct-09 | 200 | 10/20/2009 | 290.91 | 1 |
21-Oct-09 | 62.5 | 10/28/2009 | 353.41 | 1 |
10/29/2009 | 11/2/2009 | 153.41 | 1 | |
3-Nov-09 | 181.82 | 11/11/2009 | 272.73 | 1 |
12-Nov-09 | 200 | 11/14/2009 | 335.91 | 1 |
11/15/2009 | 16 | 11/28/2009 | 290.91 | 1 |
11/29/2020 | 12/3/2009 | 272.73 | 1 | |
4-Dec-09 | 181.82 | 12/6/2009 | 454.55 | 1 |
12/7/2009 | 12/11/2009 | 272.73 | 1 | |
12/12/2009 | 12/13/2009 | 181.82 | 1 | |
14-Dec-09 | 25 | 12/16/2009 | 309.82 | 1 |
17-Dec-09 | 45 | 12/19/2009 | 354.82 | 1 |
20-Dec-09 | 1/6/2010 | 309.82 | 1 | |
1/7/2010 | 1/8/2010 | 128 | 1 | |
15-Jan-10 | 1/20/2010 | 45 | 2 | |
18-Feb-10 | 3/12/2010 | 90 | 3 |
I was able to identify overlapping records using the lag function, but am not sure how to proceed.
Please provide your input data as a SAS data step, and not as a screen capture. See here for instructions.
data test;
infile datalines delimiter = ',';
input solddt: date9. rxsup dose med_end_dt: date9. cluster ;
format solddt yymmdd10. med_end_dt yymmdd10.;
datalines;
10sep2009, 16, 200, 26sep2009, 1
02oct2009, 2, 45, 04oct2009, 1
06oct2009, 66, 90.91, 11dec2009, 1
12oct2009, 16, 200, 28oct2009, 1
21oct2009, 12, 62.5, 02nov2009, 1
03nov2009, 33, 181.82, 06dec2009, 1
12nov2009, 2, 45, 14nov2009, 1
12nov2009, 16, 200, 28nov2009, 1
04dec2009, 33, 181.82, 06jan2010, 1
14dec2009, 25, 128, 08jan2010, 1
17dec2009, 2, 45, 09dec2009, 1
15jan2010, 5, 45, 20jan2010, 2
18feb2010, 25, 90, 15mar2010, 3
;
run;
The input data is provided above. Thank you.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.