I'm working on dispensed medication data. The variables in this data set are sold date, daily supply, daily dose, calculated medication end date, and cluster means the dispense was continuous without gap being >=15 days. If the last medication ending date was > 15 days apart from another opioids medication dispense date, the new cluster started.
solddt | rxsup | dose | med_end_dt | cluster |
09/10/2009 | 16 | 200 | 9/26/2009 | 1 |
10/02/2009 | 2 | 45 | 10/4/2009 | 1 |
10/06/2009 | 66 | 90.91 | 12/11/2009 | 1 |
10/12/2009 | 16 | 200 | 10/28/2009 | 1 |
10/21/2009 | 12 | 62.5 | 11/2/2009 | 1 |
11/03/2009 | 33 | 181.82 | 12/6/2009 | 1 |
11/12/2009 | 2 | 45 | 11/14/2009 | 1 |
11/12/2009 | 16 | 200 | 11/28/2009 | 1 |
12/04/2009 | 33 | 181.82 | 1/6/2010 | 1 |
12/14/2009 | 25 | 128 | 1/8/2010 | 1 |
12/17/2009 | 2 | 45 | 12/19/2009 | 1 |
01/15/2010 | 5 | 45 | 1/20/2010 | 2 |
02//18/2010 | 25 | 90 | 3/15/2010 | 3 |
I'd like to separate the overlapping records into new observations to get a data set that looks like
t0 | dose | t1 | New daily dose | Cluster |
10-Sep-09 | 200 | 9/26/2009 | 200 | 1 |
2-Oct-09 | 45 | 10/4/2009 | 45 | 1 |
6-Oct-09 | 90.91 | 10/11/2009 | 90.91 | 1 |
12-Oct-09 | 200 | 10/20/2009 | 290.91 | 1 |
21-Oct-09 | 62.5 | 10/28/2009 | 353.41 | 1 |
10/29/2009 | 11/2/2009 | 153.41 | 1 | |
3-Nov-09 | 181.82 | 11/11/2009 | 272.73 | 1 |
12-Nov-09 | 200 | 11/14/2009 | 335.91 | 1 |
11/15/2009 | 16 | 11/28/2009 | 290.91 | 1 |
11/29/2020 | 12/3/2009 | 272.73 | 1 | |
4-Dec-09 | 181.82 | 12/6/2009 | 454.55 | 1 |
12/7/2009 | 12/11/2009 | 272.73 | 1 | |
12/12/2009 | 12/13/2009 | 181.82 | 1 | |
14-Dec-09 | 25 | 12/16/2009 | 309.82 | 1 |
17-Dec-09 | 45 | 12/19/2009 | 354.82 | 1 |
20-Dec-09 | 1/6/2010 | 309.82 | 1 | |
1/7/2010 | 1/8/2010 | 128 | 1 | |
15-Jan-10 | 1/20/2010 | 45 | 2 | |
18-Feb-10 | 3/12/2010 | 90 | 3 |
I was able to identify overlapping records using the lag function, but am not sure how to proceed.
Please provide your input data as a SAS data step, and not as a screen capture. See here for instructions.
data test;
infile datalines delimiter = ',';
input solddt: date9. rxsup dose med_end_dt: date9. cluster ;
format solddt yymmdd10. med_end_dt yymmdd10.;
datalines;
10sep2009, 16, 200, 26sep2009, 1
02oct2009, 2, 45, 04oct2009, 1
06oct2009, 66, 90.91, 11dec2009, 1
12oct2009, 16, 200, 28oct2009, 1
21oct2009, 12, 62.5, 02nov2009, 1
03nov2009, 33, 181.82, 06dec2009, 1
12nov2009, 2, 45, 14nov2009, 1
12nov2009, 16, 200, 28nov2009, 1
04dec2009, 33, 181.82, 06jan2010, 1
14dec2009, 25, 128, 08jan2010, 1
17dec2009, 2, 45, 09dec2009, 1
15jan2010, 5, 45, 20jan2010, 2
18feb2010, 25, 90, 15mar2010, 3
;
run;
The input data is provided above. Thank you.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.