Dear all,
I have a data set with multiple observations for each individual (ID), with information on number of treatment days (days) during a certain time interval (date_from, date_to).
For each individual, I want to combine/merge treatment intervals less then 30 days apart, to end up with the sum of treatment days and total treatment time period for the different observations less then 30 days apart.
Do you have any suggestions how should I write the SAS program to end up with the data arranged as described above?
In the example below, I'll try to illustrate what I mean:
Input data:
data test_1;
input ID days date_from mmddyy12. Date_to mmddyy12. ;
format date_from date_to mmddyy10.;
datalines;
1 10 01/05/2011 01/15/2011
1 8 02/07/2011 02/17/2011
1 5 04/09/2012 04/19/2012
2 9 01/11/2011 01/21/2011
2 2 01/31/2011 02/05/2011
2 3 02/21/2011 04/02/2011
2 5 04/10/2014 05/03/2014
3 12 02/15/2011 03/20/2011
3 10 01/16/2012 02/10/2012
3 3 03/08/2012 03/27/2012
3 2 12/12/2014 02/07/2015
;
run;
proc print data=test_1;
run;
Obs | ID | days | Date_from | Date_to |
1 | 1 | 10 | 01/05/2011 | 01/15/2011 |
2 | 1 | 8 | 02/07/2011 | 02/17/2011 |
3 | 1 | 5 | 04/09/2012 | 04/19/2012 |
4 | 2 | 9 | 01/11/2011 | 01/21/2011 |
5 | 2 | 2 | 01/31/2011 | 02/05/2011 |
6 | 2 | 3 | 02/21/2011 | 04/02/2011 |
7 | 2 | 5 | 04/10/2014 | 05/03/2014 |
8 | 3 | 12 | 02/15/2011 | 03/20/2011 |
9 | 3 | 10 | 01/16/2012 | 02/10/2012 |
10 | 3 | 3 | 03/08/2012 | 03/27/2012 |
11 | 3 | 2 | 12/12/2014 | 02/07/2015 |
This is how I want the new data set to be arranged:
Obs | ID | days | date_from | date_to |
1 | 1 | 18 | 01/05/2011 | 02/17/2011 |
2 | 1 | 5 | 04/09/2012 | 04/19/2012 |
3 | 2 | 14 | 01/11/2011 | 04/02/2011 |
4 | 2 | 5 | 04/10/2014 | 05/03/2014 |
5 | 3 | 12 | 02/15/2011 | 03/20/2011 |
6 | 3 | 13 | 01/16/2012 | 03/27/2012 |
7 | 3 | 2 | 12/12/2014 | 02/07/2015 |
(For ID 1 the two first treatment intervals are combined; for ID 2 the three first treatment intervals are combined; for ID 3 treatment intervals two and three are comined into one observation.)
Any suggestions how to write the code for this?
data test_1;
input ID days date_from : mmddyy12. Date_to : mmddyy12. ;
format date_from date_to mmddyy10.;
datalines;
1 10 01/05/2011 01/15/2011
1 8 02/07/2011 02/17/2011
1 5 04/09/2012 04/19/2012
2 9 01/11/2011 01/21/2011
2 2 01/31/2011 02/05/2011
2 3 02/21/2011 04/02/2011
2 5 04/10/2014 05/03/2014
3 12 02/15/2011 03/20/2011
3 10 01/16/2012 02/10/2012
3 3 03/08/2012 03/27/2012
3 2 12/12/2014 02/07/2015
;
run;
data temp;
set test_1;
by id;
dif=date_from-lag(date_to);
if first.id then call missing(dif);
run;
data temp1;
set temp;
by id;
if first.id or dif>30 then group+1;
run;
proc summary data=temp1;
by id group;
var days date_from date_to;
output out=want(drop=_:) sum(days)=days min(date_from)=date_from max(date_to)=date_to;
run;
data test_1;
input ID days date_from : mmddyy12. Date_to : mmddyy12. ;
format date_from date_to mmddyy10.;
datalines;
1 10 01/05/2011 01/15/2011
1 8 02/07/2011 02/17/2011
1 5 04/09/2012 04/19/2012
2 9 01/11/2011 01/21/2011
2 2 01/31/2011 02/05/2011
2 3 02/21/2011 04/02/2011
2 5 04/10/2014 05/03/2014
3 12 02/15/2011 03/20/2011
3 10 01/16/2012 02/10/2012
3 3 03/08/2012 03/27/2012
3 2 12/12/2014 02/07/2015
;
run;
data temp;
set test_1;
by id;
dif=date_from-lag(date_to);
if first.id then call missing(dif);
run;
data temp1;
set temp;
by id;
if first.id or dif>30 then group+1;
run;
proc summary data=temp1;
by id group;
var days date_from date_to;
output out=want(drop=_:) sum(days)=days min(date_from)=date_from max(date_to)=date_to;
run;
Thank you very much! Works perfectly!
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.