Solved: Re: How to split date range with startdate / enddate and combine progr...

AnimaCh44 · Posted 11-04-2023 09:50 AM

I have the following dataset.

data have;
input State $ city $ (Stdt endt)(:mmddyy10.) Treatment $;

format Stdt endt mmddyy10.;
datalines;
NY NeyYork 01/01/2010 03/01/2010 Dialysis
MD Baltimore 01/01/2011 08/01/2011 Dialysis
MD Baltimore 02/01/2011 12/01/2011 Heart
MD Baltimore 03/02/2011 02/14/2015 Urine
MD Baltimore 04/15/2011 01/14/2015 Dialysis
MD Baltimore 02/01/2011 05/01/2011 Knee
;
run;

I want to split the data by start date/ end date and group treatment based on the date range.

Expected output are as follows.

State City Stdt endt treatment
NY NeyYork 1/1/2010 3/1/2010 Dialysis
MD Baltimore 1/1/2011 1/31/2011 Dialysis
MD Baltimore 2/1/2011 3/1/2011 Dialysis/Heart/Knee
MD Baltimore 3/2/2011 5/1/2011 Dialysis/Heart/Knee/Urine
MD Baltimore 5/2/2011 8/1/2011 Dialysis/Heart/Urine
MD Baltimore 8/2/2011 12/1/2011 Dialysis/Heart/Urine
MD Baltimore 12/2/2011 1/14/2015 Dialysis/Urine
MD Baltimore 1/15/2015 2/14/2015 Urine

The logic is group all the treatment within that date range. First date range for Baltimore is 01/01/2011 to 08/01/2011 and treatment is Dialysis. Between 01/01/2011 to 08/01/2011 date range there are other start dates with different end dates and treatment like 02/01/2011, 03/02/2011, 04/15/2011, 02/01/2011.

Main logic is to group the treatment on specific date:- if we pick the random date 07/03/2011 for Baltimore then we have the Dialysis/Heart/Knee as treatment.

if we pick the random date as 09/01/2011 for Baltimore then we have the Heart/Urine/Dialysis as treatment.

Ksharp · Posted 11-05-2023 09:20 PM

data have;
input State $ city $ (Stdt endt)(:mmddyy10.) Treatment $;

format Stdt endt mmddyy10.;
datalines;
NY NeyYork 01/01/2010 03/01/2010 Dialysis
MD Baltimore 01/01/2011 08/01/2011 Dialysis
MD Baltimore 02/01/2011 12/01/2011 Heart
MD Baltimore 03/02/2011 02/14/2015 Urine
MD Baltimore 04/15/2011 01/14/2015 Dialysis
MD Baltimore 02/01/2011 05/01/2011 Knee
;
run;

data temp;
 set have;
 do date=Stdt to endt;
  output;
 end;
 format date mmddyy10.;
 drop Stdt endt;
run;

proc sort data=temp nodupkey;by  State  city date Treatment;run;

data temp2;
 do until(last.date);
   set temp;
   by  State  city date;
   length Treatments $ 100;
   Treatments=catx('/',Treatments,Treatment);
 end;
 drop Treatment;
run;

data want;
 set temp2;
 by State  city Treatments notsorted;
 retain Stdt ;
 if first.Treatments then Stdt=date;
 if last.Treatments then do;endt=date;output;end;
 format Stdt endt mmddyy10.;
 drop date;
run;

View solution in original post

PaigeMiller · Posted 11-04-2023 10:33 AM

Please explain the logic here. What are the steps that take you from the input data to the output data?

--
Paige Miller

AnimaCh44 · Posted 11-04-2023 10:57 AM

I have added the logic on main post.
The logic is group all the treatment within that date range. First date range for Baltimore is 01/01/2011 to 08/01/2011 and treatment is Dialysis. Between 01/01/2011 to 08/01/2011 date range there are other start dates with different end dates and treatment like 02/01/2011, 03/02/2011, 04/15/2011, 02/01/2011.

Main logic is to group the treatment on specific date:- if we pick the random date 07/03/2011 for Baltimore then we have the Dialysis/Heart/Knee as treatment.

if we pick the random date as 09/01/2011 for Baltimore then we have the Heart/Urine/Dialysis as treatment.

Ksharp · Posted 11-05-2023 09:20 PM

data have;
input State $ city $ (Stdt endt)(:mmddyy10.) Treatment $;

format Stdt endt mmddyy10.;
datalines;
NY NeyYork 01/01/2010 03/01/2010 Dialysis
MD Baltimore 01/01/2011 08/01/2011 Dialysis
MD Baltimore 02/01/2011 12/01/2011 Heart
MD Baltimore 03/02/2011 02/14/2015 Urine
MD Baltimore 04/15/2011 01/14/2015 Dialysis
MD Baltimore 02/01/2011 05/01/2011 Knee
;
run;

data temp;
 set have;
 do date=Stdt to endt;
  output;
 end;
 format date mmddyy10.;
 drop Stdt endt;
run;

proc sort data=temp nodupkey;by  State  city date Treatment;run;

data temp2;
 do until(last.date);
   set temp;
   by  State  city date;
   length Treatments $ 100;
   Treatments=catx('/',Treatments,Treatment);
 end;
 drop Treatment;
run;

data want;
 set temp2;
 by State  city Treatments notsorted;
 retain Stdt ;
 if first.Treatments then Stdt=date;
 if last.Treatments then do;endt=date;output;end;
 format Stdt endt mmddyy10.;
 drop date;
run;

AnimaCh44 · Posted 11-07-2023 08:24 AM

Thank you for the solution.

AnimaCh44 · Posted 11-07-2023 10:14 PM

The solution is fine for the overlap dates. If we have dates without overlap like for New York, we got the start date as 01/01/2010 and the end date as 03/01/2011. But we do not have any program from 03/02/2010 to 12/31/2010 and there should be a separate line for this.

I have added one more line for New York on the previous dataset.

data have;
input State $ city $ (Stdt endt)(:mmddyy10.) Treatment $;

format Stdt endt mmddyy10.;
datalines;
NY NeyYork 01/01/2010 03/01/2010 Dialysis

NY NeyYork 01/01/2011 03/01/2011 Dialysis
MD Baltimore 01/01/2011 08/01/2011 Dialysis
MD Baltimore 02/01/2011 12/01/2011 Heart
MD Baltimore 03/02/2011 02/14/2015 Urine
MD Baltimore 04/15/2011 01/14/2015 Dialysis
MD Baltimore 02/01/2011 05/01/2011 Knee
;
run;

I want to split the data by start date/ end date and group treatment based on the date range.

Expected output are as follows.

State City Stdt endt treatment
NY NeyYork 1/1/2010 3/1/2010 Dialysis
NY NeyYork 1/1/2011 3/1/2011 Dialysis
MD Baltimore 1/1/2011 1/31/2011 Dialysis
MD Baltimore 2/1/2011 3/1/2011 Dialysis/Heart/Knee
MD Baltimore 3/2/2011 5/1/2011 Dialysis/Heart/Knee/Urine
MD Baltimore 5/2/2011 12/1/2011 Dialysis/Heart/Urine
MD Baltimore 12/2/2011 1/14/2015 Dialysis/Urine
MD Baltimore 1/15/2015 2/14/2015 Urine

The logic is group all the treatment within that date range. First date range for Baltimore is 01/01/2011 to 08/01/2011 and treatment is Dialysis. Between 01/01/2011 to 08/01/2011 date range there are other start dates with different end dates and treatment like 02/01/2011, 03/02/2011, 04/15/2011, 02/01/2011.

Main logic is to group the treatment on specific date:- if we pick the random date 07/03/2011 for Baltimore then we have the Dialysis/Heart/Knee as treatment.

if we pick the random date as 09/01/2011 for Baltimore then we have the Heart/Urine/Dialysis as treatment.

Ksharp · Posted 11-08-2023 12:39 AM

OK. I did not take into account of this scenario .

Try this one :

data have;
input State $ city $ (Stdt endt)(:mmddyy10.) Treatment $;
format Stdt endt mmddyy10.;
datalines;
NY NeyYork 01/01/2010 03/01/2010 Dialysis
NY NeyYork 01/01/2011 03/01/2011 Dialysis
MD Baltimore 01/01/2011 08/01/2011 Dialysis
MD Baltimore 02/01/2011 12/01/2011 Heart
MD Baltimore 03/02/2011 02/14/2015 Urine
MD Baltimore 04/15/2011 01/14/2015 Dialysis
MD Baltimore 02/01/2011 05/01/2011 Knee
;
run;

data temp;
 set have;
 do date=Stdt to endt;
  output;
 end;
 format date mmddyy10.;
 drop Stdt endt;
run;

proc sort data=temp nodupkey;by  State  city date Treatment;run;

data temp2;
 do until(last.date);
   set temp;
   by  State  city date;
   length Treatments $ 100;
   Treatments=catx('/',Treatments,Treatment);
 end;
 drop Treatment;
run;

data temp3;
 set temp2;
 by State  city Treatments notsorted;
 if first.Treatments or dif(date) ne 1 then group+1;
run;
data want;
 set temp3;
 by group;
 retain Stdt ;
 if first.group then Stdt=date;
 if last.group then do;endt=date;output;end;
 format Stdt endt mmddyy10.;
 drop date;
run;

How to split date range with startdate / enddate and combine program?

Re: How to split date range with startdate / enddate and combine program?

Re: How to split date range with startdate / enddate and combine program?

Re: How to split date range with startdate / enddate and combine program?

Re: How to split date range with startdate / enddate and combine program?

Re: How to split date range with startdate / enddate and combine program?

Re: How to split date range with startdate / enddate and combine program?

Re: How to split date range with startdate / enddate and combine program?

Registration is open