BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
AnimaCh44
Fluorite | Level 6

I have the following dataset.

data have;
input State $ city $ (Stdt endt)(:mmddyy10.) Treatment $;

format Stdt endt mmddyy10.;
datalines;
NY NeyYork 01/01/2010 03/01/2010 Dialysis
MD Baltimore 01/01/2011 08/01/2011 Dialysis
MD Baltimore 02/01/2011 12/01/2011 Heart
MD Baltimore 03/02/2011 02/14/2015 Urine
MD Baltimore 04/15/2011 01/14/2015 Dialysis
MD Baltimore 02/01/2011 05/01/2011 Knee
;
run;

 

I want to split the data by start date/ end date and group treatment based on the date range.

Expected output are as follows. 

State City           Stdt          endt          treatment
NY     NeyYork   1/1/2010 3/1/2010 Dialysis
MD    Baltimore 1/1/2011 1/31/2011 Dialysis
MD    Baltimore 2/1/2011 3/1/2011 Dialysis/Heart/Knee
MD    Baltimore 3/2/2011 5/1/2011 Dialysis/Heart/Knee/Urine
MD    Baltimore 5/2/2011 8/1/2011 Dialysis/Heart/Urine
MD    Baltimore 8/2/2011 12/1/2011 Dialysis/Heart/Urine
MD    Baltimore 12/2/2011 1/14/2015 Dialysis/Urine
MD    Baltimore 1/15/2015 2/14/2015 Urine

 

The logic is group all the treatment within that date range. First date range for Baltimore is 01/01/2011 to 08/01/2011 and treatment is Dialysis. Between 01/01/2011 to 08/01/2011 date range there are other start dates with different end dates and treatment like 02/01/2011, 03/02/2011, 04/15/2011, 02/01/2011.

Main logic is to group the treatment on specific date:- if we pick the random date 07/03/2011 for Baltimore then we have the Dialysis/Heart/Knee as treatment.

if we pick the random date as 09/01/2011 for Baltimore then we have the Heart/Urine/Dialysis as treatment.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
input State $ city $ (Stdt endt)(:mmddyy10.) Treatment $;

format Stdt endt mmddyy10.;
datalines;
NY NeyYork 01/01/2010 03/01/2010 Dialysis
MD Baltimore 01/01/2011 08/01/2011 Dialysis
MD Baltimore 02/01/2011 12/01/2011 Heart
MD Baltimore 03/02/2011 02/14/2015 Urine
MD Baltimore 04/15/2011 01/14/2015 Dialysis
MD Baltimore 02/01/2011 05/01/2011 Knee
;
run;

data temp;
 set have;
 do date=Stdt to endt;
  output;
 end;
 format date mmddyy10.;
 drop Stdt endt;
run;

proc sort data=temp nodupkey;by  State  city date Treatment;run;

data temp2;
 do until(last.date);
   set temp;
   by  State  city date;
   length Treatments $ 100;
   Treatments=catx('/',Treatments,Treatment);
 end;
 drop Treatment;
run;

data want;
 set temp2;
 by State  city Treatments notsorted;
 retain Stdt ;
 if first.Treatments then Stdt=date;
 if last.Treatments then do;endt=date;output;end;
 format Stdt endt mmddyy10.;
 drop date;
run;

Ksharp_0-1699237244927.png

 

View solution in original post

6 REPLIES 6
PaigeMiller
Diamond | Level 26

Please explain the logic here. What are the steps that take you from the input data to the output data?

--
Paige Miller
AnimaCh44
Fluorite | Level 6

I have added the logic on main post.
The logic is group all the treatment within that date range. First date range for Baltimore is 01/01/2011 to 08/01/2011 and treatment is Dialysis. Between 01/01/2011 to 08/01/2011 date range there are other start dates with different end dates and treatment like 02/01/2011, 03/02/2011, 04/15/2011, 02/01/2011.

Main logic is to group the treatment on specific date:- if we pick the random date 07/03/2011 for Baltimore then we have the Dialysis/Heart/Knee as treatment.

if we pick the random date as 09/01/2011 for Baltimore then we have the Heart/Urine/Dialysis as treatment.

Ksharp
Super User
data have;
input State $ city $ (Stdt endt)(:mmddyy10.) Treatment $;

format Stdt endt mmddyy10.;
datalines;
NY NeyYork 01/01/2010 03/01/2010 Dialysis
MD Baltimore 01/01/2011 08/01/2011 Dialysis
MD Baltimore 02/01/2011 12/01/2011 Heart
MD Baltimore 03/02/2011 02/14/2015 Urine
MD Baltimore 04/15/2011 01/14/2015 Dialysis
MD Baltimore 02/01/2011 05/01/2011 Knee
;
run;

data temp;
 set have;
 do date=Stdt to endt;
  output;
 end;
 format date mmddyy10.;
 drop Stdt endt;
run;

proc sort data=temp nodupkey;by  State  city date Treatment;run;

data temp2;
 do until(last.date);
   set temp;
   by  State  city date;
   length Treatments $ 100;
   Treatments=catx('/',Treatments,Treatment);
 end;
 drop Treatment;
run;

data want;
 set temp2;
 by State  city Treatments notsorted;
 retain Stdt ;
 if first.Treatments then Stdt=date;
 if last.Treatments then do;endt=date;output;end;
 format Stdt endt mmddyy10.;
 drop date;
run;

Ksharp_0-1699237244927.png

 

AnimaCh44
Fluorite | Level 6

Thank you for the solution. 

AnimaCh44
Fluorite | Level 6

The solution is fine for the overlap dates. If we have dates without overlap like for New York, we got the start date as 01/01/2010 and the end date as 03/01/2011. But we do not have any program from 03/02/2010 to 12/31/2010 and there should be a separate line for this. 

 

I have added one more line for New York on the previous dataset.

data have;
input State $ city $ (Stdt endt)(:mmddyy10.) Treatment $;

format Stdt endt mmddyy10.;
datalines;
NY NeyYork 01/01/2010 03/01/2010 Dialysis

NY NeyYork 01/01/2011 03/01/2011 Dialysis
MD Baltimore 01/01/2011 08/01/2011 Dialysis
MD Baltimore 02/01/2011 12/01/2011 Heart
MD Baltimore 03/02/2011 02/14/2015 Urine
MD Baltimore 04/15/2011 01/14/2015 Dialysis
MD Baltimore 02/01/2011 05/01/2011 Knee
;
run;

 

I want to split the data by start date/ end date and group treatment based on the date range.

Expected output are as follows.

State City Stdt endt treatment
NY NeyYork 1/1/2010 3/1/2010 Dialysis
NY NeyYork 1/1/2011 3/1/2011 Dialysis
MD Baltimore 1/1/2011 1/31/2011 Dialysis
MD Baltimore 2/1/2011 3/1/2011 Dialysis/Heart/Knee
MD Baltimore 3/2/2011 5/1/2011 Dialysis/Heart/Knee/Urine
MD Baltimore 5/2/2011 12/1/2011 Dialysis/Heart/Urine
MD Baltimore 12/2/2011 1/14/2015 Dialysis/Urine
MD Baltimore 1/15/2015 2/14/2015 Urine

 

The logic is group all the treatment within that date range. First date range for Baltimore is 01/01/2011 to 08/01/2011 and treatment is Dialysis. Between 01/01/2011 to 08/01/2011 date range there are other start dates with different end dates and treatment like 02/01/2011, 03/02/2011, 04/15/2011, 02/01/2011.

Main logic is to group the treatment on specific date:- if we pick the random date 07/03/2011 for Baltimore then we have the Dialysis/Heart/Knee as treatment.

if we pick the random date as 09/01/2011 for Baltimore then we have the Heart/Urine/Dialysis as treatment.

Ksharp
Super User

OK. I did not take into account of this scenario .

Try this one :

 

data have;
input State $ city $ (Stdt endt)(:mmddyy10.) Treatment $;
format Stdt endt mmddyy10.;
datalines;
NY NeyYork 01/01/2010 03/01/2010 Dialysis
NY NeyYork 01/01/2011 03/01/2011 Dialysis
MD Baltimore 01/01/2011 08/01/2011 Dialysis
MD Baltimore 02/01/2011 12/01/2011 Heart
MD Baltimore 03/02/2011 02/14/2015 Urine
MD Baltimore 04/15/2011 01/14/2015 Dialysis
MD Baltimore 02/01/2011 05/01/2011 Knee
;
run;

data temp;
 set have;
 do date=Stdt to endt;
  output;
 end;
 format date mmddyy10.;
 drop Stdt endt;
run;

proc sort data=temp nodupkey;by  State  city date Treatment;run;

data temp2;
 do until(last.date);
   set temp;
   by  State  city date;
   length Treatments $ 100;
   Treatments=catx('/',Treatments,Treatment);
 end;
 drop Treatment;
run;

data temp3;
 set temp2;
 by State  city Treatments notsorted;
 if first.Treatments or dif(date) ne 1 then group+1;
run;
data want;
 set temp3;
 by group;
 retain Stdt ;
 if first.group then Stdt=date;
 if last.group then do;endt=date;output;end;
 format Stdt endt mmddyy10.;
 drop date;
run;

 

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

Autotuning Deep Learning Models Using SAS

Follow along as SAS’ Robert Blanchard explains three aspects of autotuning in a deep learning context: globalized search, localized search and an in parallel method using SAS.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 5302 views
  • 2 likes
  • 3 in conversation