DATA Step, Macro, Functions and more

create a new date variable conditional on other dates

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 128
Accepted Solution

create a new date variable conditional on other dates

start date  drug end_Date 
01/01/2005 a     02/01/2005
02/02/2005 a    03/01/2005
03/08/2005  a      04/08/2005
06/01/2005  a     07/01/2005

I want to create a dataset were I would have one start date and end date based on a gap of no longer than 7 days between the end of the first period and the start of the second period. For the example above, I would merge all the first three and ignore the four

output 

start date  drug end_Date

01/01/2005    a     04/08/2005


Accepted Solutions
Solution
‎11-05-2016 06:35 AM
Super User
Posts: 10,048

Re: create a new date variable conditional on other dates

Posted in reply to lillymaginta

data one;
informat start_date end_date mmddyy.;
input start_date drug $1. end_Date ;
datalines;
01/01/2005 a 02/01/2005
02/02/2005 a 03/01/2005
03/08/2005 a 04/08/2005
06/01/2005 a 07/01/2005
;
run;
data temp;
 set one;
 by drug;
 if first.drug then group=0;
 if first.drug or start_date-lag(end_date) gt 7 then group+1;
 if group=1;
run;
data want;
 set temp;
 by drug;
 retain start;
 if first.drug then start=start_date;
 if last.drug then output;
 drop group start_date;
 format start end_date date9.;
run;
  
 

View solution in original post


All Replies
Super Contributor
Posts: 279

Re: create a new date variable conditional on other dates

Posted in reply to lillymaginta

data one;
informat start_date end_date mmddyy.;
input start_date drug $1. end_Date ;
datalines;
01/01/2005 a 02/01/2005
02/02/2005 a 03/01/2005
03/08/2005 a 04/08/2005
06/01/2005 a 07/01/2005
;
run;
proc sort;
by drug start_date end_date;
run;
data two;
set one;
by drug;
if first.drug then newstartdate=start_date;
if last.drug then newenddate=end_date;
proc print;
run;
data _start;
set two;
if newstartdate ne .;
data _end;
set two;
if newenddate ne .;
run;
proc sql;
create table join1 as select
s.drug, s.newstartdate, e.newenddate from _start as s
join _end as e
on s.drug=e.drug;
quit;
proc print;
format newstartdate newenddate mmddyy.;
run;

Frequent Contributor
Posts: 128

Re: create a new date variable conditional on other dates

Hi Thank you for the reply. Your code merge all the days together and does not produce the output above. The last entry for the date should not be counted because the end_Date for the period before and the start date of the second period have a gap of more than 7 days. so if the gap is more than 7 days, record the end date for the one before which is 04/08/2005. Can you please adjust your code to account for that? 

Trusted Advisor
Posts: 1,586

Re: create a new date variable conditional on other dates

Posted in reply to lillymaginta

data have;

infile cards;

input start_date ddmmyy10. drug $1. end_date ddmmyy10.;

cards;

start_date drug end_Date 
01/01/2005 a 02/01/2005
02/02/2005 a 03/01/2005
03/08/2005 a 04/08/2005
06/01/2005 a 07/01/2005

; run; 

 

proc sort data=have; by drug start_date; run;

data want(rename = (date1 = start_date date2 = end_date));

  set have;  

    by drug;

       retain date1 date2;

       if first.drug the do;

          date1 = start_date;

          date2 = end_date;

      end; 

      else 

      if start_date = date2 + 1 then date2 = end_date;

      else do;              

              output;

              date1 = start_date;

              date2 = end_date;

              return;

      end; 

      if last.drug then output;

      keep date1 date2;

run;

 

I have not run my code. In case of any difficulty please post the log, to help you more. 

                 

Frequent Contributor
Posts: 128

Re: create a new date variable conditional on other dates

Thank you for posting but you code does not produce the required output 

Solution
‎11-05-2016 06:35 AM
Super User
Posts: 10,048

Re: create a new date variable conditional on other dates

Posted in reply to lillymaginta

data one;
informat start_date end_date mmddyy.;
input start_date drug $1. end_Date ;
datalines;
01/01/2005 a 02/01/2005
02/02/2005 a 03/01/2005
03/08/2005 a 04/08/2005
06/01/2005 a 07/01/2005
;
run;
data temp;
 set one;
 by drug;
 if first.drug then group=0;
 if first.drug or start_date-lag(end_date) gt 7 then group+1;
 if group=1;
run;
data want;
 set temp;
 by drug;
 retain start;
 if first.drug then start=start_date;
 if last.drug then output;
 drop group start_date;
 format start end_date date9.;
run;
  
 

Frequent Contributor
Posts: 128

Re: create a new date variable conditional on other dates

Thank you Ksharp, always a life saver! 

Frequent Contributor
Posts: 128

Re: create a new date variable conditional on other dates

Posted in reply to lillymaginta
if you have multiple ids, and you want to run the above code and generate the above output at the same time for multiple ids, how do you adjust your code for that?
Frequent Contributor
Posts: 128

Re: create a new date variable conditional on other dates

Posted in reply to lillymaginta
data one;
informat start_date end_date mmddyy.;
input id start_date drug $1. end_Date ;
datalines;
1  01/01/2005 a 02/01/2005
1  02/02/2005 a 03/01/2005
1  03/08/2005 a 04/08/2005
1  06/01/2005 a 07/01/2005
2  01/01/2005 a 02/01/2005
2  02/02/2005 a 03/01/2005
2  03/08/2005 a 04/08/2005
2 06/01/2005 a 07/01/2005
3  01/01/2005 a 02/02/2005


;
run;

also if the patient has one period with no other start date within 7 days (such as in patient 3), the code should assign that period as the start and end date for the output. 

Thank you! 

Super User
Posts: 10,048

Re: create a new date variable conditional on other dates

Posted in reply to lillymaginta

data one;
informat start_date end_date mmddyy.;
input id start_date drug $1. end_Date ;
datalines;
1  01/01/2005 a 02/01/2005
1  02/02/2005 a 03/01/2005
1  03/08/2005 a 04/08/2005
1  06/01/2005 a 07/01/2005
2  01/01/2005 a 02/01/2005
2  02/02/2005 a 03/01/2005
2  03/08/2005 a 04/08/2005
2 06/01/2005 a 07/01/2005
3  01/01/2005 a 02/02/2005
;
run;


data temp;
 set one;
 by drug id;
 if first.id then group=0;
 if first.id or start_date-lag(end_date) gt 7 then group+1;
 if group=1;
run;
data want;
 set temp;
 by drug id;
 retain start;
 if first.id then start=start_date;
 if last.id then output;
 drop group start_date;
 format start end_date date9.;
run;
  
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 281 views
  • 1 like
  • 4 in conversation