DATA Step, Macro, Functions and more

How to line up the dates

Reply
New Contributor
Posts: 4

How to line up the dates

[ Edited ]

I have a data set like below.

 

PTID            new_stdt         new_end

P3747440   02Nov2011     29Dec2011
P3747440   03Nov2011     02Dec2011
P3747440   01Dec2011    30Dec2011
P3747440   30Dec2011    31Dec2011

 

I want to line up the dates .So that my new data looks like below.

 

PTID            new_stdt         new_end

P3747440    2-Nov-11        2-Nov-11
P3747440    3-Nov-11        30-Nov-11
P3747440    1-Dec-11        1-Dec-11
P3747440    2-Dec-11        29-Dec-11
P3747440    30-Dec-11      30-Dec-11
P3747440    31-Dec-11      31-Dec-11

 

How can I do this ?Any leads will be appreciated.

 

Regards,

Aninda

Super User
Super User
Posts: 7,400

Re: How to line up the dates

What is the logic to go from have to want, I see no logical reason why row 1 would have 2nov11 and the second row would have 30nov11.

New Contributor
Posts: 4

Re: How to line up the dates

[ Edited ]

Hello.

Thanks for your reply.My objective is to make the dates consecutive with no gaps.

Regards,

Aninda

Super User
Super User
Posts: 7,400

Re: How to line up the dates

Am in a meeting so can't look too much, but I think your want data may be wrong.  This code should get you near:

data have;
  informat new_stdt new_end date9.;
  input ptid $ new_stdt new_end;
  format new_stdt new_end date9.;
datalines;
P3747440   02Nov2011     29Dec2011
P3747440   03Nov2011     02Dec2011
P3747440   01Dec2011    30Dec2011
P3747440   30Dec2011    31Dec2011
;
run;

data inter;
  set have (keep=ptid new_stdt) have (keep=ptid new_end rename=(new_end=new_stdt));
run;
proc sort data=inter;
  by ptid new_stdt;
run;
data want;
  set inter (rename=(new_stdt=stdt));
  format new_stdt new_end date9.;
  by ptid;
  retain lstd;
  if first.ptid then lstd=stdt;
  else do;
    new_stdt=lstd;
    new_end=stdt-1;
    output;
  end;
  lstd=stdt;
run;
Super User
Posts: 6,932

Re: How to line up the dates

Is the rule like that:

a) time ranges must not overlap

b) time ranges need to be split along calender months

?

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 39

Re: How to line up the dates

Hi @aninda_metal

Like @RW9, I think your expected output is wrong. I  can't understand why your expected new data doesn't have 7 rows and include a row with new_stdt & new_end=29-Dec-11

 

PTID            new_stdt         new_end

P3747440    2-Nov-11        2-Nov-11
P3747440    3-Nov-11        30-Nov-11
P3747440    1-Dec-11        1-Dec-11
P3747440    2-Dec-11        28-Dec-11

P3747440    29-Dec-11      29-Dec-11

P3747440    30-Dec-11      30-Dec-11
P3747440    31-Dec-11      31-Dec-11

 

I think basically you are taking all your uniques dates regardless of whether it's a start or end date, and then using those to create a new start_date. The corresponding end date is then equal to the next start date -1. Except for the last date, where it looks like you want new_end=new_stdt

 

In my attempt to code this I've reversed the date order because i'm comparing to the next date. It starts with @RW9's code as far as the INTER data table.

 

data have;
  informat new_stdt new_end date9.;
  input ptid $ new_stdt new_end;
  format new_stdt new_end date9.;
datalines;
P3747440   02Nov2011     29Dec2011
P3747440   03Nov2011     02Dec2011
P3747440   01Dec2011    30Dec2011
P3747440   30Dec2011    31Dec2011
;
run;

data inter;
  set have (keep=ptid new_stdt) have (keep=ptid new_end rename=(new_end=new_stdt));
run;

proc sort data=inter nodup;
by ptid descending new_stdt;
run;

data want (keep= ptid new_stdt new_end);
format new_end date9.;
set inter;
retain lagdate ;
by ptid;
diff=lagdate-new_stdt;
if lagdate-new_stdt in (.,1) then new_end=new_stdt;else new_end=lagdate-1;
output;
if last.ptid then lagdate=. ;else lagdate=new_stdt ;
run;
proc sort data=want; by ptid new_stdt;run;

 

 

Contributor
Posts: 39

Re: How to line up the dates

Like @RW9 , I think your expected data may be wrong. I don't see why  it doesn't look like this.

PTID            new_stdt         new_end

P3747440    2-Nov-11        2-Nov-11
P3747440    3-Nov-11        30-Nov-11
P3747440    1-Dec-11        1-Dec-11
P3747440    2-Dec-11        28-Dec-11

P3747440    29-Dec-11      29-Dec-11
P3747440    30-Dec-11      30-Dec-11
P3747440    31-Dec-11      31-Dec-11

 

I think you are taking every available date and creating a period between each date & the day before the next date.

Then on the last date, that is the start date and the end date.

 

 @RW9code does this , but needs to include NODUP when sorting data table 'INTER' . 

Also add a line right at the end of the 'DATA WANT' step

 ...
  if last.ptid then do;
  	new_end=stdt;
  	output; 
  end;
run;

 

Ask a Question
Discussion stats
  • 6 replies
  • 368 views
  • 1 like
  • 4 in conversation