I have a data set like below.
PTID new_stdt new_end
P3747440   02Nov2011     29Dec2011
P3747440   03Nov2011     02Dec2011
P3747440   01Dec2011    30Dec2011
P3747440   30Dec2011    31Dec2011
I want to line up the dates .So that my new data looks like below.
PTID new_stdt new_end
P3747440    2-Nov-11        2-Nov-11
P3747440    3-Nov-11        30-Nov-11
P3747440    1-Dec-11        1-Dec-11
P3747440    2-Dec-11        29-Dec-11
P3747440    30-Dec-11      30-Dec-11
P3747440    31-Dec-11      31-Dec-11
How can I do this ?Any leads will be appreciated.
Regards,
Aninda
What is the logic to go from have to want, I see no logical reason why row 1 would have 2nov11 and the second row would have 30nov11.
Hello.
Thanks for your reply.My objective is to make the dates consecutive with no gaps.
Regards,
Aninda
Am in a meeting so can't look too much, but I think your want data may be wrong. This code should get you near:
data have;
  informat new_stdt new_end date9.;
  input ptid $ new_stdt new_end;
  format new_stdt new_end date9.;
datalines;
P3747440   02Nov2011     29Dec2011
P3747440   03Nov2011     02Dec2011
P3747440   01Dec2011    30Dec2011
P3747440   30Dec2011    31Dec2011
;
run;
data inter;
  set have (keep=ptid new_stdt) have (keep=ptid new_end rename=(new_end=new_stdt));
run;
proc sort data=inter;
  by ptid new_stdt;
run;
data want;
  set inter (rename=(new_stdt=stdt));
  format new_stdt new_end date9.;
  by ptid;
  retain lstd;
  if first.ptid then lstd=stdt;
  else do;
    new_stdt=lstd;
    new_end=stdt-1;
    output;
  end;
  lstd=stdt;
run;
					
				
			
			
				
			
			
			
			
			
			
			
		Is the rule like that:
a) time ranges must not overlap
b) time ranges need to be split along calender months
?
Like @RW9, I think your expected output is wrong. I can't understand why your expected new data doesn't have 7 rows and include a row with new_stdt & new_end=29-Dec-11
PTID new_stdt new_end
P3747440    2-Nov-11        2-Nov-11
P3747440    3-Nov-11        30-Nov-11
P3747440    1-Dec-11        1-Dec-11
P3747440    2-Dec-11        28-Dec-11
P3747440 29-Dec-11 29-Dec-11
P3747440    30-Dec-11      30-Dec-11
P3747440    31-Dec-11      31-Dec-11
I think basically you are taking all your uniques dates regardless of whether it's a start or end date, and then using those to create a new start_date. The corresponding end date is then equal to the next start date -1. Except for the last date, where it looks like you want new_end=new_stdt
In my attempt to code this I've reversed the date order because i'm comparing to the next date. It starts with @RW9's code as far as the INTER data table.
data have;
  informat new_stdt new_end date9.;
  input ptid $ new_stdt new_end;
  format new_stdt new_end date9.;
datalines;
P3747440   02Nov2011     29Dec2011
P3747440   03Nov2011     02Dec2011
P3747440   01Dec2011    30Dec2011
P3747440   30Dec2011    31Dec2011
;
run;
data inter;
  set have (keep=ptid new_stdt) have (keep=ptid new_end rename=(new_end=new_stdt));
run;
proc sort data=inter nodup;
 by ptid descending new_stdt;
run;
data want (keep= ptid new_stdt new_end);
format new_end date9.;
set inter;
retain lagdate ;
by ptid;
diff=lagdate-new_stdt;
if lagdate-new_stdt in (.,1) then new_end=new_stdt;else new_end=lagdate-1;
output;
if last.ptid then lagdate=. ;else lagdate=new_stdt ;
run;
proc sort data=want; by ptid new_stdt;run;
Like @RW9 , I think your expected data may be wrong. I don't see why it doesn't look like this.
PTID new_stdt new_end
P3747440    2-Nov-11        2-Nov-11
P3747440    3-Nov-11        30-Nov-11
P3747440    1-Dec-11        1-Dec-11
P3747440    2-Dec-11        28-Dec-11
P3747440    29-Dec-11      29-Dec-11
P3747440    30-Dec-11      30-Dec-11
P3747440    31-Dec-11      31-Dec-11
I think you are taking every available date and creating a period between each date & the day before the next date.
Then on the last date, that is the start date and the end date.
@RW9code does this , but needs to include NODUP when sorting data table 'INTER' .
Also add a line right at the end of the 'DATA WANT' step
 ...
  if last.ptid then do;
  	new_end=stdt;
  	output; 
  end;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
