I have a data set like below.
PTID new_stdt new_end
P3747440 02Nov2011 29Dec2011
P3747440 03Nov2011 02Dec2011
P3747440 01Dec2011 30Dec2011
P3747440 30Dec2011 31Dec2011
I want to line up the dates .So that my new data looks like below.
PTID new_stdt new_end
P3747440 2-Nov-11 2-Nov-11
P3747440 3-Nov-11 30-Nov-11
P3747440 1-Dec-11 1-Dec-11
P3747440 2-Dec-11 29-Dec-11
P3747440 30-Dec-11 30-Dec-11
P3747440 31-Dec-11 31-Dec-11
How can I do this ?Any leads will be appreciated.
Regards,
Aninda
What is the logic to go from have to want, I see no logical reason why row 1 would have 2nov11 and the second row would have 30nov11.
Hello.
Thanks for your reply.My objective is to make the dates consecutive with no gaps.
Regards,
Aninda
Am in a meeting so can't look too much, but I think your want data may be wrong. This code should get you near:
data have; informat new_stdt new_end date9.; input ptid $ new_stdt new_end; format new_stdt new_end date9.; datalines; P3747440 02Nov2011 29Dec2011 P3747440 03Nov2011 02Dec2011 P3747440 01Dec2011 30Dec2011 P3747440 30Dec2011 31Dec2011 ; run; data inter; set have (keep=ptid new_stdt) have (keep=ptid new_end rename=(new_end=new_stdt)); run; proc sort data=inter; by ptid new_stdt; run; data want; set inter (rename=(new_stdt=stdt)); format new_stdt new_end date9.; by ptid; retain lstd; if first.ptid then lstd=stdt; else do; new_stdt=lstd; new_end=stdt-1; output; end; lstd=stdt; run;
Is the rule like that:
a) time ranges must not overlap
b) time ranges need to be split along calender months
?
Like @RW9, I think your expected output is wrong. I can't understand why your expected new data doesn't have 7 rows and include a row with new_stdt & new_end=29-Dec-11
PTID new_stdt new_end
P3747440 2-Nov-11 2-Nov-11
P3747440 3-Nov-11 30-Nov-11
P3747440 1-Dec-11 1-Dec-11
P3747440 2-Dec-11 28-Dec-11
P3747440 29-Dec-11 29-Dec-11
P3747440 30-Dec-11 30-Dec-11
P3747440 31-Dec-11 31-Dec-11
I think basically you are taking all your uniques dates regardless of whether it's a start or end date, and then using those to create a new start_date. The corresponding end date is then equal to the next start date -1. Except for the last date, where it looks like you want new_end=new_stdt
In my attempt to code this I've reversed the date order because i'm comparing to the next date. It starts with @RW9's code as far as the INTER data table.
data have;
informat new_stdt new_end date9.;
input ptid $ new_stdt new_end;
format new_stdt new_end date9.;
datalines;
P3747440 02Nov2011 29Dec2011
P3747440 03Nov2011 02Dec2011
P3747440 01Dec2011 30Dec2011
P3747440 30Dec2011 31Dec2011
;
run;
data inter;
set have (keep=ptid new_stdt) have (keep=ptid new_end rename=(new_end=new_stdt));
run;
proc sort data=inter nodup;
by ptid descending new_stdt;
run;
data want (keep= ptid new_stdt new_end);
format new_end date9.;
set inter;
retain lagdate ;
by ptid;
diff=lagdate-new_stdt;
if lagdate-new_stdt in (.,1) then new_end=new_stdt;else new_end=lagdate-1;
output;
if last.ptid then lagdate=. ;else lagdate=new_stdt ;
run;
proc sort data=want; by ptid new_stdt;run;
Like @RW9 , I think your expected data may be wrong. I don't see why it doesn't look like this.
PTID new_stdt new_end
P3747440 2-Nov-11 2-Nov-11
P3747440 3-Nov-11 30-Nov-11
P3747440 1-Dec-11 1-Dec-11
P3747440 2-Dec-11 28-Dec-11
P3747440 29-Dec-11 29-Dec-11
P3747440 30-Dec-11 30-Dec-11
P3747440 31-Dec-11 31-Dec-11
I think you are taking every available date and creating a period between each date & the day before the next date.
Then on the last date, that is the start date and the end date.
@RW9code does this , but needs to include NODUP when sorting data table 'INTER' .
Also add a line right at the end of the 'DATA WANT' step
...
if last.ptid then do;
new_end=stdt;
output;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.