BookmarkSubscribeRSS Feed
aninda_metal
Calcite | Level 5

I have a data set like below.

 

PTID            new_stdt         new_end

P3747440   02Nov2011     29Dec2011
P3747440   03Nov2011     02Dec2011
P3747440   01Dec2011    30Dec2011
P3747440   30Dec2011    31Dec2011

 

I want to line up the dates .So that my new data looks like below.

 

PTID            new_stdt         new_end

P3747440    2-Nov-11        2-Nov-11
P3747440    3-Nov-11        30-Nov-11
P3747440    1-Dec-11        1-Dec-11
P3747440    2-Dec-11        29-Dec-11
P3747440    30-Dec-11      30-Dec-11
P3747440    31-Dec-11      31-Dec-11

 

How can I do this ?Any leads will be appreciated.

 

Regards,

Aninda

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

What is the logic to go from have to want, I see no logical reason why row 1 would have 2nov11 and the second row would have 30nov11.

aninda_metal
Calcite | Level 5

Hello.

Thanks for your reply.My objective is to make the dates consecutive with no gaps.

Regards,

Aninda

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Am in a meeting so can't look too much, but I think your want data may be wrong.  This code should get you near:

data have;
  informat new_stdt new_end date9.;
  input ptid $ new_stdt new_end;
  format new_stdt new_end date9.;
datalines;
P3747440   02Nov2011     29Dec2011
P3747440   03Nov2011     02Dec2011
P3747440   01Dec2011    30Dec2011
P3747440   30Dec2011    31Dec2011
;
run;

data inter;
  set have (keep=ptid new_stdt) have (keep=ptid new_end rename=(new_end=new_stdt));
run;
proc sort data=inter;
  by ptid new_stdt;
run;
data want;
  set inter (rename=(new_stdt=stdt));
  format new_stdt new_end date9.;
  by ptid;
  retain lstd;
  if first.ptid then lstd=stdt;
  else do;
    new_stdt=lstd;
    new_end=stdt-1;
    output;
  end;
  lstd=stdt;
run;
JohnHoughton
Quartz | Level 8

Hi @aninda_metal

Like @RW9, I think your expected output is wrong. I  can't understand why your expected new data doesn't have 7 rows and include a row with new_stdt & new_end=29-Dec-11

 

PTID            new_stdt         new_end

P3747440    2-Nov-11        2-Nov-11
P3747440    3-Nov-11        30-Nov-11
P3747440    1-Dec-11        1-Dec-11
P3747440    2-Dec-11        28-Dec-11

P3747440    29-Dec-11      29-Dec-11

P3747440    30-Dec-11      30-Dec-11
P3747440    31-Dec-11      31-Dec-11

 

I think basically you are taking all your uniques dates regardless of whether it's a start or end date, and then using those to create a new start_date. The corresponding end date is then equal to the next start date -1. Except for the last date, where it looks like you want new_end=new_stdt

 

In my attempt to code this I've reversed the date order because i'm comparing to the next date. It starts with @RW9's code as far as the INTER data table.

 

data have;
  informat new_stdt new_end date9.;
  input ptid $ new_stdt new_end;
  format new_stdt new_end date9.;
datalines;
P3747440   02Nov2011     29Dec2011
P3747440   03Nov2011     02Dec2011
P3747440   01Dec2011    30Dec2011
P3747440   30Dec2011    31Dec2011
;
run;

data inter;
  set have (keep=ptid new_stdt) have (keep=ptid new_end rename=(new_end=new_stdt));
run;

proc sort data=inter nodup;
by ptid descending new_stdt;
run;

data want (keep= ptid new_stdt new_end);
format new_end date9.;
set inter;
retain lagdate ;
by ptid;
diff=lagdate-new_stdt;
if lagdate-new_stdt in (.,1) then new_end=new_stdt;else new_end=lagdate-1;
output;
if last.ptid then lagdate=. ;else lagdate=new_stdt ;
run;
proc sort data=want; by ptid new_stdt;run;

 

 

JohnHoughton
Quartz | Level 8

Like @RW9 , I think your expected data may be wrong. I don't see why  it doesn't look like this.

PTID            new_stdt         new_end

P3747440    2-Nov-11        2-Nov-11
P3747440    3-Nov-11        30-Nov-11
P3747440    1-Dec-11        1-Dec-11
P3747440    2-Dec-11        28-Dec-11

P3747440    29-Dec-11      29-Dec-11
P3747440    30-Dec-11      30-Dec-11
P3747440    31-Dec-11      31-Dec-11

 

I think you are taking every available date and creating a period between each date & the day before the next date.

Then on the last date, that is the start date and the end date.

 

 @RW9code does this , but needs to include NODUP when sorting data table 'INTER' . 

Also add a line right at the end of the 'DATA WANT' step

 ...
  if last.ptid then do;
  	new_end=stdt;
  	output; 
  end;
run;

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1278 views
  • 1 like
  • 4 in conversation