DATA Step, Macro, Functions and more

Need help on comparing end dates with start date in second line

Reply
Contributor
Posts: 70

Need help on comparing end dates with start date in second line

Dear All..

I have huge data. so Im just taking example and explaining the situation. 

I have data like below

ID   name logline stdtc endtc

101  a 1 01-jan-2013 03-jan-2013

101  a 2 05-jan-2013 06-jan-2013

101  b 1 01-jan-2013 03-jan-2013

101  b 2 05-jan-2013 06-jan-2013

101  c 1 01-jan-2013 03-jan-2013

101  c 2 05-jan-2013 06-jan-2013

102  a 1 01-jan-2013 03-jan-2013

102  a 2 04-jan-2013 06-jan-2013

102  b 1 01-jan-2013 03-jan-2013

102  b 2 04-jan-2013 06-jan-2013

102  b 3 07-jan-2013 10-jan-2013

In my output There should be gap between the end date and start date of the next log line. which means for the id 101 end date is like "03-jan-2013" and log line 2 start date is "05-jan-2013" for the name "a". in the scenario there is gap(one day) in between this. So i should need in the output as if there is a gap as mentioned.

output need to get like below

OUTPUT:

101  a 1 01-jan-2013 03-jan-2013

101  a 2 05-jan-2013 06-jan-2013

101  b 1 01-jan-2013 03-jan-2013

101  b 2 05-jan-2013 06-jan-2013

101  c 1 01-jan-2013 03-jan-2013

101  c 2 05-jan-2013 06-jan-2013

With Best Regards,

Yaswanth J.

Super User
Posts: 19,820

Re: Need help on comparing end dates with start date in second line

Posted in reply to yaswanthj

Will it be by ID and Name or just by ID?

You can use the lag function, here's some untested code.

data want;

set have;

by id name;

prev_date=lag(endtc);

if first.id then prev_date=.;

if start_date-prev_date <1 ....;

run;

Contributor
Posts: 70

Re: Need help on comparing end dates with start date in second line

Hi Reeza,

It Will be Name and group by ID .

Thanks, Yaswanth J.

Contributor
Posts: 70

Re: Need help on comparing end dates with start date in second line

Hi DF

Thanks For your input.

I could not able to get the output wd your code.

Thanks and Regards,

Yaswanth

Frequent Contributor
Frequent Contributor
Posts: 94

Re: Need help on comparing end dates with start date in second line

Posted in reply to yaswanthj

The below might work for you.  It's built on a few assumptions - specifically that ID, Name and Logline uniquely identify a row, and are always in order in your input file, and that Logline is sequential, starts from 1 each time ID or Name changes, and has no gaps in the sequence.

Basically, uses retain to keep track of the previous end date (similar to Reeza's suggestion with lag), and then uses that to identify the second row in each pairing.  Given the above assumption, this can never happen when Logline = 1, so it outputs two lines, with Loglines - 1, and then with Logline as is.  This output is then used as a filter, via a merge.

I'd be interested to see if there's a way of doing this in one step (without resorting to SQL).

data have;

infile datalines dsd dlm=' ';

length ID 8.;

format name $1.;

length logline 8.;

format stdtc ddmmyys10.;

format endtc ddmmyys10.;

input ID name $ logline stdtc:date. endtc:date.;

datalines;

101 a 1 01-jan-2013 03-jan-2013

101 a 2 05-jan-2013 06-jan-2013

101 b 1 01-jan-2013 03-jan-2013

101 b 2 05-jan-2013 06-jan-2013

101 c 1 01-jan-2013 03-jan-2013

101 c 2 05-jan-2013 06-jan-2013

102 a 1 01-jan-2013 03-jan-2013

102 a 2 04-jan-2013 06-jan-2013

102 b 1 01-jan-2013 03-jan-2013

102 b 2 04-jan-2013 06-jan-2013

102 b 3 07-jan-2013 10-jan-2013

;

run;

data have2/view=have2;

set have;

by ID name logline; /*the input has to be sorted by this*/

format previous_endtc ddmmyys10.;

retain previous_endtc;

keep ID name logline;

if first.name then call missing(previous_endtc);

else if previous_endtc + 1 ~= stdtc then do;

    logline = logline - 1;

    output;

    logline = logline + 1;

    output;

end;

previous_endtc = endtc;

run;

data want;

merge

have (in=h1)

have2 (in=h2)

;

by ID name logline;

if h2 then output;

run;

Frequent Contributor
Posts: 106

Re: Need help on comparing end dates with start date in second line

Posted in reply to yaswanthj

Hi Yaswanth,

i don't see any difference in your input and output(correct me if i am wrong Smiley Happy )

can you explain bit more? what are we trying to achieve here..

Ask a Question
Discussion stats
  • 5 replies
  • 315 views
  • 0 likes
  • 4 in conversation