Calcite | Level 5

## Calculations using dates on two separate data rows...

Hi,

I'm new to the forum, so I'm looking forward to not only getting some support from the community forums but also hoping to also provide some where possible.

I can't seem to figure out the following (without considering transposing the data in some way).  I'm trying to do some calculations using dates but the dates I'm working with are on two separate records for the same individual.   I suspect there is some lag function, but this is an area that I am unfamiliar with.

Here is what the data looks like:

ID     DATE1                   DATE2

1      2009-06-14       2009-08-11

1      2009-10-29

1      200912-31

2      2009-09-05       2009-11-10

2      2009-10-09

2      2010-05-23

and so on.....

The output I'm looking for is the following:

ID     DATE1                 DATE2            DAYS

1     2009-06-14       2009-08-11         79

1     2009-10-29

1     2009-12-31

2     2009-09-05       2009-11-10        194

2     2009-10-09

2     2010-05-23

and so on.....

So, for the first person (ID #1)  (LAG)DATE1 - DATE2 = 20091029 - 20090811 = 79 days.   So, no further calculations are required.

For the 2nd person (ID #2)  (LAG)DATE1 - DATE2 =  20091009 - 20091110 = -32 days.  When a negative "DAYS" is encountered, I would like to be able to check the next record for that person to see if there is a non-negative count.  So, it would loop to the 3rd record whereby (LAG)DATE1 - DATE2 = 20100523 - 20091110 = 194 days.

For illustrative purposes, I've included the actual dates, but I did convert the dates into numeric values using  NEWDATE1 = input (DATE1, yymmdd8.) to allow for the calculations.

Essentially, I would like one record per person which tells me when did the next event (DATE1) occur after DATE2.

Any help with this would be GREATLY appreciated.

M. Anthony

1 ACCEPTED SOLUTION

Accepted Solutions
Opal | Level 21

## Re: Calculations using dates on two separate data rows...

This will do it:

``````
proc sql;
create table want as
select
a.*,
case
when date2 is missing then .
else intck( "DAY", date2,
(select min(date1) from have where id=a.id and date1 > a.date2) )
end as days
from
have as a
order by id, date1;
quit;``````
PG
6 REPLIES 6
Tourmaline | Level 20

## Re: Calculations using dates on two separate data rows...

Since you need to access the information on the next row, I think you need to re-sort your data in descending order on date1.
The you can use lag () or RETAIN (with BY and first./last.-logic). Which to use is a matter of taste, and for me I find RETAIN easier to follow. There a tons of lag () and RETAIN examples on support.Sas.com.
Data never sleeps
Barite | Level 11

## Re: Calculations using dates on two separate data rows...

I agree, you can't look down and then pull up.  If you want the presentation you have I think you need to sort:

data have;
infile cards dsd;
informat d 5. date1 date2 yymmdd10.;
format d 5. date1 date2 yymmdd10.;
input D DATE1 DATE2;
cards;
1,2009-06-14,2009-08-11
1,2009-10-29,
1,2009-12-31,
2,2009-09-05,2009-11-10
2,2009-10-09,
2,2010-05-23,
;

proc sort data=have;by d descending date1;

data want;
set have;
by d descending date1;
days2 = lag3(date1);
DAYS = lag(date1)-date2;
if days < 0 and not missing(days) then days = date2-days2;
drop days2;
run;

proc sort data=want;by d date1;

Opal | Level 21

## Re: Calculations using dates on two separate data rows...

This will do it:

``````
proc sql;
create table want as
select
a.*,
case
when date2 is missing then .
else intck( "DAY", date2,
(select min(date1) from have where id=a.id and date1 > a.date2) )
end as days
from
have as a
order by id, date1;
quit;``````
PG
Lapis Lazuli | Level 10

## Re: Calculations using dates on two separate data rows...

If date1 of 3 or more later record in each ID is larger than date2, in such condition, you could try hash.

data want;
if 0 then set have(keep=id date1);
if _n_=1 then do;
declare hash h(dataset:'have(keep=id date1 rename=(date1=_date1))',multidata&colon;'y');
declare hiter hi('h');
h.definekey('id');
h.definedata('_date1');
h.definedone();
end;
set have;
by id notsorted;
if first.id then do;
rc=hi.setcur();
do while (rc=0);
if _date1>date2 then do;
temp=_date1;
goto day;
end;
rc=hi.next();
end;
day: day=temp-date2;
end;
drop temp _date1 rc;
run;

Calcite | Level 5

## Re: Calculations using dates on two separate data rows...

Hi,

Thank you for all the quick replies and for the various suggestions.  For my current purposes, the solution by PGStats seemed to be relatively straightfoward and did the trick.   The other solutions are certain to come in hand in the future, so they are definitely worth adding to my syntax list.

Thanks again for the help.