Contributor
Posts: 39

# How to calculate the difference between two timestamps of different obs

Hello, I have the following data set:

ID

StatusTimestamp

1122idle10MAY2013
1122active15MAY2013
1122idle25MAY2013
1122closed01JUN2013

Every time the status of an ID changes, a new record for that ID is being created and a timestamp of when the change occured is added to the record. That is how old statuses get historicized, as can be seen in the table.

I would like to add a column that measures how long a certain status has been assigned to an ID. The outcome would look like this:

ID

StatusTimestampDuration in days
1122idle10MAY20135
1122active15MAY201310
1122idle25MAY20137
1122closed01JUN2013...

The last value (the cell that contains "...") would be created through additional logic, like

if last.id then duration=today()-timestamp

or something like that.

Posts: 5,523

## Re: How to calculate the difference between two timestamps of different obs

You can do it like this:

data want;

set have; by id;

lastId = last.id;

if not endNext then

set have(firstobs=2 keep=timeStamp rename=(timeStamp=nextTimeStamp))

end=endNext;

if lastId

then duration= intck("DAY", timeStamp, today());

else duration = intck('DAY', timeStamp, nextTimeStamp);

drop lastId nextTimeStamp;

run;

PG

Message was edited by: PG : Added END= condition to prevent premature datastep ending.

PG
Posts: 3,167

## Re: How to calculate the difference between two timestamps of different obs

A proc sql way:

proc sql;

select a.*, b.timestamp-a.timestamp as duration

from have a

left join have b

on a.id=b.id

and a.timestamp < b.timestamp

group by a.id, a.status, a.timestamp

having b.timestamp=min(b.timestamp)

order by id,timestamp;

quit;

Haikuo

Super Contributor
Posts: 339

## Re: How to calculate the difference between two timestamps of different obs

If your dataset is already sorted by ID TIMESTAMP, then you can use the following:

data want;

set have;

by id;

if not last.id then do;

i=_N_+1;

set have(rename=(timestamp=nextstamp) keep=nextstamp) point=i;

duration=intck('day', timestamp, nextstamp);

end;

*drop nextstamp i;

run;

you may have to change the keep= set option to keep=timestamp. I always forget if rename applies before or after keep for set options. This effectively reads in the next TIMESTAMP into a new variable NEXTSTAMP using direct addressing of _N_+1 since your data is presorted.

edit the logic is very similar to that of PG

*edit corrected after PGs' comment. That is the kind of syntax restrictions that I always figure out debugging after my logic is established!

Vince

Posts: 5,523

## Re: How to calculate the difference between two timestamps of different obs

, that's a clever use of random access. However, expressions are not allowed as values to the POINT= option. You must use an intermediate variable to hold the expression.

PG

PG
Super Contributor
Posts: 391

## Re: How to calculate the difference between two timestamps of different obs

This involves double sorting, but if your data is not too big it might work....

data test;
length id 8 status \$10 timestamp 8;
input id status timestamp anydtdte.;
format timestamp yymmdd10.;
datalines;
1122 idle 10may2013
1122 active 15may2013
1122 idle 25may2013
1122 closed 01jun2013
;
run;

proc sort data=test;
by id descending timestamp;
run;

data new;
set test;
retain tmp;
format nextts yymmdd10.;

if _n_ = 1 then tmp = .;
nextts = tmp;

if timestamp ne . then tmp = timestamp;

duration=intck('day', timestamp, nextts);
run;

proc sort data=new;
by id timestamp;
run;

Discussion stats
• 5 replies
• 1514 views
• 0 likes
• 5 in conversation