DATA Step, Macro, Functions and more

How to calculate the difference between two timestamps of different obs

Reply
Contributor
Posts: 39

How to calculate the difference between two timestamps of different obs

Hello, I have the following data set:

ID

StatusTimestamp


1122idle10MAY2013
1122active15MAY2013
1122idle25MAY2013
1122closed01JUN2013

Every time the status of an ID changes, a new record for that ID is being created and a timestamp of when the change occured is added to the record. That is how old statuses get historicized, as can be seen in the table.

I would like to add a column that measures how long a certain status has been assigned to an ID. The outcome would look like this:

ID

StatusTimestampDuration in days
1122idle10MAY20135
1122active15MAY201310
1122idle25MAY20137
1122closed01JUN2013...

The last value (the cell that contains "...") would be created through additional logic, like

if last.id then duration=today()-timestamp

or something like that.

Your help is much appreciated!

Respected Advisor
Posts: 4,659

Re: How to calculate the difference between two timestamps of different obs

You can do it like this:

data want;

set have; by id;

lastId = last.id;

if not endNext then

     set have(firstobs=2 keep=timeStamp rename=(timeStamp=nextTimeStamp))

          end=endNext;

if lastId

     then duration= intck("DAY", timeStamp, today());    

     else duration = intck('DAY', timeStamp, nextTimeStamp);

drop lastId nextTimeStamp;

run;

PG

Message was edited by: PG : Added END= condition to prevent premature datastep ending.

PG
Respected Advisor
Posts: 3,124

Re: How to calculate the difference between two timestamps of different obs

A proc sql way:

proc sql;

  select a.*, b.timestamp-a.timestamp as duration

  from have a

left join have b

on a.id=b.id

and a.timestamp < b.timestamp

group by a.id, a.status, a.timestamp

having b.timestamp=min(b.timestamp)

  order by id,timestamp;

  quit;

Haikuo

Super Contributor
Posts: 339

Re: How to calculate the difference between two timestamps of different obs

If your dataset is already sorted by ID TIMESTAMP, then you can use the following:

data want;

     set have;

     by id;

     if not last.id then do;

          i=_N_+1;

          set have(rename=(timestamp=nextstamp) keep=nextstamp) point=i;

          duration=intck('day', timestamp, nextstamp);

     end;

     *drop nextstamp i;

run;

you may have to change the keep= set option to keep=timestamp. I always forget if rename applies before or after keep for set options. This effectively reads in the next TIMESTAMP into a new variable NEXTSTAMP using direct addressing of _N_+1 since your data is presorted.

edit the logic is very similar to that of PG

*edit corrected after PGs' comment. That is the kind of syntax restrictions that I always figure out debugging after my logic is established!

Vince

Respected Advisor
Posts: 4,659

Re: How to calculate the difference between two timestamps of different obs

, that's a clever use of random access. However, expressions are not allowed as values to the POINT= option. You must use an intermediate variable to hold the expression.

PG

PG
Frequent Contributor
Posts: 120

Re: How to calculate the difference between two timestamps of different obs

This involves double sorting, but if your data is not too big it might work....Smiley Happy

data test;
length id 8 status $10 timestamp 8;
input id status timestamp anydtdte.;
format timestamp yymmdd10.;
datalines;
1122 idle 10may2013
1122 active 15may2013
1122 idle 25may2013
1122 closed 01jun2013
;
run;

proc sort data=test;
by id descending timestamp;
run;

data new;
set test;
retain tmp;
format nextts yymmdd10.;

if _n_ = 1 then tmp = .;
nextts = tmp;

if timestamp ne . then tmp = timestamp;

duration=intck('day', timestamp, nextts);
run;

proc sort data=new;
by id timestamp;
run;

Ask a Question
Discussion stats
  • 5 replies
  • 982 views
  • 0 likes
  • 5 in conversation