BookmarkSubscribeRSS Feed
PhilfromGermany
Fluorite | Level 6

Hello, I have the following data set:

ID

StatusTimestamp


1122idle10MAY2013
1122active15MAY2013
1122idle25MAY2013
1122closed01JUN2013

Every time the status of an ID changes, a new record for that ID is being created and a timestamp of when the change occured is added to the record. That is how old statuses get historicized, as can be seen in the table.

I would like to add a column that measures how long a certain status has been assigned to an ID. The outcome would look like this:

ID

StatusTimestampDuration in days
1122idle10MAY20135
1122active15MAY201310
1122idle25MAY20137
1122closed01JUN2013...

The last value (the cell that contains "...") would be created through additional logic, like

if last.id then duration=today()-timestamp

or something like that.

Your help is much appreciated!

5 REPLIES 5
PGStats
Opal | Level 21

You can do it like this:

data want;

set have; by id;

lastId = last.id;

if not endNext then

     set have(firstobs=2 keep=timeStamp rename=(timeStamp=nextTimeStamp))

          end=endNext;

if lastId

     then duration= intck("DAY", timeStamp, today());    

     else duration = intck('DAY', timeStamp, nextTimeStamp);

drop lastId nextTimeStamp;

run;

PG

Message was edited by: PG : Added END= condition to prevent premature datastep ending.

PG
Haikuo
Onyx | Level 15

A proc sql way:

proc sql;

  select a.*, b.timestamp-a.timestamp as duration

  from have a

left join have b

on a.id=b.id

and a.timestamp < b.timestamp

group by a.id, a.status, a.timestamp

having b.timestamp=min(b.timestamp)

  order by id,timestamp;

  quit;

Haikuo

Vince28_Statcan
Quartz | Level 8

If your dataset is already sorted by ID TIMESTAMP, then you can use the following:

data want;

     set have;

     by id;

     if not last.id then do;

          i=_N_+1;

          set have(rename=(timestamp=nextstamp) keep=nextstamp) point=i;

          duration=intck('day', timestamp, nextstamp);

     end;

     *drop nextstamp i;

run;

you may have to change the keep= set option to keep=timestamp. I always forget if rename applies before or after keep for set options. This effectively reads in the next TIMESTAMP into a new variable NEXTSTAMP using direct addressing of _N_+1 since your data is presorted.

edit the logic is very similar to that of PG

*edit corrected after PGs' comment. That is the kind of syntax restrictions that I always figure out debugging after my logic is established!

Vince

PGStats
Opal | Level 21

, that's a clever use of random access. However, expressions are not allowed as values to the POINT= option. You must use an intermediate variable to hold the expression.

PG

PG
FredrikE
Rhodochrosite | Level 12

This involves double sorting, but if your data is not too big it might work....:)

data test;
length id 8 status $10 timestamp 8;
input id status timestamp anydtdte.;
format timestamp yymmdd10.;
datalines;
1122 idle 10may2013
1122 active 15may2013
1122 idle 25may2013
1122 closed 01jun2013
;
run;

proc sort data=test;
by id descending timestamp;
run;

data new;
set test;
retain tmp;
format nextts yymmdd10.;

if _n_ = 1 then tmp = .;
nextts = tmp;

if timestamp ne . then tmp = timestamp;

duration=intck('day', timestamp, nextts);
run;

proc sort data=new;
by id timestamp;
run;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 5758 views
  • 0 likes
  • 5 in conversation