BookmarkSubscribeRSS Feed
PhilfromGermany
Fluorite | Level 6

Hello, I have the following data set:

ID

StatusTimestamp


1122idle10MAY2013
1122active15MAY2013
1122idle25MAY2013
1122closed01JUN2013

Every time the status of an ID changes, a new record for that ID is being created and a timestamp of when the change occured is added to the record. That is how old statuses get historicized, as can be seen in the table.

I would like to add a column that measures how long a certain status has been assigned to an ID. The outcome would look like this:

ID

StatusTimestampDuration in days
1122idle10MAY20135
1122active15MAY201310
1122idle25MAY20137
1122closed01JUN2013...

The last value (the cell that contains "...") would be created through additional logic, like

if last.id then duration=today()-timestamp

or something like that.

Your help is much appreciated!

5 REPLIES 5
PGStats
Opal | Level 21

You can do it like this:

data want;

set have; by id;

lastId = last.id;

if not endNext then

     set have(firstobs=2 keep=timeStamp rename=(timeStamp=nextTimeStamp))

          end=endNext;

if lastId

     then duration= intck("DAY", timeStamp, today());    

     else duration = intck('DAY', timeStamp, nextTimeStamp);

drop lastId nextTimeStamp;

run;

PG

Message was edited by: PG : Added END= condition to prevent premature datastep ending.

PG
Haikuo
Onyx | Level 15

A proc sql way:

proc sql;

  select a.*, b.timestamp-a.timestamp as duration

  from have a

left join have b

on a.id=b.id

and a.timestamp < b.timestamp

group by a.id, a.status, a.timestamp

having b.timestamp=min(b.timestamp)

  order by id,timestamp;

  quit;

Haikuo

Vince28_Statcan
Quartz | Level 8

If your dataset is already sorted by ID TIMESTAMP, then you can use the following:

data want;

     set have;

     by id;

     if not last.id then do;

          i=_N_+1;

          set have(rename=(timestamp=nextstamp) keep=nextstamp) point=i;

          duration=intck('day', timestamp, nextstamp);

     end;

     *drop nextstamp i;

run;

you may have to change the keep= set option to keep=timestamp. I always forget if rename applies before or after keep for set options. This effectively reads in the next TIMESTAMP into a new variable NEXTSTAMP using direct addressing of _N_+1 since your data is presorted.

edit the logic is very similar to that of PG

*edit corrected after PGs' comment. That is the kind of syntax restrictions that I always figure out debugging after my logic is established!

Vince

PGStats
Opal | Level 21

, that's a clever use of random access. However, expressions are not allowed as values to the POINT= option. You must use an intermediate variable to hold the expression.

PG

PG
FredrikE
Rhodochrosite | Level 12

This involves double sorting, but if your data is not too big it might work....:)

data test;
length id 8 status $10 timestamp 8;
input id status timestamp anydtdte.;
format timestamp yymmdd10.;
datalines;
1122 idle 10may2013
1122 active 15may2013
1122 idle 25may2013
1122 closed 01jun2013
;
run;

proc sort data=test;
by id descending timestamp;
run;

data new;
set test;
retain tmp;
format nextts yymmdd10.;

if _n_ = 1 then tmp = .;
nextts = tmp;

if timestamp ne . then tmp = timestamp;

duration=intck('day', timestamp, nextts);
run;

proc sort data=new;
by id timestamp;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 5914 views
  • 0 likes
  • 5 in conversation