BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jogianni
Calcite | Level 5

I have stacked user actions in a table and would like to calculate the number of seconds between each action for each user within that day. The issue is multiple users can access at once, so their actions are stacked, and I would like to only calculate that difference if the actions occurred on the same day. 

 

example:

data have:

 

idTimestampUserdate
13902672/19/2018 20:01:33Rick2/19/2018
13902732/19/2018 20:02:36Mike2/19/2018
13902762/19/2018 20:03:15Rick2/19/2018
13907632/19/2018 22:33:49Fred2/19/2018
13909792/19/2018 22:54:18Suzi2/19/2018
13911192/19/2018 23:12:39Fred2/19/2018
13911242/19/2018 23:13:21Fred2/19/2018
13922792/20/2018 22:54:00Suzi2/20/2018
13913392/20/2018 23:12:00Fred2/20/2018

 

 

data want:

 

idTimestampUserdateDif
13902672/19/2018 20:01:33Rick2/19/20180
13902732/19/2018 20:02:36Mike2/19/20180
13902762/19/2018 20:03:15Rick2/19/201887
13907632/19/2018 22:33:49Fred2/19/20180
13909792/19/2018 22:54:18Suzi2/19/20180
13911192/19/2018 23:12:39Fred2/19/20182171
13911242/19/2018 23:13:21Fred2/19/201842
13922792/20/2018 22:54:00Suzi2/20/20180
13913392/20/2018 23:12:00Fred2/20/20180
1 ACCEPTED SOLUTION

Accepted Solutions
SuryaKiran
Meteorite | Level 14

Here is another alternate way by using LAG and INTCK functions

proc sort data=have1;
by user TimeStamp;
run;

data want(drop=LAG);
format LAG DATETIME24.;
set have1;
LAG=LAG(TimeStamp);
by user;
if first.user then dif=0;
else dif=intCK("Second",LAG,TimeStamp);
run;
Thanks,
Suryakiran

View solution in original post

6 REPLIES 6
ballardw
Super User

Please indicate which rows are being compared to get the values of 87 and 2171. The only one I can duplicate is the 42.

 

The following possible solution assumes that the timestamp variable is an actual SAS datetime valued variable.

proc sort data=have;
   by user timestamp;
run;

data want;
   set have;
   by user timestamp;
   difstamp = dif(timestamp);
   if first.user then do;
      dif=0;
   end;
   else dif=difstamp;
   drop difstamp;
run;

proc sort data=want;
   by timestamp;
run;
SuryaKiran
Meteorite | Level 14

Here is another alternate way by using LAG and INTCK functions

proc sort data=have1;
by user TimeStamp;
run;

data want(drop=LAG);
format LAG DATETIME24.;
set have1;
LAG=LAG(TimeStamp);
by user;
if first.user then dif=0;
else dif=intCK("Second",LAG,TimeStamp);
run;
Thanks,
Suryakiran
jogianni
Calcite | Level 5

thanks, though this doesn't reset at day I used it and used on there:

 

 

data want;
format LAG DATETIME24.;
set have;
LAG=LAG(timestamp);
by user date;
if first.date then dif=0;
else dif=intCK("Second",LAG,timestamp);
run;
xd9813
Fluorite | Level 6
For a newbie it is a bit difficult to understand LAG=LAG(TimeStamp) why not pick var names that a different from SAS function names at these forums? Add a simple _ before the name at the very least.
Astounding
PROC Star

I think the proposed solutions have the right idea, but fail to account for changes in the day.  A slightly improved version:

 

proc sort data=have;

   by user date timestamp;

run;

 

data want;

   set have;

   by user date;

   dif = dif(timestamp);

   if first.date them dif = 0;

run;

 

If desired, you can always re-sort the data set by TIMESTAMP at that point.

mkeintz
PROC Star

Your data appear to already by ordered by timestamp.  While the "proc sort; by user timestamp;" solution will work, it will be an expensive solution if the data set is big, especially if you want to re-sort to original order.  That a lot of disk input/output activity and disk storage.

 

To avoid that inefficiency, you can use a hash object, keyed on USER, holding the most recent timestamp and most recent date (in variables last_ts and last_dt).  Then you can retrieve from the hash object, compare DATE to LAST_DT and, if appropriate subtract TIMESTAMP minus LAST_TS:

 

data have;
  input id dummy :mmddyy10. Timestamp :time8.0 User :$4. date :mmddyy10.;
  timestamp=dummy*24*60*60+timestamp;
  drop dummy;
  format timestamp datetime19.0 date date9.;
datalines;
1390267 2/19/2018 20:01:33 Rick 2/19/2018 
1390273 2/19/2018 20:02:36 Mike 2/19/2018 
1390276 2/19/2018 20:03:15 Rick 2/19/2018 
1390763 2/19/2018 22:33:49 Fred 2/19/2018 
1390979 2/19/2018 22:54:18 Suzi 2/19/2018 
1391119 2/19/2018 23:12:39 Fred 2/19/2018 
1391124 2/19/2018 23:13:21 Fred 2/19/2018 
1392279 2/20/2018 22:54:00 Suzi 2/20/2018 
1391339 2/20/2018 23:12:00 Fred 2/20/2018 
run;


data want;
  set have  ;
  if _n_=1 then do;
    if 0 then set have (keep=date timestamp rename=(date=last_dt timestamp=last_ts));
    declare hash h (dataset:'have (keep=user date timestamp
                               rename=(date=last_dt timestamp=last_ts)');
     h.definekey('user');
	 h.definedata('last_dt','last_ts');
	 h.definedone();
  end;
  rc=h.find();
  if rc=0 and date=last_dt then dif=timestamp-last_ts;
  else dif=0;
  h.replace(key:user,data:date,data:timestamp);
  drop last_: rc;
run;

 

The advantage is that no data sorting is required.  The disadvantage is the need to develop and understanding of the hash object, which was not intuitively obvious to me when I first came across it.

 

By the way, what if a user has 2 simultaneous records?  Then DIF is a true zero, as opposed to a zero representing "not appropriate".  In that case you might drop the "else dif=0;" statement, which will leave DIF as a missing value.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 10460 views
  • 0 likes
  • 6 in conversation