Help using Base SAS procedures

Calculate difference between two dates in the same column conditional to other columns

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

Calculate difference between two dates in the same column conditional to other columns

I have stacked user actions in a table and would like to calculate the number of seconds between each action for each user within that day. The issue is multiple users can access at once, so their actions are stacked, and I would like to only calculate that difference if the actions occurred on the same day. 

 

example:

data have:

 

idTimestampUserdate
13902672/19/2018 20:01:33Rick2/19/2018
13902732/19/2018 20:02:36Mike2/19/2018
13902762/19/2018 20:03:15Rick2/19/2018
13907632/19/2018 22:33:49Fred2/19/2018
13909792/19/2018 22:54:18Suzi2/19/2018
13911192/19/2018 23:12:39Fred2/19/2018
13911242/19/2018 23:13:21Fred2/19/2018
13922792/20/2018 22:54:00Suzi2/20/2018
13913392/20/2018 23:12:00Fred2/20/2018

 

 

data want:

 

idTimestampUserdateDif
13902672/19/2018 20:01:33Rick2/19/20180
13902732/19/2018 20:02:36Mike2/19/20180
13902762/19/2018 20:03:15Rick2/19/201887
13907632/19/2018 22:33:49Fred2/19/20180
13909792/19/2018 22:54:18Suzi2/19/20180
13911192/19/2018 23:12:39Fred2/19/20182171
13911242/19/2018 23:13:21Fred2/19/201842
13922792/20/2018 22:54:00Suzi2/20/20180
13913392/20/2018 23:12:00Fred2/20/20180

Accepted Solutions
Solution
‎03-20-2018 02:49 PM
Valued Guide
Posts: 590

Re: Calculate difference between two dates in the same column conditional to other columns

Here is another alternate way by using LAG and INTCK functions

proc sort data=have1;
by user TimeStamp;
run;

data want(drop=LAG);
format LAG DATETIME24.;
set have1;
LAG=LAG(TimeStamp);
by user;
if first.user then dif=0;
else dif=intCK("Second",LAG,TimeStamp);
run;
Thanks,
Suryakiran

View solution in original post


All Replies
Super User
Posts: 13,523

Re: Calculate difference between two dates in the same column conditional to other columns

Please indicate which rows are being compared to get the values of 87 and 2171. The only one I can duplicate is the 42.

 

The following possible solution assumes that the timestamp variable is an actual SAS datetime valued variable.

proc sort data=have;
   by user timestamp;
run;

data want;
   set have;
   by user timestamp;
   difstamp = dif(timestamp);
   if first.user then do;
      dif=0;
   end;
   else dif=difstamp;
   drop difstamp;
run;

proc sort data=want;
   by timestamp;
run;
Solution
‎03-20-2018 02:49 PM
Valued Guide
Posts: 590

Re: Calculate difference between two dates in the same column conditional to other columns

Here is another alternate way by using LAG and INTCK functions

proc sort data=have1;
by user TimeStamp;
run;

data want(drop=LAG);
format LAG DATETIME24.;
set have1;
LAG=LAG(TimeStamp);
by user;
if first.user then dif=0;
else dif=intCK("Second",LAG,TimeStamp);
run;
Thanks,
Suryakiran
New Contributor
Posts: 3

Re: Calculate difference between two dates in the same column conditional to other columns

Posted in reply to SuryaKiran

thanks, though this doesn't reset at day I used it and used on there:

 

 

data want;
format LAG DATETIME24.;
set have;
LAG=LAG(timestamp);
by user date;
if first.date then dif=0;
else dif=intCK("Second",LAG,timestamp);
run;
Super User
Posts: 6,762

Re: Calculate difference between two dates in the same column conditional to other columns

[ Edited ]

I think the proposed solutions have the right idea, but fail to account for changes in the day.  A slightly improved version:

 

proc sort data=have;

   by user date timestamp;

run;

 

data want;

   set have;

   by user date;

   dif = dif(timestamp);

   if first.date them dif = 0;

run;

 

If desired, you can always re-sort the data set by TIMESTAMP at that point.

Trusted Advisor
Posts: 1,337

Re: Calculate difference between two dates in the same column conditional to other columns

Your data appear to already by ordered by timestamp.  While the "proc sort; by user timestamp;" solution will work, it will be an expensive solution if the data set is big, especially if you want to re-sort to original order.  That a lot of disk input/output activity and disk storage.

 

To avoid that inefficiency, you can use a hash object, keyed on USER, holding the most recent timestamp and most recent date (in variables last_ts and last_dt).  Then you can retrieve from the hash object, compare DATE to LAST_DT and, if appropriate subtract TIMESTAMP minus LAST_TS:

 

data have;
  input id dummy :mmddyy10. Timestamp :time8.0 User :$4. date :mmddyy10.;
  timestamp=dummy*24*60*60+timestamp;
  drop dummy;
  format timestamp datetime19.0 date date9.;
datalines;
1390267 2/19/2018 20:01:33 Rick 2/19/2018 
1390273 2/19/2018 20:02:36 Mike 2/19/2018 
1390276 2/19/2018 20:03:15 Rick 2/19/2018 
1390763 2/19/2018 22:33:49 Fred 2/19/2018 
1390979 2/19/2018 22:54:18 Suzi 2/19/2018 
1391119 2/19/2018 23:12:39 Fred 2/19/2018 
1391124 2/19/2018 23:13:21 Fred 2/19/2018 
1392279 2/20/2018 22:54:00 Suzi 2/20/2018 
1391339 2/20/2018 23:12:00 Fred 2/20/2018 
run;


data want;
  set have  ;
  if _n_=1 then do;
    if 0 then set have (keep=date timestamp rename=(date=last_dt timestamp=last_ts));
    declare hash h (dataset:'have (keep=user date timestamp
                               rename=(date=last_dt timestamp=last_ts)');
     h.definekey('user');
	 h.definedata('last_dt','last_ts');
	 h.definedone();
  end;
  rc=h.find();
  if rc=0 and date=last_dt then dif=timestamp-last_ts;
  else dif=0;
  h.replace(key:user,data:date,data:timestamp);
  drop last_: rc;
run;

 

The advantage is that no data sorting is required.  The disadvantage is the need to develop and understanding of the hash object, which was not intuitively obvious to me when I first came across it.

 

By the way, what if a user has 2 simultaneous records?  Then DIF is a true zero, as opposed to a zero representing "not appropriate".  In that case you might drop the "else dif=0;" statement, which will leave DIF as a missing value.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 676 views
  • 0 likes
  • 5 in conversation