Hi,
I am trying to calculate the number of visits happening "two hours" prior to a given sas datetime.
My data has 60k records for 1 year worth of data.
Visit ID Triage_Time
1 Oct 1, 2017 00:38
2 Oct 1, 2017 00:50
3 Oct 1, 2017 01:50
4 Oct 1, 2017 01:52
5 Oct 1, 2017 03:50
I want my output to be
count(visit ids) if the earliest triage time is two hours prior to the current triage time
So here for the visit 4, the number of visits with triage time 2 hours earlier (earlier than 1:52) will be visit id 1,2,3. (3 visits).
If your data are sorted by triage_time, then you can:
Stated this way, it is a simple task:
data have;
input Visit_ID Triage_Time datetime.;
format Triage_Time datetime22.;
cards;
1 01Oct2017:00:38
2 01Oct2017:00:50
3 01Oct2017:01:50
4 01Oct2017:01:52
5 01Oct2017:03:50
;
data want (drop=_:);
set have ;
do while (triage_time-'02:00:00't > hist_time);
set have (keep=triage_time rename=triage_time=hist_time);
_n2+1;
end;
count=_n_-(_n2-1);
run;
The "trick" here is that "_n2+1", a "summing statement" that does nothing more than track the position of the second SET statement as it progress through the data set. Then just compare the "hnistorical" _n2 position to the current position _N_.
Edited additional note:
The "bonus" of the above is that the output dataset have the variable HIST_TIME, which is the first time-stamp within the 2-hour windows. You can use a LAG function as below to get the last time-stamp OUTSIDE the 2-hour window:
data want (drop=_:);
set have ;
do while (triage_time-'02:00:00't > hist_time);
set have (keep=triage_time rename=triage_time=hist_time);
prior_hist_time=lag(hist_time);
format prior_hist_time datetime22.;
retain prior_hist_time;
_n2+1;
end;
count=_n_-(_n2-1);
run;
2nd edit: I added the RETAIN statement.
I have tried this
proc sort data=test;
by Triage_Completed_DTS;
run;
data temp2;
set test;
by Triage_Completed_DTS;
retain Triage_Completed_DTS;
if first.Triage_Completed_DTS then ts2=Triage_Completed_DTS;
hour=(intck('hour',ts2,Triage_Completed_DTS)*120);
minutes=intck('minute',ts2,Triage_Completed_DTS);
diff=abs(hour-minutes);
format TS2 datetime20.;
run;
But it gives me 0 hours for the ts2
Triage_completed_DTS is a date time field.
I tried my program using triage date (only date)
data want;
set triage;
by Triage_Completed_Date;
if first.Triage_Completed_Date then do;
n_visits = 1;
end;
else n_visits + 1;
if last.Triage_Completed_Date;
run;
This gave me number of visits on that triage date.
May be if I could modify the second program and use date time instead of date only and create another variable which gives me the date time of 2 hours prior to this date time and then I can count those visits using the second program.
Hi @jak1351
This is a good case for "joining a table with itself". An inner select to join all records in left table with all records from same table as right table where the timestamp is within two hours behind the timestamp in the right table , and an outer select do to the summation and calculate count.
* Create input data;
data have;
input Visit_ID Triage_Time datetime.;
format Triage_Time datetime22.;
cards;
1 01Oct2017:00:38
2 01Oct2017:00:50
3 01Oct2017:01:50
4 01Oct2017:01:52
5 01Oct2017:03:50
;
run;
* Calculate number of visits happening two hours prior to current visit;
proc sql;
create table want as
select distinct c.Visit_ID, c.Triage_Time, count(*)-1 as visits_within_2_hours
from (
select a.Visit_ID, a.Triage_Time
from have as a left join have as b
on b.Triage_Time <= a.Triage_Time and b.Triage_Time >= a.Triage_Time-7200
) as c
group by c.Visit_ID
order by c.Visit_ID;
quit;
result:
If your data are sorted by triage_time, then you can:
Stated this way, it is a simple task:
data have;
input Visit_ID Triage_Time datetime.;
format Triage_Time datetime22.;
cards;
1 01Oct2017:00:38
2 01Oct2017:00:50
3 01Oct2017:01:50
4 01Oct2017:01:52
5 01Oct2017:03:50
;
data want (drop=_:);
set have ;
do while (triage_time-'02:00:00't > hist_time);
set have (keep=triage_time rename=triage_time=hist_time);
_n2+1;
end;
count=_n_-(_n2-1);
run;
The "trick" here is that "_n2+1", a "summing statement" that does nothing more than track the position of the second SET statement as it progress through the data set. Then just compare the "hnistorical" _n2 position to the current position _N_.
Edited additional note:
The "bonus" of the above is that the output dataset have the variable HIST_TIME, which is the first time-stamp within the 2-hour windows. You can use a LAG function as below to get the last time-stamp OUTSIDE the 2-hour window:
data want (drop=_:);
set have ;
do while (triage_time-'02:00:00't > hist_time);
set have (keep=triage_time rename=triage_time=hist_time);
prior_hist_time=lag(hist_time);
format prior_hist_time datetime22.;
retain prior_hist_time;
_n2+1;
end;
count=_n_-(_n2-1);
run;
2nd edit: I added the RETAIN statement.
Thanks so much. It worked!
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.