BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jak1351
Fluorite | Level 6

Hi,

I am trying to calculate the number of visits happening  "two hours" prior to a given sas datetime.

 

My data has 60k records for 1 year worth of data.

 

Visit ID                  Triage_Time 

1                            Oct 1, 2017 00:38

2                            Oct 1, 2017 00:50

3                            Oct 1, 2017 01:50

4                            Oct 1, 2017 01:52

5                            Oct 1, 2017 03:50

 

I want my output to be

count(visit ids) if the earliest triage time is two hours prior to the current triage time

So here for the visit 4, the number of visits with triage time 2 hours earlier (earlier than 1:52) will be visit id 1,2,3. (3 visits).

    

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

If your data are sorted by triage_time, then you can:

 

  1. Read using the SET statement an observation (whose sequence in the data set is _N_).
  2. In a separate input stream (i.e. a separate SET statement), read (and COUNT) the records until the time difference between the triage_time read in step 1 and the triage_time read in step 2 (renamed to hist_time) is less than or equal to 2 hours.

 

Stated this way, it is a simple task:

 

data have;
	input Visit_ID  Triage_Time datetime.;
	format Triage_Time datetime22.;
cards;
1 01Oct2017:00:38
2 01Oct2017:00:50
3 01Oct2017:01:50
4 01Oct2017:01:52
5 01Oct2017:03:50
;

data want (drop=_:);
  set have ;
  do while (triage_time-'02:00:00't > hist_time);
    set have (keep=triage_time rename=triage_time=hist_time);
    _n2+1;
  end;
  count=_n_-(_n2-1);
run;

The "trick" here is that "_n2+1",  a "summing statement" that does nothing more than track the position of the second SET statement as it progress through the data set.   Then just compare the "hnistorical" _n2 position to the current position _N_.

 

Edited additional note:

 

The "bonus" of the above is that the output dataset have the variable HIST_TIME, which is the first time-stamp within the 2-hour windows.  You can use a LAG function as below to get the last time-stamp OUTSIDE the 2-hour window:

 

data want (drop=_:);
  set have ;
  do while (triage_time-'02:00:00't > hist_time);
    set have (keep=triage_time rename=triage_time=hist_time);
    prior_hist_time=lag(hist_time);
    format prior_hist_time datetime22.;
    retain prior_hist_time;
    _n2+1;
  end;
  count=_n_-(_n2-1);
run;

 

2nd edit:  I added the RETAIN statement. 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

8 REPLIES 8
tomrvincent
Rhodochrosite | Level 12
What have you tried so far?
jak1351
Fluorite | Level 6

I have tried this

 

proc sort data=test;
by Triage_Completed_DTS;
run;

 

data temp2;
set test;
by Triage_Completed_DTS;
retain Triage_Completed_DTS;
if first.Triage_Completed_DTS then ts2=Triage_Completed_DTS;
hour=(intck('hour',ts2,Triage_Completed_DTS)*120);
minutes=intck('minute',ts2,Triage_Completed_DTS);
diff=abs(hour-minutes);
format TS2 datetime20.;
run;

 

But it gives me 0 hours for the ts2

 

Triage_completed_DTS is a date time field.

 

I tried my program using triage date (only date)

data want;
set triage;
by Triage_Completed_Date;
if first.Triage_Completed_Date then do;
n_visits = 1;
end;
else n_visits + 1;
if last.Triage_Completed_Date;
run;

 

This gave me number of visits on that triage date. 

 

May be if I could modify the second program and use date time instead of date only and create another variable which gives me the date time of 2 hours prior to this date time and then I can count those visits using the second program.

tomrvincent
Rhodochrosite | Level 12
are you able to retrieve the hour from Triage_Completed_DTS directly?
jak1351
Fluorite | Level 6
I could but I want it to be done at the date and time interval. Not just the hour.
tomrvincent
Rhodochrosite | Level 12
I suggested that because you need to go one step at a time. Get the hour, get 2 hours earlier, then find the visits.
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @jak1351 

 

This is a good case for "joining a table with itself". An inner select to join all records in left table with all records from same table as right table where the timestamp is within two hours behind the timestamp in the right table , and an outer select do to the summation and calculate count.

 

* Create input data;
data have;
	input Visit_ID  Triage_Time datetime.;
	format Triage_Time datetime22.;
cards;
1 01Oct2017:00:38
2 01Oct2017:00:50
3 01Oct2017:01:50
4 01Oct2017:01:52
5 01Oct2017:03:50
;
run;

* Calculate number of visits happening two hours prior to current visit;
proc sql;
	create table want as
		select distinct c.Visit_ID, c.Triage_Time, count(*)-1 as visits_within_2_hours
		from (
			select a.Visit_ID, a.Triage_Time
			from have as a left join have as b 
			on b.Triage_Time <= a.Triage_Time and b.Triage_Time >= a.Triage_Time-7200
		) as c
		group by c.Visit_ID
		order by c.Visit_ID;
quit;

result:

 

 

cnt.gif

 

mkeintz
PROC Star

If your data are sorted by triage_time, then you can:

 

  1. Read using the SET statement an observation (whose sequence in the data set is _N_).
  2. In a separate input stream (i.e. a separate SET statement), read (and COUNT) the records until the time difference between the triage_time read in step 1 and the triage_time read in step 2 (renamed to hist_time) is less than or equal to 2 hours.

 

Stated this way, it is a simple task:

 

data have;
	input Visit_ID  Triage_Time datetime.;
	format Triage_Time datetime22.;
cards;
1 01Oct2017:00:38
2 01Oct2017:00:50
3 01Oct2017:01:50
4 01Oct2017:01:52
5 01Oct2017:03:50
;

data want (drop=_:);
  set have ;
  do while (triage_time-'02:00:00't > hist_time);
    set have (keep=triage_time rename=triage_time=hist_time);
    _n2+1;
  end;
  count=_n_-(_n2-1);
run;

The "trick" here is that "_n2+1",  a "summing statement" that does nothing more than track the position of the second SET statement as it progress through the data set.   Then just compare the "hnistorical" _n2 position to the current position _N_.

 

Edited additional note:

 

The "bonus" of the above is that the output dataset have the variable HIST_TIME, which is the first time-stamp within the 2-hour windows.  You can use a LAG function as below to get the last time-stamp OUTSIDE the 2-hour window:

 

data want (drop=_:);
  set have ;
  do while (triage_time-'02:00:00't > hist_time);
    set have (keep=triage_time rename=triage_time=hist_time);
    prior_hist_time=lag(hist_time);
    format prior_hist_time datetime22.;
    retain prior_hist_time;
    _n2+1;
  end;
  count=_n_-(_n2-1);
run;

 

2nd edit:  I added the RETAIN statement. 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
jak1351
Fluorite | Level 6

Thanks so much. It worked!

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 957 views
  • 1 like
  • 4 in conversation