BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
SAS_USER_928
Obsidian | Level 7

Hi,

I need some help with writing a SAS code to handle the following scenario. 

 

I have a data set that has multiple sites; each site has multiple patients; and each patient can have data of up to 7 days (one row represents one day). The data set has form start and save time. Data set is also setup to have no duplicates. First day data of each patient should always have the previous save and open time (both are user defined variables) as NULL. e.g., sr.1, 8, and 9 will have previous open and save time variables as NULL. And using these variables, I want to calculate the the following for each patient at a site:

  • number of hours a patient took between today's and yesterday's start times (i.e., from today's start to previous start) and 
  • number of hours a patient took between today's open and yesterday's start times (i.e., from today's start to previous day save)
Sr. PT SITE START_TIME SAVE_TIME
1 001 01 4/9/2024 16:44 4/9/2024 16:46
2 001 01 4/10/2024 16:39 4/10/2024 16:40
3 001 01 4/11/2024 16:07 4/11/2024 16:08
4 001 01 4/12/2024 16:05 4/12/2024 16:06
5 001 01 4/13/2024 20:05 4/13/2024 20:06
6 001 01 4/14/2024 19:31 4/14/2024 19:52
7 001 01 4/15/2024 16:25 4/15/2024 16:25
8 002 01 4/15/2024 16:19 4/15/2024 16:20
9 003 01 4/15/2024 16:21 4/15/2024 16:22
10 002 02 4/3/2024 18:34 4/3/2024 18:41
11 002 02 4/4/2024 18:03 4/4/2024 18:09
12 002 02 4/6/2024 21:11 4/6/2024 21:14
13 002 02 4/6/2024 21:14 4/6/2024 21:16
14 002 02 4/7/2024 18:17 4/7/2024 18:20
15 002 02 4/8/2024 18:12 4/8/2024 18:14
16 002 02 4/9/2024 18:06 4/9/2024 18:08
17 004 02 4/5/2024 18:03 4/5/2024 18:10
18 004 02 4/6/2024 16:27 4/6/2024 16:32
19 004 02 4/7/2024 19:18 4/7/2024 19:22
20 004 02 4/8/2024 18:01 4/8/2024 18:07
21 004 02 4/9/2024 18:06 4/9/2024 18:16
22 004 02 4/10/2024 18:08 4/10/2024 18:10
23 004 02 4/11/2024 19:22 4/11/2024 19:26
24 005 02 4/8/2024 18:05 4/8/2024 18:07
25 005 02 4/9/2024 19:32 4/9/2024 19:34
26 005 02 4/10/2024 18:45 4/10/2024 18:47
27 005 02 4/11/2024 18:02 4/11/2024 18:04
28 005 02 4/12/2024 18:04 4/12/2024 18:07
29 005 02 4/13/2024 18:01 4/13/2024 18:03
30 005 02 4/14/2024 18:06 4/14/2024 18:08

 

Thanks so much in advance for all the help.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Just divide!

data TIME_GAP;
  set FORM;
  by site pt start_time;
  hours1=dif(start_time);
  hours2=-sum(-start_time,lag(save_time));
  if first.pt then call missing(of hours1 hours2);
  else do;
    hours1=hours1/'01:00:00't ;
    hours2=hours2/'01:00:00't;
  end;
run;

View solution in original post

8 REPLIES 8
Astounding
PROC Star

One key issue to resolve is this.  What is in your TIME variables?  Are they character strings, or are they numeric measurements of time (number of seconds)?

SAS_USER_928
Obsidian | Level 7

Hi,

 

My data set is in a sas7bdat format and the two date variables are in DATETIME20. The entire data set is already pre-processed (cleaned and sorted by Site, PT, and Start Time).

 

The only two things I want to do are:

 

1. to calculate the difference in hours between the current start and previous save time for the same patient. and

2. to calculate the difference in hours between the current save and previous save time for the same patient.

 

I already have the following code but it is not populating some values correctly. 

 

DATA TIME_GAP;
SET FORM;
BY SITE PT START_TIME;
/* Initialize below variables */
IF FIRST.SITE OR FIRST.PT THEN DO;
    PREV_START_TIME = .;
    PREV_SAVE_TIME = .; END;
    
ELSE DO;
    PREV_PT = LAG(PT);
    PREV_SAVE_TIME = LAG(SAVE_TIME);
    FORMAT PREV_SAVE_TIME DATETIME20.;
    PREV_START_TIME = LAG(START_TIME);
    FORMAT PREV_START_TIME DATETIME20.; 
 
    IF NOT MISSING(PREV_SAVE_TIME) AND PT = PREV_PT
    THEN START_SAVE_DIFF = ROUND(((START_TIME - PREV_SAVE_TIME) / 3600), 0.01);
     ELSE START_SAVE_DIFF = "";
     IF NOT MISSING(PREV_START_TIME) AND PT = PREV_PT 
     THEN START_START_DIFF = ROUND(((START_TIME - PREV_START_TIME) / 3600), 0.01);
      ELSE START_START_DIFF = ""; END;
RUN;

 

Thanks

Tom
Super User Tom
Super User

Seems simple enough.

First let's convert your listing into a dataset.  Your listing is using a tricky style for the DATETIME values.  We can use the ANYDTDTM informat to read those, but let's make sure to tell it that the MONTH is the first number not the DAY of the month.

options datestyle=mdy;
data have;
  infile cards truncover dsd dlm='|';
  input row PT $ SITE $ (START_TIME SAVE_TIME) ( :anydtdtm. );
  format START_TIME SAVE_TIME datetime19.;
cards;
1|001|01|4/9/2024 16:44|4/9/2024 16:46
2|001|01|4/10/2024 16:39|4/10/2024 16:40
3|001|01|4/11/2024 16:07|4/11/2024 16:08
4|001|01|4/12/2024 16:05|4/12/2024 16:06
5|001|01|4/13/2024 20:05|4/13/2024 20:06
6|001|01|4/14/2024 19:31|4/14/2024 19:52
7|001|01|4/15/2024 16:25|4/15/2024 16:25
8|002|01|4/15/2024 16:19|4/15/2024 16:20
9|003|01|4/15/2024 16:21|4/15/2024 16:22
10|002|02|4/3/2024 18:34|4/3/2024 18:41
11|002|02|4/4/2024 18:03|4/4/2024 18:09
12|002|02|4/6/2024 21:11|4/6/2024 21:14
13|002|02|4/6/2024 21:14|4/6/2024 21:16
14|002|02|4/7/2024 18:17|4/7/2024 18:20
15|002|02|4/8/2024 18:12|4/8/2024 18:14
16|002|02|4/9/2024 18:06|4/9/2024 18:08
17|004|02|4/5/2024 18:03|4/5/2024 18:10
18|004|02|4/6/2024 16:27|4/6/2024 16:32
19|004|02|4/7/2024 19:18|4/7/2024 19:22
20|004|02|4/8/2024 18:01|4/8/2024 18:07
21|004|02|4/9/2024 18:06|4/9/2024 18:16
22|004|02|4/10/2024 18:08|4/10/2024 18:10
23|004|02|4/11/2024 19:22|4/11/2024 19:26
24|005|02|4/8/2024 18:05|4/8/2024 18:07
25|005|02|4/9/2024 19:32|4/9/2024 19:34
26|005|02|4/10/2024 18:45|4/10/2024 18:47
27|005|02|4/11/2024 18:02|4/11/2024 18:04
28|005|02|4/12/2024 18:04|4/12/2024 18:07
29|005|02|4/13/2024 18:01|4/13/2024 18:03
30|005|02|4/14/2024 18:06|4/14/2024 18:08
;

To get the difference between the current and the previous value just use the DIF() function.  To get the difference between the current START and previous SAVE we will need to use the LAG() function.  We can use BY group processing to know when to ignore the calculated values since the previous observation was for someone else.  Note the extra minus in the second calculation eliminates the note about multiplying -1 times a missing value for the first observation since there is not previous value of SAVE_TIME for LAG() to return.

Let's just leave the differences in seconds.  If you want them in hours just divide by the number of seconds in an hour. (Hint you can use '01:00:00't if you don't want to bother to figure out how many seconds that is.)

data want;
  set have;
  by site pt start_time;
  time1=dif(start_time);
  time2=-sum(-start_time,lag(save_time));
  if first.pt then call missing(of time1 time2);
  format time: time12.;
run;

Results

row PT  SITE          START_TIME           SAVE_TIME        time1        time2

  1 001  01   09APR2024:16:44:00  09APR2024:16:46:00            .            .
  2 001  01   10APR2024:16:39:00  10APR2024:16:40:00     23:55:00     23:53:00
  3 001  01   11APR2024:16:07:00  11APR2024:16:08:00     23:28:00     23:27:00
  4 001  01   12APR2024:16:05:00  12APR2024:16:06:00     23:58:00     23:57:00
  5 001  01   13APR2024:20:05:00  13APR2024:20:06:00     28:00:00     27:59:00
  6 001  01   14APR2024:19:31:00  14APR2024:19:52:00     23:26:00     23:25:00
  7 001  01   15APR2024:16:25:00  15APR2024:16:25:00     20:54:00     20:33:00
  8 002  01   15APR2024:16:19:00  15APR2024:16:20:00            .            .
  9 003  01   15APR2024:16:21:00  15APR2024:16:22:00            .            .
 10 002  02   03APR2024:18:34:00  03APR2024:18:41:00            .            .
 11 002  02   04APR2024:18:03:00  04APR2024:18:09:00     23:29:00     23:22:00
 12 002  02   06APR2024:21:11:00  06APR2024:21:14:00     51:08:00     51:02:00
 13 002  02   06APR2024:21:14:00  06APR2024:21:16:00      0:03:00      0:00:00
 14 002  02   07APR2024:18:17:00  07APR2024:18:20:00     21:03:00     21:01:00
 15 002  02   08APR2024:18:12:00  08APR2024:18:14:00     23:55:00     23:52:00
 16 002  02   09APR2024:18:06:00  09APR2024:18:08:00     23:54:00     23:52:00
 17 004  02   05APR2024:18:03:00  05APR2024:18:10:00            .            .
 18 004  02   06APR2024:16:27:00  06APR2024:16:32:00     22:24:00     22:17:00
 19 004  02   07APR2024:19:18:00  07APR2024:19:22:00     26:51:00     26:46:00
 20 004  02   08APR2024:18:01:00  08APR2024:18:07:00     22:43:00     22:39:00
 21 004  02   09APR2024:18:06:00  09APR2024:18:16:00     24:05:00     23:59:00
 22 004  02   10APR2024:18:08:00  10APR2024:18:10:00     24:02:00     23:52:00
 23 004  02   11APR2024:19:22:00  11APR2024:19:26:00     25:14:00     25:12:00
 24 005  02   08APR2024:18:05:00  08APR2024:18:07:00            .            .
 25 005  02   09APR2024:19:32:00  09APR2024:19:34:00     25:27:00     25:25:00
 26 005  02   10APR2024:18:45:00  10APR2024:18:47:00     23:13:00     23:11:00
 27 005  02   11APR2024:18:02:00  11APR2024:18:04:00     23:17:00     23:15:00
 28 005  02   12APR2024:18:04:00  12APR2024:18:07:00     24:02:00     24:00:00
 29 005  02   13APR2024:18:01:00  13APR2024:18:03:00     23:57:00     23:54:00
 30 005  02   14APR2024:18:06:00  14APR2024:18:08:00     24:05:00     24:03:00

 

SAS_USER_928
Obsidian | Level 7

Thanks, Tom.

The code seems to work fine but a few things to add or needs modification for me to use it. You have used datalines but my data set is in sas7bdat format. Also the two data variables are in DATETIME20. format. The difference in your code below is in hours:minutes:seconds format. I need it numeric such as 25 or 30 so that I can use it further to calculate average of hours per site.

Thank so much in advance for your time and help.

Tom
Super User Tom
Super User

I only include the first data step so I had your dataset.  You can just start with the second step.

 

DATETIME and TIME values are in seconds.  You can easily convert them into hours using basic arithmetic. The only question is what to do with the minutes and seconds.  Do you want the HOURS to include the fraction of an hour?  Like 23.5 for '23:30:00't ?

If not do you want to eliminate the fraction of an hour?

hours=int(time/'01:00:00't);

Do you want to round to the closest hour?

hours=round(time/'01:00:00't);

Do you want to round UP to the next hour?

hours=ceil(time/'01:00:00't);
SAS_USER_928
Obsidian | Level 7
Yes, I like the HOURS to include fraction of an hour, such as 23.5 for 23 hours 30 seconds. So which line in the above code should be modified to address this? Thanks
Tom
Super User Tom
Super User

Just divide!

data TIME_GAP;
  set FORM;
  by site pt start_time;
  hours1=dif(start_time);
  hours2=-sum(-start_time,lag(save_time));
  if first.pt then call missing(of hours1 hours2);
  else do;
    hours1=hours1/'01:00:00't ;
    hours2=hours2/'01:00:00't;
  end;
run;
SAS_USER_928
Obsidian | Level 7
Thank you so much, Tom. Much appreciate it.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 484 views
  • 1 like
  • 3 in conversation