Hello,
I have data like below. I want to find how long, in hours, is the dose above 15 for each ID.
So the 1st answer would be: 6/7/17 9:56 PM - 6/7/17 7:39 PM (initiation time) = went over 15
Then 6/8/17 3:14 AM - 6/7/17 9:56 PM = how long they were over 15.
Find time difference again, if it happens again.
I tried using the lag function with first.ID .... but could not get correct answer.
Any help is appreciated and thanks in advance!
ID | TIME | DOSE |
101 | 6/7/17 7:39 PM | 14 |
101 | 6/7/17 8:48 PM | 15 |
101 | 6/7/17 9:56 PM | 16 |
101 | 6/8/17 12:15 AM | 18 |
101 | 6/8/17 1:12 AM | 20 |
101 | 6/8/17 3:14 AM | 14 |
101 | 6/8/17 6:42 AM | 13 |
101 | 6/8/17 7:38 AM | 6 |
101 | 6/8/17 8:06 AM | 7 |
101 | 6/8/17 9:03 AM | 8 |
102 | 6/8/17 10:18 AM | 12 |
102 | 6/8/17 11:02 AM | 10 |
102 | 6/8/17 12:55 PM | 11 |
102 | 6/8/17 3:17 PM | 20 |
102 | 6/8/17 4:07 PM | 21 |
102 | 6/8/17 5:48 PM | 14 |
102 | 6/8/17 7:21 PM | 14 |
102 | 6/8/17 8:05 PM | 13 |
Can you show what you expect the output to look like for that example data? It isn't clear if you are looking ot add a single variable to each observation or possibly collapse the data to intervals.
Something like this, I entered this, as an example, don't need the last column:
ID | TIME | DOSE | Diff | |
101 | 6/7/17 7:39 PM | 14 | . | |
101 | 6/7/17 8:48 PM | 15 | . | |
101 | 6/7/17 9:56 PM | 16 | 2:17 | = 6/7/2017 9:56:00 PM - 6/7/2017 7:39:00 PM |
101 | 6/8/17 12:15 AM | 18 | . | |
101 | 6/8/17 1:12 AM | 20 | . | |
101 | 6/8/17 3:14 AM | 14 | 5:18 | = 6/8/2017 3:14:00 AM - 6/7/2017 9:56:00 PM |
101 | 6/8/17 6:42 AM | 13 | . | |
101 | 6/8/17 7:38 AM | 6 | . | |
101 | 6/8/17 8:06 AM | 7 | . | |
101 | 6/8/17 9:03 AM | 8 | ||
102 | 6/8/17 10:18 AM | 12 | . | |
102 | 6/8/17 11:02 AM | 10 | . | |
102 | 6/8/17 12:55 PM | 11 | . | |
102 | 6/8/17 3:17 PM | 20 | 4:59 | |
102 | 6/8/17 4:07 PM | 21 | ||
102 | 6/8/17 5:48 PM | 14 | 2:31 | |
102 | 6/8/17 7:21 PM | 14 | ||
102 | 6/8/17 8:05 PM | 13 |
I would really like to know HOW that data is collected and what the triggering event is to have a specific time recorded. Your example data is not on a regular interval so it appears that the time represents some action. If the "action" is not changing the dose then you do not know any value for any specific interval, only a point in time.
If the "action" is a detectable change in dose then your calculations make no sense.
At best I would say that your "interval above 15" descriptions are more like "longest possible interval where the dose might be > 14".
Using the lag function should be the way to go - one common mistake is to put lag() inside conditional code. That won't work - if you post the code you've tried it might be a simple fix.
I don't have any code that works. These are some tries:
data dose3;
set dose2;
*retain hsp_account_id taken_time dose; (not sure if I should have this or not)
by id notsorted;
if first.id then do;
time1=time;
format time1 datetime16.;
output;
end;
if not first.id then do;
if dose > 15 then time1=time;
format time1 datetime16.;
output;
end;
run;
data dose4;
set dose2;
by id notsorted;
if not first.id then do;
if dose > 15 then LAG_time = lag(time) ;
hours=hour(LAG_time);
*format lag_time time.;
end;
run ;
/** a different approach**/
data dose6;
set dose2;
t1=dose-15;
if dose=15 then t2=1 ; else t2=sign(t1);
run;
Then I wanted to do for column t2, ABS(previous row - current row) ...and whereever I get a 2, find the time difference. I couldn't do it...
It would be great if you can offer the sample data in data step in the future. Not knowing your wanted outcome, I would do this in two steps, first step is to flag the eligible period, after this, there could be more than one approaches:
data have;
input ID TIME MDYAMPM15. DOSE;
format time datetime25.;
cards;
101 6/7/17 7:39 PM 14
101 6/7/17 8:48 PM 15
101 6/7/17 9:56 PM 16
101 6/8/17 12:15 AM 18
101 6/8/17 1:12 AM 20
101 6/8/17 3:14 AM 14
101 6/8/17 6:42 AM 13
101 6/8/17 7:38 AM 6
101 6/8/17 8:06 AM 7
101 6/8/17 9:03 AM 8
102 6/8/17 10:18 AM 12
102 6/8/17 11:02 AM 10
102 6/8/17 12:55 PM 11
102 6/8/17 3:17 PM 20
102 6/8/17 4:07 PM 21
102 6/8/17 5:48 PM 14
102 6/8/17 7:21 PM 14
102 6/8/17 8:05 PM 13
;
run;
data h1;
set have;
by id;
if dose>15 or (lag(dose)>15 and not first.id) then flag=1;else flag=0;
run;
data want;
do _n_=1 by 1 until (last.flag);
set h1;
by id flag notsorted ;
if _n_=1 and flag then _beg=time;
end;
time_abve_15=intck('hour', _beg,time);
do _n_=1 by 1 until (last.flag);
set h1;
by id flag notsorted ;
output;
end;
drop _beg;
run;
Thank you, Haikuo! The flagging was perfect.
However, for the 1st value of time_abve_15, it should be 07JUN2017:21:56:00 - 07JUN2017:19:39:00
For how long was it above 15, the value of time_abve_15 should be: 08JUN2017:03:14:00 - 07JUN2017:21:56:00
Sorry, if I didn't expalin it properly, but I'm still not able to get for how long is the value of dose > 15.
ID | TIME | DOSE | flag | time_abve_15 |
101 | 07JUN2017:19:39:00 | 14 | 0 | . |
101 | 07JUN2017:20:48:00 | 15 | 0 | . |
101 | 07JUN2017:21:56:00 | 16 | 1 | 6 |
101 | 08JUN2017:00:15:00 | 18 | 1 | 6 |
101 | 08JUN2017:01:12:00 | 20 | 1 | 6 |
101 | 08JUN2017:03:14:00 | 14 | 1 | 6 |
101 | 08JUN2017:06:42:00 | 13 | 0 | . |
101 | 08JUN2017:07:38:00 | 6 | 0 | . |
101 | 08JUN2017:08:06:00 | 7 | 0 | . |
101 | 08JUN2017:09:03:00 | 8 | 0 | . |
102 | 08JUN2017:10:18:00 | 12 | 0 | . |
102 | 08JUN2017:11:02:00 | 10 | 0 | . |
102 | 08JUN2017:12:55:00 | 11 | 0 | . |
102 | 08JUN2017:15:17:00 | 20 | 1 | 2 |
102 | 08JUN2017:16:07:00 | 21 | 1 | 2 |
102 | 08JUN2017:17:48:00 | 14 | 1 | 2 |
102 | 08JUN2017:19:21:00 | 14 | 0 | . |
102 | 08JUN2017:20:05:00 | 13 | 0 | . |
Now I have no clue what you need. The 6 is the length of the interval when dose is above 15 (>15), the unit is 'hour', if you need more accuracy, you can easily switch it to 'minutes'. 6 hours is between first time point when dose>15, which is 07JUN2017:21:56:00, to the last time point where dose > 15, which is 08JUN2017:03:14:00.
" it should be 07JUN2017:21:56:00 - 07JUN2017:19:39:00"? This is when the dose is not above 15, you did not mention you need that as well, but that can be easily done.
The thing is, there is a conflict between your description and your wanted outcome (this part is confusing).
Thank you for your time and solution.
Yes, what I need is a little confusing....
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.