So, I'm trying to output the data rows where given the same TimePoint, the difference Time1 between from the lowest time and the middle time as well as the middle time and the highest time should be greater than 2 minutes. For every given TimePoint, there will only be 3 Time1s.
Sample Input Data:
╔═════════╦══════════╦════════════╗
║ Subject ║ Time1 ║ TimePoint ║
╠═════════╬══════════╬════════════╣
║ 1 ║ 11:13:10 ║ s1 ║
║ 1 ║ 11:15:10 ║ s1 ║
║ 1 ║ 11:17:30 ║ s1 ║
║ 1 ║ 11:45 ║ s2 ║
║ 1 ║ ║ s2 ║
║ 1 ║ 11:50 ║ s2 ║
╚═════════╩══════════╩════════════╝
Output Data:
╔═════════╦══════════╦════════════╗
║ Subject ║ Time1 ║ TimePoint ║
╠═════════╬══════════╬════════════╣
║ 1 ║ 11:15:10 ║ s1 ║
║ 1 ║ 11:17:30 ║ s1 ║
║ 1 ║ 11:45 ║ s2 ║
║ 1 ║ 11:50 ║ s2 ║
╚═════════╩══════════╩════════════╝
Explanation:
Highest time - Middle time: 11:17:30 - 11:15:10 > 2 (should output)
Middle time - Lowest time:
11:13 - 11:11 <= 2 minutes (should not output)
Highest time - Lowest time: 11:50 - 11:45 > 4min (should output)
^this means that whatever the middle time is, at least one of the interval is > 2min, so should output
---------------------------------------------------------------------------------------
I'm open to either proc sql/SAS advices, but I am more experienced in SQL but I can't think of a way for this. This would be a lot easier if it's just two Time1s. I do know of the INTCK function to get the time difference, but I'm stuck on getting it to apply in this manner.
I did a quick google search on this, including search queries like "sas sql get middle time between three 3 times" but nothing really relevant came up.
Thanks!
EDIT: Thinking about it now, I think the best way to approach this is to compile all three times onto one single row into their own column, along with the other data (subject and timepoint). Then I can just compare by doing like TimeONE - TimeTwo and so on with the columns.
Is there a better way than this? But now I'm kind of stuck on how to get each Time onto their own column:/
╔═════════╦══════════╦══════════╦═══════════╦═══════════╗
║ Subject ║ TimeONE ║ TimeTWO ║ TimeTHREE ║ TimePoint ║
╠═════════╬══════════╬══════════╬═══════════╬═══════════╣
║ 1 ║ 11:13:10 ║ 11:15:10 ║ 11:17:30 ║ s1 ║
Use the DIF() operator in conjunction with BY group to reset your counters, but I think you need to expand your sample data. Can you include one where you have times but they would not be included, not just ones that are missing times? And do you have multiple subjects and multiple time points?
Assuming you'll have multiple subject and time points this can get you started. It's untested.
data want;
set have;
by subject timepoint time1;
time_dif = dif(time1);
if first.subject or first.timepoint then time_dif = .;
if time_dif < 120 or missing(time1) then delete;
run;
@jerrylshen wrote:
So, I'm trying to output the data rows where given the same TimePoint, the difference Time1 between from the lowest time and the middle time as well as the middle time and the highest time should be greater than 2 minutes. For every given TimePoint, there will only be 3 Time1s.
Sample Input Data:
╔═════════╦══════════╦════════════╗ ║ Subject ║ Time1 ║ TimePoint ║ ╠═════════╬══════════╬════════════╣ ║ 1 ║ 11:13:10 ║ s1 ║ ║ 1 ║ 11:15:10 ║ s1 ║ ║ 1 ║ 11:17:30 ║ s1 ║ ║ 1 ║ 11:45 ║ s2 ║ ║ 1 ║ ║ s2 ║ ║ 1 ║ 11:50 ║ s2 ║ ╚═════════╩══════════╩════════════╝
Output Data:
╔═════════╦══════════╦════════════╗ ║ Subject ║ Time1 ║ TimePoint ║ ╠═════════╬══════════╬════════════╣ ║ 1 ║ 11:15:10 ║ s1 ║ ║ 1 ║ 11:17:30 ║ s1 ║ ║ 1 ║ 11:45 ║ s2 ║ ║ 1 ║ 11:50 ║ s2 ║ ╚═════════╩══════════╩════════════╝
Explanation:
Highest time - Middle time: 11:17:30 - 11:15:10 > 2 (should output)
Middle time - Lowest time:
11:13 - 11:11 <= 2 minutes (should not output)
Highest time - Lowest time: 11:50 - 11:45 > 4min (should output)
^this means that whatever the middle time is, at least one of the interval is > 2min, so should output
---------------------------------------------------------------------------------------
I'm open to either proc sql/SAS advices, but I am more experienced in SQL but I can't think of a way for this. This would be a lot easier if it's just two Time1s. I do know of the INTCK function to get the time difference, but I'm stuck on getting it to apply in this manner.
I did a quick google search on this, including search queries like "sas sql get middle time between three 3 times" but nothing really relevant came up.
Thanks!
EDIT: Thinking about it now, I think the best way to approach this is to compile all three times onto one single row into their own column, along with the other data (subject and timepoint). Then I can just compare by doing like TimeONE - TimeTwo and so on with the columns.
Is there a better way than this? But now I'm kind of stuck on how to get each Time onto their own column:/
╔═════════╦══════════╦══════════╦═══════════╦═══════════╗ ║ Subject ║ TimeONE ║ TimeTWO ║ TimeTHREE ║ TimePoint ║ ╠═════════╬══════════╬══════════╬═══════════╬═══════════╣ ║ 1 ║ 11:13:10 ║ 11:15:10 ║ 11:17:30 ║ s1 ║
Hi, thanks for a quick reply! I'll check it out, thanks for a starting point!
There is an instance where all the times aren't blank but they don't shouldn't output - it's TimePoint s1, where the lowest and middle time isn't greater than 2 minutes.
Yeah, I would have multiple subjects and timepoints.
Hello @jerrylshen
The below will get the min,max remerge(automatic) done in each row for a by group id,timepoint. Beyond that, I haven't understand your explanation of the logic i.e which ones to keep /drop. However, you could get your SQL skills to think in terms of CASE WHEN,IFN, or Boolean in a HAVING. If you can elabroate, i can provide that too. Run a look at the output to gauge what i am saying.
data have;
input Subject Time1 :time10. TimePoint $;
format time1 time.;
cards;
1 11:13:10 s1
1 11:15:10 s1
1 11:17:30 s1
1 11:45 s2
1 . s2
1 11:50 s2
;
proc sql;
create table want as
select *,min(time1) as min, max(time1) as max
from have
group by subject,timepoint
order by monotonic();
quit;
@jerrylshen wrote:
EDIT: Thinking about it now, I think the best way to approach this is to compile all three times onto one single row into their own column, along with the other data (subject and timepoint). Then I can just compare by doing like TimeONE - TimeTwo and so on with the columns.
Is there a better way than this? But now I'm kind of stuck on how to get each Time onto their own column:/
╔═════════╦══════════╦══════════╦═══════════╦═══════════╗ ║ Subject ║ TimeONE ║ TimeTWO ║ TimeTHREE ║ TimePoint ║ ╠═════════╬══════════╬══════════╬═══════════╬═══════════╣ ║ 1 ║ 11:13:10 ║ 11:15:10 ║ 11:17:30 ║ s1 ║
Transposing data tutorials:
Long to Wide:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/
https://stats.idre.ucla.edu/sas/modules/reshaping-data-long-to-wide-using-the-data-step/
Wide to Long:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
https://stats.idre.ucla.edu/sas/modules/reshaping-data-wide-to-long-using-a-data-step/
And sometimes a double transpose is needed for extra wide data sets:
https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.