For the same subject, I want to only return the data row when Time1 > Time2. There's also a limit of 3 data rows for the same subject (ie there can not be 4 data rows of Subject 1 in the input data). The total size of the input data is unknown. Doesn't need to be efficient.
Example Data:
╔═════════╦═══════╦═══════╗
║ Subject ║ Time1 ║ Time2 ║
╠═════════╬═══════╬═══════╣
║ 1 ║ 11:13 ║ 11:12 ║
║ 1 ║ 11:15 ║ 11:12 ║
║ 1 ║ 11:15 ║ 11:14 ║
╚═════════╩═══════╩═══════╝
In this case, Time2 11:14 > Time1 11:13, so it doesn't get outputted.
Expected Output:
╔═════════╦═══════╦═══════╗
║ Subject ║ Time1 ║ Time2 ║
╠═════════╬═══════╬═══════╣
║ 1 ║ 11:13 ║ 11:12 ║
║ 1 ║ 11:15 ║ 11:12 ║
╚═════════╩═══════╩═══════╝
I've brainstormed of isolating the subject where there's an instance of Time2>Time1 and then deleting that Subject from the output, but that would also delete the good data rows. I also know that in SAS there's the .last and .next (or something like that) and I can just manually calculate that using a loop or something?
I'm open to either proc sql/SAS advices, but I am more experienced in SQL but I can't think of a way for this.
Thanks!
Try this:
data have;
input Subject (Time1 Time2) (:time5.);
format time1 time2 time5.;
datalines;
1 11:13 11:12
1 11:15 11:12
1 11:15 11:14
;
proc sql;
select *
from have
group by subject
having time2 < min(time1);
quit;
Subject Time1 Time2 ---------------------- 1 11:13 11:12 1 11:15 11:12
Are your variables SAS time valued variables (numeric with a format of HHMM, TIME or TOD of some form) or character?
If the values are SAS time values this could be as simple as:
Data want;
set have;
where time1 > time2;
run;
If they are not SAS time valued variables then likely create actual time valued variables and compare those.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
Try this:
data have;
input Subject (Time1 Time2) (:time5.);
format time1 time2 time5.;
datalines;
1 11:13 11:12
1 11:15 11:12
1 11:15 11:14
;
proc sql;
select *
from have
group by subject
having time2 < min(time1);
quit;
Subject Time1 Time2 ---------------------- 1 11:13 11:12 1 11:15 11:12
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.