I need to create a variable that gives me the "end" time of an event. Specifically, I need to create a "phase 1" variable that is based on the variables measurement 1-4 and datetime 1-4. To do this, I need to extract the datetime associated with the last measurement <= 5.
For example, my phase_1 variable for Obs #2 would be 22JAN19:13:33:00.
Here's an example of what my data look like:
obs | datetime_1 | measurement_1 | datetime_2 | measurement_2 | datetime_3 | measurement_3 | datetime_4 | measurement_4 |
1 | 17APR19:22:56:00 | 3 | 18APR19:05:05:00 | 4 | 18APR19:06:35:00 | 4 | 18APR19:07:04:00 | 5 |
2 | 22JAN19:11:40:00 | 4 | 22JAN19:13:33:00 | 5 | 22JAN19:15:29:00 | 9 | 22JAN19:16:25:00 | 10 |
3 | 03MAY15:19:37:00 | 0 | 04MAY15:04:30:00 | 3 | 04MAY15:07:22:00 | 6 | 04MAY15:10:53:00 | 8 |
4 | 18FEB20:01:14:00 | 1 | 20FEB20:03:05:00 | 1 | 20FEB20:04:17:00 | 4 | 20FEB20:11:00:00 | 7 |
Before we go too far into this do you ever have more than 4 grouped variables?
Are your measure values monotonic (always increasing, or at least not decreasing), the datetimes as well?
Do have a need for a phase_2 or phase_3? Different boundaries other than 5?
An approach that works for a single number of variables with one boundary may not be flexible enough to extend to a more complex case.
Untested but this should get you started:
data want; set have; array d (*) datetime: ; array m (*) measure: ; do i=dim(m) to 1 by -1; if m[i] le 5 then do; phase_1= d[i]; leave; end; end; run;
The arrays start at the highest indexed value assuming it corresponds to the latest datetime value. The first value less than or equal 5 pulls the corresponding date time. The LEAVE statement says to quit executing the loop the first time a value meeting the condition is found.
If none of the values are less than or equal to 5 no value is assigned to phase_1.
Before we go too far into this do you ever have more than 4 grouped variables?
Are your measure values monotonic (always increasing, or at least not decreasing), the datetimes as well?
Do have a need for a phase_2 or phase_3? Different boundaries other than 5?
An approach that works for a single number of variables with one boundary may not be flexible enough to extend to a more complex case.
Untested but this should get you started:
data want; set have; array d (*) datetime: ; array m (*) measure: ; do i=dim(m) to 1 by -1; if m[i] le 5 then do; phase_1= d[i]; leave; end; end; run;
The arrays start at the highest indexed value assuming it corresponds to the latest datetime value. The first value less than or equal 5 pulls the corresponding date time. The LEAVE statement says to quit executing the loop the first time a value meeting the condition is found.
If none of the values are less than or equal to 5 no value is assigned to phase_1.
I get a blank (.) value for phase_1 using this code -
data want;
set have;
array d (*) exam_date: ;
array m (*) exam_measure: ;
do i=dim(m) to 1 by -1;
if m[i] le 5 then do;
phase_1= d[i];
leave;
end;
end;
run;
@mjalvarez wrote:
I get a blank (.) value for phase_1 using this code -
data want;
set have;array d (*) exam_date: ;
array m (*) exam_measure: ;
do i=dim(m) to 1 by -1;
if m[i] le 5 then do;
phase_1= d[i];
leave;
end;
end;
run;
And the contents of the data set have is what? As I mentioned my code is untested because you did not provide an actual data set and you need to show which values actually create missing. I did say if none of the measure values are less than or equal to 5 then the phase variable will not be assigned.
Since your code above also uses different variable names then were implied by your example data that is something else to look at: the spelling of the variables.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.