Hi,
Here's the problem I am trying to solve:
Suppose you have 3 variables, Time, Upper, and Lower. Time is the moment of a reading, Upper is whether that reading is above (1) or below (0) the upper limit, and Lower is whether that reading is above (1) or below (0) the lower limit. Using those three variables, I am trying to compute a Target binary variable with the following logic:
If lag(Target) = 0 then Target = Upper else Target = Lower
As an example, think about tank that is stores water and needs to discharge if the level gets too high. So if the level gets above an upper limit, a pump is turned on until the level falls below the lower limit. There is no way to predict how long the tank is above the lower limit and no way to determine how long it takes to discharge water below the lower limit. To me, this is row by row processing using the previous state to determine limit (upper or lower) should be used to determine whether the pump is turned on or off.
I have attached an example data set to illustrate the variables and Target outcome I am looking to compute.
Thanking you in advance,
rg
There are too many scenarios you need consider about. data have; infile cards truncover expandtabs; input Time Upper Lower Target; cards; 1 0 0 0 2 0 1 0 3 0 1 0 4 0 1 0 5 0 1 0 6 0 1 0 7 0 1 0 8 0 1 0 9 0 1 0 10 1 1 1 11 1 1 1 12 1 1 1 13 0 1 1 14 0 1 1 15 0 1 1 16 0 1 1 17 0 0 0 18 0 0 0 19 0 0 0 20 0 0 0 21 0 1 0 22 0 1 0 23 0 1 0 24 1 1 1 25 1 1 1 26 1 1 1 27 1 1 1 28 1 1 1 29 0 1 1 30 0 1 1 31 0 0 0 32 0 0 0 33 0 0 0 34 0 0 0 35 0 0 0 ; run; data want; set have; retain _target 0; if upper=1 then _target=1; if lower=0 then _target=0; run;
Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.
This link: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... has instructions on how to create program text for a datastep to create the data that can be posted here. Then interested readers can test behavior with your data.
I may think somewhat differently but if your Upper value indicates out of expected range (above the upper limit) why doesn't the Lower value have 1 when it is out of range (below the lower limit)?
Thanks for the link on how to format questions for the Community environment. I will review and apply to future posts.
That said, the lower limit is 0 when it is out of range and the values could be reversed-coded if you prefer. The problem is that while the upper limit is an explicit start of the event (turning on the pump), there is that curious state where the level is above the lower and below the upper limit. In the curious state, if the pump is NOT on, then leave it OFF (not a critical level) and if the pump is ON, the leave it ON (to expected level of the lower limit). The two-step limit approach may help minimize costs or risks associated with turning on the pump and associated systems based upon a single limit.
The problem as described is a general class of event detections and the example is simply an applied story to the data. Nonetheless, the Target variable value is determined by the conditional evaluation of the previous state (lag(Target)) in order to use the upper or lower limit values to update the current state.
I am new to SAS and looking for help on how to program this Target variable; either in the datastep using retain statements or in ds2 using threading (or other methods unknown at this time).
Does this make sense?
Please look closely at the documentation for LAG. There can be quite a bit of complication when lag gets involved with IF.
In many cases something like:
LagTarget=lag(target);
If LagTarget = 0 then Target = Upper;
else Target = Lower;
often addresses most issues.
What is your rule for the very first record? The lagged value will be missing.
There are too many scenarios you need consider about. data have; infile cards truncover expandtabs; input Time Upper Lower Target; cards; 1 0 0 0 2 0 1 0 3 0 1 0 4 0 1 0 5 0 1 0 6 0 1 0 7 0 1 0 8 0 1 0 9 0 1 0 10 1 1 1 11 1 1 1 12 1 1 1 13 0 1 1 14 0 1 1 15 0 1 1 16 0 1 1 17 0 0 0 18 0 0 0 19 0 0 0 20 0 0 0 21 0 1 0 22 0 1 0 23 0 1 0 24 1 1 1 25 1 1 1 26 1 1 1 27 1 1 1 28 1 1 1 29 0 1 1 30 0 1 1 31 0 0 0 32 0 0 0 33 0 0 0 34 0 0 0 35 0 0 0 ; run; data want; set have; retain _target 0; if upper=1 then _target=1; if lower=0 then _target=0; run;
Thank you!
The code worked for this simulated data as well as the more complex events I am modeling.
That said, I am not one to ignore warnings :), what are your thinking about regarding the "many scenarios you need to consider about"?
That depends on what you want. E.X. if Upper=1 Lower=0, what you gonna do .
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.
Find more tutorials on the SAS Users YouTube channel.