Dear SAS New User,
Please can help with this problem, it trying to detect and enumerate drops in relative humidity (RH) by geographical location. The drop needs enumerating for later analysis.
Defining drop in RH follows rules:
1) to be a drop in RH it must have 3 consecutive datapoint in humidity below a threshold (less than or equal 30%).
2) The drop in humidity finishes stop if there are 3 consecutive points above threshold (31% and above).
3) Then on the next started drop (within that location), it labels drop as +1.
See data I given. Very grateful for help! I try retain and count but I never manage.
datetime | loc | RH (%) | RH drop number |
21NOV20:00:59 | a | 33 | . |
21NOV20:01:04 | a | 31 | . |
21NOV20:01:09 | a | 30 | . |
21NOV20:01:14 | a | 30 | . |
21NOV20:01:19 | a | 29 | 1 |
21NOV20:01:24 | a | 28 | 1 |
21NOV20:01:29 | a | 27 | 1 |
21NOV20:01:34 | a | 27 | 1 |
21NOV20:01:39 | a | 27 | 1 |
21NOV20:01:44 | a | 26 | 1 |
21NOV20:01:49 | a | 28 | 1 |
21NOV20:01:54 | a | 27 | 1 |
21NOV20:01:59 | a | 26 | 1 |
21NOV20:02:04 | a | 25 | 1 |
21NOV20:02:09 | a | 24 | 1 |
21NOV20:02:14 | a | 23 | 1 |
21NOV20:02:19 | a | 21 | 1 |
21NOV20:02:24 | a | 22 | 1 |
21NOV20:02:29 | a | 18 | 1 |
21NOV20:02:34 | a | 20 | 1 |
21NOV20:02:39 | a | 27 | 1 |
21NOV20:02:44 | a | 29 | 1 |
21NOV20:02:49 | a | 32 | . |
21NOV20:02:54 | a | 31 | . |
21NOV20:02:59 | a | 32 | . |
21NOV20:03:04 | a | 29 | . |
21NOV20:03:09 | a | 29 | . |
21NOV20:03:14 | a | 29 | 2 |
21NOV20:03:19 | a | 28 | 2 |
21NOV20:03:24 | a | 28 | 2 |
21NOV20:03:29 | a | 29 | 2 |
21NOV20:03:34 | a | 28 | 2 |
Please provide data as working SAS data step code (examples and instructions)
How can row 5 show (RH Drop Number) = 1 since there have not been "3 consecutive datapoint in humidity below a threshold (30%)" ???
@PaigeMiller wrote:
Please provide data as working SAS data step code (examples and instructions)
How can row 5 show (RH Drop Number) = 1 since there have not been "3 consecutive datapoint in humidity below a threshold (30%)" ???
data rh;
input datetime :datetime. loc $ RH RHDrop;
cards;
21NOV20:00:59 a 33 .
21NOV20:01:04 a 31 .
21NOV20:01:09 a 30 .
21NOV20:01:14 a 30 .
21NOV20:01:19 a 29 1
21NOV20:01:24 a 28 1
21NOV20:01:29 a 27 1
21NOV20:01:34 a 27 1
21NOV20:01:39 a 27 1
21NOV20:01:44 a 26 1
21NOV20:01:49 a 28 1
21NOV20:01:54 a 27 1
21NOV20:01:59 a 26 1
21NOV20:02:04 a 25 1
21NOV20:02:09 a 24 1
21NOV20:02:14 a 23 1
21NOV20:02:19 a 21 1
21NOV20:02:24 a 22 1
21NOV20:02:29 a 18 1
21NOV20:02:34 a 20 1
21NOV20:02:39 a 27 1
21NOV20:02:44 a 29 1
21NOV20:02:49 a 32 .
21NOV20:02:54 a 31 .
21NOV20:02:59 a 32 .
21NOV20:03:04 a 29 .
21NOV20:03:09 a 29 .
21NOV20:03:14 a 29 2
21NOV20:03:19 a 28 2
21NOV20:03:24 a 28 2
21NOV20:03:29 a 29 2
21NOV20:03:34 a 28 2
;;;;
run;
proc print;
run;
Thank you making SAS data. It is less than or equal 30%.
@linlin87 wrote:
Thank you making SAS data. It is less than or equal 30%.
You need to address @PaigeMiller question.
How can row 5 show (RH Drop Number) = 1 since there have not been "3 consecutive datapoint in humidity below a threshold (30%)" ???
You are not giving enough information for me to understand the problem.
At row 5, there are 3 consecutive datapoint. 30 (row 3), 30 (row 4) and then the current row, which is 29 (row 5).
To be a drop in RH above noise it need 3 datapoint below 30, and to be end of drop in RH it need 3 datapoint out of 30. Hope make sense now?
@linlin87 wrote:
To be a drop in RH above noise it need 3 datapoint below 30, and to be end of drop in RH it need 3 datapoint out of 30. Hope make sense now?
Let me know if I understand correctly.
data rh;
input datetime :datetime. loc $ RH RHDrop;
format datetime datetime.;
length DropFl 8;
fl = rh le 30;
if fl
then seq + 1;
else seq = .;
if seq eq 3 then TempDropFl + 1;
if not fl
then DropFl = .;
else if seq ge 3 then DropFL = TempDropFl;
*drop fl seq Temp:;
cards;
21NOV20:00:59 a 33 .
21NOV20:01:04 a 31 .
21NOV20:01:09 a 30 .
21NOV20:01:14 a 30 .
21NOV20:01:19 a 29 1
21NOV20:01:24 a 28 1
21NOV20:01:29 a 27 1
21NOV20:01:34 a 27 1
21NOV20:01:39 a 27 1
21NOV20:01:44 a 26 1
21NOV20:01:49 a 28 1
21NOV20:01:54 a 27 1
21NOV20:01:59 a 26 1
21NOV20:02:04 a 25 1
21NOV20:02:09 a 24 1
21NOV20:02:14 a 23 1
21NOV20:02:19 a 21 1
21NOV20:02:24 a 22 1
21NOV20:02:29 a 18 1
21NOV20:02:34 a 20 1
21NOV20:02:39 a 27 1
21NOV20:02:44 a 29 1
21NOV20:02:49 a 32 .
21NOV20:02:54 a 31 .
21NOV20:02:59 a 32 .
21NOV20:03:04 a 29 .
21NOV20:03:09 a 29 .
21NOV20:03:14 a 29 2
21NOV20:03:19 a 28 2
21NOV20:03:24 a 28 2
21NOV20:03:29 a 29 2
21NOV20:03:34 a 28 2
;;;;
run;
proc print;
run;
This is what it is, thank you @data_null__ you awesome!
Please you explain how it work line-by-line to me?
I ask more please. How can I also say that I need the 3 consecutive points within 15 mins (datetime has some variation).
Do I RETAIN datetime and edit your line of code as this?
if seq eq 3 and lag2(datetime) le 15 then TempDropFl + 1;
Sorry this
if seq eq 3 and (datetime-lag2(datetime)) le 15 then TempDropFl + 1;
Also this work BY loc? Starting DropFL from 1 on new loc? I try on data.
I now test this and it perform good. But one problem I really appreciate your help with @data_null__ . When you have the end of the drop, if within three values you get another fl=1 then it must continue to label the drop. See orange below and "Want". I have given this data so you can try.
I try edit code but no success so very grateful and want to see how you do.
datetime | loc | RH | temp | fl | seq | TempDropFl | DropFl | Want |
30/11/2024 10:37 | Nuneaton | 35 | . | 0 | . | 4 | . | |
30/11/2024 10:42 | Nuneaton | 37 | . | 0 | . | 4 | . | |
30/11/2024 10:47 | Nuneaton | 31 | . | 0 | . | 4 | . | |
30/11/2024 10:52 | Nuneaton | 32 | . | 0 | . | 4 | . | |
30/11/2024 10:57 | Nuneaton | 32 | . | 0 | . | 4 | . | |
30/11/2024 11:02 | Nuneaton | 28 | . | 1 | 1 | 4 | . | |
30/11/2024 11:07 | Nuneaton | 36 | . | 0 | . | 4 | . | |
30/11/2024 11:12 | Nuneaton | 33 | . | 0 | . | 4 | . | |
30/11/2024 11:18 | Nuneaton | 32 | . | 0 | . | 4 | . | |
30/11/2024 11:23 | Nuneaton | 38 | . | 0 | . | 4 | . | |
30/11/2024 11:28 | Nuneaton | 30 | . | 1 | 1 | 4 | . | |
30/11/2024 11:33 | Nuneaton | 28 | . | 1 | 2 | 4 | . | |
30/11/2024 11:38 | Nuneaton | 29 | . | 1 | 3 | 5 | 5 | 5 |
30/11/2024 11:43 | Nuneaton | 35 | . | 0 | . | 5 | . | 5 |
30/11/2024 11:48 | Nuneaton | 32 | . | 0 | . | 5 | . | 5 |
30/11/2024 11:53 | Nuneaton | 30 | . | 1 | 1 | 5 | . | 5 |
30/11/2024 11:58 | Nuneaton | 32 | . | 0 | . | 5 | . | |
30/11/2024 12:03 | Nuneaton | 33 | . | 0 | . | 5 | . | |
30/11/2024 12:08 | Nuneaton | 33 | . | 0 | . | 5 | . |
Maybe SAS is not good for doing this type of tasks? Do you recommend different platform @data_null__ for doing this difficult label of events in timeseries? I very happy to listen to recommendation. Would SAS Viya handle better?
@linlin87 sorry I was not able to offer any further input. Perhaps if you ask the question again some of the better programmers can help you figure this.
@linlin87 wrote:
Maybe SAS is not good for doing this type of tasks? Do you recommend different platform @data_null__ for doing this difficult label of events in timeseries? I very happy to listen to recommendation. Would SAS Viya handle better?
SAS Viya is essentially SAS 9.4 with many additional features and greater computational power; but this is a programming issue, not a platform issue, not a feature issue. You, or someone, needs to figure out the right method of programming to achieve the desired results, and then I'm sure SAS can do the job.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.