Solved: Re: Filter only which fits a specific trend - Page 2

Kurt_Bremser · Posted 05-01-2020 06:38 AM

Taking your first dataset (the second has different values), this code removes rows 5, 18 and 30:

data have;
  length Value 8 Location $40 Time 8;
  infile cards dsd dlm=':' truncover;
  input Value Location Minutes Seconds;
  time = minutes * 60 + seconds;
  format time time11.2;
  drop minutes seconds;
cards;
17.000: Germany: 1:45.0
17.001: Germany:  1:45.1
17.220: Germany:  1:45.2
18.901: Germany: 1:45.3
1.091: Germany: 1:45.4
19.021: Germany: 1:45.5
16.023: Germany: 1:45.6
17.504: Germany: 1:45.7
17.600: Germany: 1:45.8
17.760: Germany: 1:45.9
19.759: Germany: 1:45.10
6.000: Germany: 1:45.11
2.454: Germany: 1:45.12
1.665: Germany: 1:45.13
0.435: Germany: 1:45.14
0.091: Germany: 1:45.15
0.654: Germany: 1:45.16
53.324: Germany: 1:45.17
0.554: Germany: 1:45.18
1.670: Germany: 1:45.19
3.050: Germany: 1:45.20
5.659: Germany: 1:45.21
17.540:Germany: 1:45.22
17.67: Germany: 1:45.23
17.546:Germany:  1:45.24
18.434:Germany:  1:45.25
16.655:Germany:  1:45.26
19.315:Germany:  1:45.27
16.323:Germany:  1:45.28
1.545:Germany:  1:45.29
19.323:Germany:  1:45.30
17.234:Germany:  1:45.31
22.215:Germany:  1:45.32
;

data want;
merge
  have
  have (firstobs=2 keep=value rename=(value=nvalue))
;
lvalue = lag(value);
if lvalue ne . and nvalue ne .
then do;
  if
    lvalue <= 6 and nvalue <= 6 and value > 6
  then del = 1;
  else if
    lvalue > 6 and nvalue > 6 and
    (value < 6 or value > (nvalue + lvalue))
  then del = 1;
end;
if not del;
drop nvalue lvalue del;
run;

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

AK100 · Posted 05-06-2020 04:26 AM

The code did exactly what it had to do, thanks for that. But I was just wondering what the code are exactly doing?

Kurt_Bremser · Posted 05-06-2020 04:51 AM

The MERGE of the dataset with itself, but with FIRSTOBS=2, creates a "look-ahead" for value. Combined with using the LAG() function, I now have the preceding and succeeding value present for every observation.

The first IF prevents the test being made for the first and last observation, as I either miss the lagged or following value. Then I do tests for both types of condition (a "spike up" when the values are low, or a "spike down" when values are high) and set a flag variable. I also test for an unnaturally high value within the second condition.

After the tests, I use the flag variable to prevent the output of the current observation.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

AK100 · Posted 05-06-2020 05:09 AM

Okay, okay that makes sense already. Sound almost clear. Just something about the IF statements. Why do you use a '6' in the if statements?

And, if i'm right. With the last piece of code you are actually saying "If there is a DEL = 1, then drop that whole row"

Something else. I did something as an exercise for myself. I have made negative numbers of the rows 5, 18 and 30. When I execute the code now, it only drops row 5 and 30. The same applies for when I make them all 3 blank fields, it only drops 5 and 30.

Kurt_Bremser · Posted 05-06-2020 05:21 AM

Regarding your first question, I refer to your earlier post:

"No All lows must be below 6 for Germany as wel as the other locations."

So I chose 6 as an arbitrary cutoff value between a "low" and "high" segment.

Second question: this is a Subsetting IF, and should read "keep the observation if the del flag is not set to true"

Third question: carefully study the IF that checks the "high" segment; in it you find the condition that checks for an unnaturally high value (larger than the sum of preceding + succeeding). You can expand the IF that checks the "low" segment to check for an unnaturally low value (e. g. <= 0) in a similar fashion.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Re: Filter only which fits a specific trend

Re: Filter only which fits a specific trend

Re: Filter only which fits a specific trend

Re: Filter only which fits a specific trend

Re: Filter only which fits a specific trend

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away