Taking your first dataset (the second has different values), this code removes rows 5, 18 and 30:
data have;
length Value 8 Location $40 Time 8;
infile cards dsd dlm=':' truncover;
input Value Location Minutes Seconds;
time = minutes * 60 + seconds;
format time time11.2;
drop minutes seconds;
cards;
17.000: Germany: 1:45.0
17.001: Germany: 1:45.1
17.220: Germany: 1:45.2
18.901: Germany: 1:45.3
1.091: Germany: 1:45.4
19.021: Germany: 1:45.5
16.023: Germany: 1:45.6
17.504: Germany: 1:45.7
17.600: Germany: 1:45.8
17.760: Germany: 1:45.9
19.759: Germany: 1:45.10
6.000: Germany: 1:45.11
2.454: Germany: 1:45.12
1.665: Germany: 1:45.13
0.435: Germany: 1:45.14
0.091: Germany: 1:45.15
0.654: Germany: 1:45.16
53.324: Germany: 1:45.17
0.554: Germany: 1:45.18
1.670: Germany: 1:45.19
3.050: Germany: 1:45.20
5.659: Germany: 1:45.21
17.540:Germany: 1:45.22
17.67: Germany: 1:45.23
17.546:Germany: 1:45.24
18.434:Germany: 1:45.25
16.655:Germany: 1:45.26
19.315:Germany: 1:45.27
16.323:Germany: 1:45.28
1.545:Germany: 1:45.29
19.323:Germany: 1:45.30
17.234:Germany: 1:45.31
22.215:Germany: 1:45.32
;
data want;
merge
have
have (firstobs=2 keep=value rename=(value=nvalue))
;
lvalue = lag(value);
if lvalue ne . and nvalue ne .
then do;
if
lvalue <= 6 and nvalue <= 6 and value > 6
then del = 1;
else if
lvalue > 6 and nvalue > 6 and
(value < 6 or value > (nvalue + lvalue))
then del = 1;
end;
if not del;
drop nvalue lvalue del;
run;
The MERGE of the dataset with itself, but with FIRSTOBS=2, creates a "look-ahead" for value. Combined with using the LAG() function, I now have the preceding and succeeding value present for every observation.
The first IF prevents the test being made for the first and last observation, as I either miss the lagged or following value. Then I do tests for both types of condition (a "spike up" when the values are low, or a "spike down" when values are high) and set a flag variable. I also test for an unnaturally high value within the second condition.
After the tests, I use the flag variable to prevent the output of the current observation.
Okay, okay that makes sense already. Sound almost clear. Just something about the IF statements. Why do you use a '6' in the if statements?
And, if i'm right. With the last piece of code you are actually saying "If there is a DEL = 1, then drop that whole row"
Something else. I did something as an exercise for myself. I have made negative numbers of the rows 5, 18 and 30. When I execute the code now, it only drops row 5 and 30. The same applies for when I make them all 3 blank fields, it only drops 5 and 30.
Regarding your first question, I refer to your earlier post:
"No All lows must be below 6 for Germany as wel as the other locations."
So I chose 6 as an arbitrary cutoff value between a "low" and "high" segment.
Second question: this is a Subsetting IF, and should read "keep the observation if the del flag is not set to true"
Third question: carefully study the IF that checks the "high" segment; in it you find the condition that checks for an unnaturally high value (larger than the sum of preceding + succeeding). You can expand the IF that checks the "low" segment to check for an unnaturally low value (e. g. <= 0) in a similar fashion.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.