data test;
format dt date9.;
format dt2 date9.;
input dt dt2;
datalines;
20000 20001
20000 20002
20000 20003
21000 21001
21000 21002
21000 21003
21000 21004
21000 21005
;
run;
proc sort data = test;
by dt dt2;
run;
data check;
set test;
by dt dt2;
if last.dt = first.dt then
if abs(last.dt2 - first.dt) < 5 then delete;
run;
What I would like to happen is for the section of
20000 20001
20000 20002
20000 20003
dates to all be deleted and the next section
21000 21001
21000 21002
21000 21003
21000 21004
21000 21005
to all be retained. How can I get this to work?
You can assign a counter variable that increments at the start of each dt group, and limit the output to the group you want (here the second dt group, counter=2).
data check;
set test;
by dt dt2;
format dt dt2 mmddyy10.;
if first.dt then counter+1;
if counter=2;
run;
Your suggestion doesn't yield the desired outcome. This is also a toy problem, so with the real data we won't have the luxury of knowing in advance which counter value is wanted.
This condition:
if last.dt = first.dt
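is true whenever the two flags are equal: both 1 when the current BY (dt) group contains a single observation, or both 0 for an observation in the middle of a group with three or more observations. It is never true on the first or last observation of a multi-observation group: on the last observation only last.dt is true, and on the next observation (the first of the following group) only first.dt is true.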
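To see this on the sample data, a small sketch like the following writes the flags to the log. The names is_first and is_last are made up for illustration. first.dt and last.dt are temporary 0/1 flags, not date values, so they are copied into ordinary variables before printing.

```sas
/* Sketch: inspect the BY-group flags for each observation of TEST. */
/* Assumes TEST is already sorted by dt dt2 (see the PROC SORT above). */
data _null_;
  set test;
  by dt dt2;
  /* is_first / is_last are hypothetical helper names */
  is_first = first.dt;  /* 1 on the first row of a dt group, else 0 */
  is_last  = last.dt;   /* 1 on the last row of a dt group, else 0 */
  put dt= dt2= is_first= is_last=;
run;
```

The log should show is_first=1 only on the first row of each dt group and is_last=1 only on the last row. Because the flags only take the values 0 and 1, abs(last.dt2 - first.dt) in the original step can never exceed 1, so it is always below 5.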
How can I do what I am looking to do though?
Here is a solution I have that worked now:
data check;
set test;
by dt dt2;
format dt dt2 date9.;
diff = abs(dt - dt2);
if diff < 5 then delete;
if last.dt then output;
run;
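One thing to check: on the sample data this step writes a single row (21000 21005), because the rows with diff < 5 are deleted before the last.dt test is reached. If the goal is to keep every row of a qualifying group, as described in the question, a double DO loop (often called a DOW loop) is one possible approach. This is only a sketch: it assumes a group qualifies when its largest abs(dt2 - dt) is at least 5, and the names check_groups and max_gap are hypothetical.

```sas
/* Sketch: keep or drop whole dt groups based on the largest gap in the group. */
/* Assumes TEST is sorted by dt; max_gap and check_groups are made-up names. */
data check_groups;
  /* Pass 1: read one dt group and record its largest |dt2 - dt|. */
  /* max_gap is reset to missing at the start of each DATA step iteration. */
  do until (last.dt);
    set test;
    by dt;
    max_gap = max(max_gap, abs(dt2 - dt));
  end;
  /* Pass 2: re-read the same group and output every row if it qualifies. */
  do until (last.dt);
    set test;
    by dt;
    if max_gap >= 5 then output;
  end;
  format dt dt2 date9.;
  drop max_gap;
run;
```

Under that assumption, the 20000 group (largest gap 3) is dropped entirely and all five rows of the 21000 group (largest gap 5) are kept. If the real rule for a qualifying group is different, only the max_gap calculation and the test in the second loop would need to change.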