1. From now on, please post you data as a data step (and run the pasted code to ensure it is valid) like this:
data HAVE;
input GROUP $ SEQUENCE $ TRANSACTION DT $ ;
cards;
A 100 1200 jan10
A 101 7400 jan10
A 102 1000 jan10
A 103 7470 jan10
B 201 7400 jan10
B 202 4044 jan10
B 203 4600 jan10
B 204 7470 jan10
B 205 7400 jan10
B 206 7470 jan10
C 301 7400 jan10
C 302 1000 jan10
C 303 7400 jan10
C 304 7470 jan10
D 500 4060 jan10
D 401 7400 jan10
D 403 4320 jan10
D 404 7400 jan10
D 405 7415 jan10
D 406 7420 jan10
D 407 7400 jan10
D 408 7470 jan10
E 500 7400 jan10
E 501 1000 jan10
E 502 4320 jan10
E 503 7400 jan11
E 504 7470 jan11
run;
2. Your Boolean clause in the IF test is unclear because of the OR operator. Do you mean this?
data WANT;
set HAVE;
by GROUP TRANSACTION notsorted;
where TRANSACTION in (7400,7470,4320);
if (last.TRANSACTION and TRANSACTION=7400 and not last.GROUP)
or (first.TRANSACTION and TRANSACTION in(7470,4320) and not first.GROUP);
run;
3. The test validates the values
D 407 7400 jan10
which you state you do not want.
Why should these values be rejected? In other words, what is the condition to reject these values?
Your expected output does not match your logic. Per your logic, you can only have a maximum of two observations per group.
Code according to your logic:
data have;
input group :$1. sequence transaction;
datalines;
A 100 1200
A 101 7400
A 102 1000
A 103 7470
B 201 7400
B 202 4044
B 203 4600
B 204 7470
B 205 7400
B 206 7470
C 301 7400
C 302 1000
C 303 7400
C 304 7470
D 500 4060
D 401 7400
D 403 4320
D 404 7400
D 405 7415
D 406 7420
D 407 7400
D 408 7470
E 500 7400
E 501 1000
E 502 4320
E 503 7400
E 504 7470
;
data want;
firstflag = 0;
lastcount = 0;
counter = 0;
do until (last.group);
set have;
by group;
counter + 1;
if not firstflag and transaction = '7400'
then do;
output;
firstflag = 1;
end;
if transaction in (7470,4320) then lastcount = counter;
end;
counter = 0;
do until (last.group);
set have;
by group;
counter + 1;
if counter = lastcount then output;
end;
drop firstflag lastcount counter;
run;
proc print data=want noobs;
run;
Result:
group sequence transaction A 101 7400 A 103 7470 B 201 7400 B 206 7470 C 301 7400 C 304 7470 D 401 7400 D 408 7470 E 500 7400 E 504 7470
So the transaction date (which was not recognized as its own variable in your initial post, where the data step read only three columns) does play a role.
This only means we have to use two by variables in the crucial step:
data have;
input group :$1. sequence transaction transaction_date yymmdd10.;
format transaction_date yymmddd10.;
datalines;
A 100 1200 2019-01-10
A 101 7400 2019-01-10
A 102 1000 2019-01-10
A 103 7470 2019-01-10
B 201 7400 2019-01-10
B 202 4044 2019-01-10
B 203 4600 2019-01-10
B 204 7470 2019-01-10
B 205 7400 2019-01-10
B 206 7470 2019-01-10
C 301 7400 2019-01-10
C 302 1000 2019-01-10
C 303 7400 2019-01-10
C 304 7470 2019-01-10
D 500 4060 2019-01-10
D 401 7400 2019-01-10
D 403 4320 2019-01-10
D 404 7400 2019-01-10
D 405 7415 2019-01-10
D 406 7420 2019-01-10
D 407 7400 2019-01-10
D 408 7470 2019-01-10
E 500 7400 2019-01-10
E 501 1000 2019-01-10
E 502 4320 2019-01-10
E 503 7400 2019-01-11
E 504 7470 2019-01-11
;
data want;
firstflag = 0;
lastcount = 0;
counter = 0;
do until (last.transaction_date);
set have;
by group transaction_date;
counter + 1;
if not firstflag and transaction = '7400'
then do;
output;
firstflag = 1;
end;
if transaction in (7470,4320) then lastcount = counter;
end;
counter = 0;
do until (last.transaction_date);
set have;
by group transaction_date;
counter + 1;
if counter = lastcount then output;
end;
drop firstflag lastcount counter;
run;
proc print data=want noobs;
run;
Result:
transaction_ group sequence transaction date A 101 7400 2019-01-10 A 103 7470 2019-01-10 B 201 7400 2019-01-10 B 206 7470 2019-01-10 C 301 7400 2019-01-10 C 304 7470 2019-01-10 D 401 7400 2019-01-10 D 408 7470 2019-01-10 E 500 7400 2019-01-10 E 502 4320 2019-01-10 E 503 7400 2019-01-11 E 504 7470 2019-01-11
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.