Hi all,
i have a dataset as below
data have;
input col1 col2;
datalines;
1001 1.5
1001 1
1001 2.5
1001 2.5
1001 1.5
1001 1.5
1001 1.5
1002 1.5
1002 4.25
1002 2.5
1002 2.5
1002 2.5
1002 2.5
1002 1.5
1003 6.5
1003
1003 1.5
1003 1.5
1003 2.5
1003 2.5
1003 2.5
1003
;
run;
from the above I have to filter out the rows that have repeats of values at least 3 times
for the first category(1001), I need to flag the 5th row as that row got repeated with 1.5 with out any blank/any other value even though 1.5 is present on the first row(as it's not repetitive). similarly row no 10 should be flagged for 1002 as repeats are present at least 3 times there.
any help will be appriciated.
So we need another flag which keeps track of repeats found:
data want;
retain flag2;
count = 0;
do until (last.col2);
set have;
by col1 col2 notsorted;
if first.col1 then flag2 = 1;
count + 1;
end;
flag = (count ge 3) and flag2;
if flag then flag2 = 0;
do until (last.col2);
set have;
by col1 col2 notsorted;
output;
flag = 0;
end;
drop count flag2;
run;
data want;
count = 0;
do until (last.col2);
set have;
by col1 col2 notsorted;
count + 1;
end;
flag = (count ge 3);
do until (last.col2);
set have;
by col1 col2 notsorted;
output;
flag = 0;
end;
drop count;
run;
Untested, posted from my tablet.
Thank you both. @Kurt_Bremser it worked, but I realized this code is flagging every 1st occurrence if there are multiple triplicate recs present per category of COL1,
I'd really appreciate if you can help with an idea- if the triplicate repeats found in col2 first time per category of col1, then not to consider all other triplicate repeats present afterwords(and not to consider missing triplicates if coming on the way).
Sorry, I should have mentioned about this condition, apologies, should have been clearer earlier.
So we need another flag which keeps track of repeats found:
data want;
retain flag2;
count = 0;
do until (last.col2);
set have;
by col1 col2 notsorted;
if first.col1 then flag2 = 1;
count + 1;
end;
flag = (count ge 3) and flag2;
if flag then flag2 = 0;
do until (last.col2);
set have;
by col1 col2 notsorted;
output;
flag = 0;
end;
drop count flag2;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.