Dear All,
Could you please help me with the following problem? I have dataset that looks like this:
Company Date Var1
1 1 20
1 2 20
1 3 20
1 4 20
1 5 20
1 6 35
1 7 35
1 8 35
1 9 28
1 10 28
1 11 28
1 12 56
1 13 56
1 14 56
2 1 20
2 2 20
2 3 20
2 4 18
2 5 18
2 6 18
2 7 27
3 4 5
3 5 5
3 6 5
3 7 5
3 8 5
3 9 5
3 10 9
3 11 9
3 12 9
I need to obtain the following dataset:
Company Date Var1
1 1 0
1 2 0
1 3 20
1 4 20
1 5 20
1 6 35
1 7 35
1 8 35
1 9 28
1 10 28
1 11 28
1 12 56
1 13 56
1 14 56
2 1 20
2 2 20
2 3 20
2 4 18
2 5 18
2 6 18
2 7 27
3 4 0
3 5 0
3 6 0
3 7 5
3 8 5
3 9 5
3 10 9
3 11 9
3 12 9
In other words, for each company I need to replace with zero those values that are repeated more than three times. I want to leave these three repeated values and remaining repeated values replace with zero. I also think that values that are repeated more than three times are usually placed at the beginning of each company.
Any help would be hugely appreciated.
data have;
input Company Date Var1 ;
cards;
1 1 20
1 2 20
1 3 20
1 4 20
1 5 20
1 6 35
1 7 35
1 8 35
1 9 28
1 10 28
1 11 28
1 12 56
1 13 56
1 14 56
2 1 20
2 2 20
2 3 20
2 4 18
2 5 18
2 6 18
2 7 27
3 4 5
3 5 5
3 6 5
3 7 5
3 8 5
3 9 5
3 10 9
3 11 9
3 12 9
;
run;
data want;
n=0;
do until(last.Var1);
set have;
by Company Var1 notsorted;
n+1;
end;
m=0 ;
do until(last.Var1);
set have;
by Company Var1 notsorted;
m+1;
if n gt 3 and m le (n-3) then do;Var1=0;output;end;
else output;
end;
drop m n;
run;
Does it matter if it's the latest or earliest that are replaced?
proc sort data=have;
by company descending date var1;
run;
data want;
set have;
by company descending date;
if first.company then count=0;
if first.var1 then count=1;
else count+1;
if count>3 then var1=0;
run;
proc sort data=want;
by company date;
run;
data have;
input Company Date Var1 ;
cards;
1 1 20
1 2 20
1 3 20
1 4 20
1 5 20
1 6 35
1 7 35
1 8 35
1 9 28
1 10 28
1 11 28
1 12 56
1 13 56
1 14 56
2 1 20
2 2 20
2 3 20
2 4 18
2 5 18
2 6 18
2 7 27
3 4 5
3 5 5
3 6 5
3 7 5
3 8 5
3 9 5
3 10 9
3 11 9
3 12 9
;
run;
data want;
n=0;
do until(last.Var1);
set have;
by Company Var1 notsorted;
n+1;
end;
m=0 ;
do until(last.Var1);
set have;
by Company Var1 notsorted;
m+1;
if n gt 3 and m le (n-3) then do;Var1=0;output;end;
else output;
end;
drop m n;
run;
Could be as simple as:
data want;
do _n_=1 by 1 until (last.var1);
set have;
by company var1 notsorted;
if _n_>3 then
var1=0;
output;
end;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.