Dear All,
Could you please help me with the following problem? I have dataset that looks like this:
Company Date Var1
1 1 20
1 2 20
1 3 20
1 4 20
1 5 20
1 6 35
1 7 35
1 8 35
1 9 28
1 10 28
1 11 28
1 12 56
1 13 56
1 14 56
2 1 20
2 2 20
2 3 20
2 4 18
2 5 18
2 6 18
2 7 27
3 4 5
3 5 5
3 6 5
3 7 5
3 8 5
3 9 5
3 10 9
3 11 9
3 12 9
I need to obtain the following dataset:
Company Date Var1
1 1 0
1 2 0
1 3 20
1 4 20
1 5 20
1 6 35
1 7 35
1 8 35
1 9 28
1 10 28
1 11 28
1 12 56
1 13 56
1 14 56
2 1 20
2 2 20
2 3 20
2 4 18
2 5 18
2 6 18
2 7 27
3 4 0
3 5 0
3 6 0
3 7 5
3 8 5
3 9 5
3 10 9
3 11 9
3 12 9
In other words, for each company I need to replace with zero those values that are repeated more than three times. I want to leave these three repeated values and remaining repeated values replace with zero. I also think that values that are repeated more than three times are usually placed at the beginning of each company.
Any help would be hugely appreciated.
data have;
input Company Date Var1 ;
cards;
1 1 20
1 2 20
1 3 20
1 4 20
1 5 20
1 6 35
1 7 35
1 8 35
1 9 28
1 10 28
1 11 28
1 12 56
1 13 56
1 14 56
2 1 20
2 2 20
2 3 20
2 4 18
2 5 18
2 6 18
2 7 27
3 4 5
3 5 5
3 6 5
3 7 5
3 8 5
3 9 5
3 10 9
3 11 9
3 12 9
;
run;
data want;
n=0;
do until(last.Var1);
set have;
by Company Var1 notsorted;
n+1;
end;
m=0 ;
do until(last.Var1);
set have;
by Company Var1 notsorted;
m+1;
if n gt 3 and m le (n-3) then do;Var1=0;output;end;
else output;
end;
drop m n;
run;
Does it matter if it's the latest or earliest that are replaced?
proc sort data=have;
by company descending date var1;
run;
data want;
set have;
by company descending date;
if first.company then count=0;
if first.var1 then count=1;
else count+1;
if count>3 then var1=0;
run;
proc sort data=want;
by company date;
run;
data have;
input Company Date Var1 ;
cards;
1 1 20
1 2 20
1 3 20
1 4 20
1 5 20
1 6 35
1 7 35
1 8 35
1 9 28
1 10 28
1 11 28
1 12 56
1 13 56
1 14 56
2 1 20
2 2 20
2 3 20
2 4 18
2 5 18
2 6 18
2 7 27
3 4 5
3 5 5
3 6 5
3 7 5
3 8 5
3 9 5
3 10 9
3 11 9
3 12 9
;
run;
data want;
n=0;
do until(last.Var1);
set have;
by Company Var1 notsorted;
n+1;
end;
m=0 ;
do until(last.Var1);
set have;
by Company Var1 notsorted;
m+1;
if n gt 3 and m le (n-3) then do;Var1=0;output;end;
else output;
end;
drop m n;
run;
Could be as simple as:
data want;
do _n_=1 by 1 until (last.var1);
set have;
by company var1 notsorted;
if _n_>3 then
var1=0;
output;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.