data have; input gvkey fyear rds; datalines; 0001 1980 0.05 0001 1981 0.02 0001 1982 0.03 0001 1983 0.04 0001 1984 0.04 0001 1985 0.05 0001 1986 0.06 ; proc sql; create table want as select a.gvkey, a.fyear, a.rds, case when count(b.rds) >= 3 then mean(b.rds) else . end as rds5y from have as a left join have as b on a.gvkey=b.gvkey and b.fyear between a.fyear-5 and a.fyear-1 group by a.gvkey, a.fyear, a.rds; select * from want; quit;
So what should be the result for the above example?
The following line should be deleted without affecting other program lines and new program line is needed to put the condition that at least continuous three rds5y(averge) should be avaiable per company(gvkey).
when count(b.rds) >= 3 then mean(b.rds)
Thank you
The above code returns table want as
gvkey fyear rds rds5y 1 1980 0.05 . 1 1981 0.02 . 1 1982 0.03 . 1 1983 0.04 0.033333 1 1984 0.04 0.035 1 1985 0.05 0.036 1 1986 0.06 0.036
what would you want instead?
The following observations should be dropped because the number of observations for rds5y is less than 3(2 obs).
gvkey fyear rds rds5y
1 1980 0.05 .
1 1981 0.02 .
1 1982 0.03 .
1 1983 0.04 .
1 1984 0.04 .
1 1985 0.05 0.036
1986 . 0.036
The following observations should remain.
gvkey fyear rds rds5y
1 1980 0.05 .
1 1981 0.02 .
1 1982 0.03 .
1 1983 0.04 .
1 1984 0.04 .
1 1985 0.05 0.036
1 1986 0.06 0.036
1 1987 0.03 0.044
Hi,
Could you please advise the program that just removes the following condition? Thank you
when count(b.rds) >= 3 then mean(b.rds)
data have;
input gvkey fyear rds;
datalines;
0001 1980 0.05
0001 1981 0.02
0001 1982 0.03
0001 1983 0.04
0001 1984 0.04
0001 1985 0.05
0001 1986 0.06
;
proc sql;
create table want as
select
a.gvkey,
a.fyear,
a.rds,
case
when count(b.rds) >= 3 then mean(b.rds)
else .
end as rds5y
from
have as a left join
have as b
on a.gvkey=b.gvkey and
b.fyear between a.fyear-5 and a.fyear-1
group by a.gvkey, a.fyear, a.rds;
select * from want;
quit;
Hi PGStats,
Could you please advise the program that just removes the following condition?
when count(b.rds) >= 3 then mean(b.rds)
I would like to have the following result.
rds5y
0001 1980 0.05 .
0001 1981 0.02 .
0001 1982 0.03 .
0001 1983 0.04 .
0001 1984 0.04 .
0001 1985 0.05 0.036
0001 1986 0.06 0.036
data have;
input gvkey fyear rds;
datalines;
0001 1980 0.05
0001 1981 0.02
0001 1982 0.03
0001 1983 0.04
0001 1984 0.04
0001 1985 0.05
0001 1986 0.06
;
proc sql;
create table want as
select
a.gvkey,
a.fyear,
a.rds,
case
when count(b.rds) >= 3 then mean(b.rds)
else .
end as rds5y
from
have as a left join
have as b
on a.gvkey=b.gvkey and
b.fyear between a.fyear-5 and a.fyear-1
group by a.gvkey, a.fyear, a.rds;
select * from want;
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.