Dear Community Members,
I have a dataset that looks like the following:
DATA return;
INPUT portid firmid myear return;
DATALINES;
1 100 200301 2.11
1 120 200301 3.12
1 130 200301 2.13
1 140 200301 3.67
1 100 200302 5.23
1 120 200302 6.81
1 130 200302 2.55
1 140 200302 4.21
1 150 200302 3.21
2 300 200301 1.11
2 320 200301 0.12
2 330 200301 -2.13
2 340 200301 -1.67
2 300 200302 4.23
2 320 200302 3.81
2 330 200302 1.55
2 340 200302 2.21
2 350 200302 4.23
;
run;
My objective is to calculate average returns for each portid and myear (i.e. average "return" in each portid-myear pair). Next, match each firmid in a given portid and myear with the corresponding average returns. Normally, I can use "proc means" by "portid myear", but there is a restriction
such that a company's return should be excluded from the corresponding average return. For example,
for portid=1, myear=200301, firmid=100, the corresponding average return=(3.12+2.13+3.67)/3=2.9733333
for portid=1, myear=200301, firmid=120, the corresponding average return=(2.11+2.13+3.67)/3=2.6366666
for portid=2, myear=200302, firmid=350, the corresponding average return=(4.23+3.81+1.55+2.21)/4=2.95
I would really appreciate your help with this problem.
Probably many ways to achieve what you want. E.g.:
proc transpose data=return out=need (drop=_name_);
var return;
id firmid;
by portid myear;
run;
data need;
merge return need;
by portid myear;
run;
data want (drop=i _:);
set need;
array returns(*) _100--_350;
do i=1 to dim(returns);
if input(substr(vname(returns(i)),2),8.) eq firmid then
call missing(returns(i));
end;
average=mean(of returns(*));
run;
Probably many ways to achieve what you want. E.g.:
proc transpose data=return out=need (drop=_name_);
var return;
id firmid;
by portid myear;
run;
data need;
merge return need;
by portid myear;
run;
data want (drop=i _:);
set need;
array returns(*) _100--_350;
do i=1 to dim(returns);
if input(substr(vname(returns(i)),2),8.) eq firmid then
call missing(returns(i));
end;
average=mean(of returns(*));
run;
There is a simple SQL solution :
proc sql;
create table want as
select a.*, mean(b.return) as meanEx
from return as a inner join return as b
on a.portId = b.portId and a.myear=b.myear and b.firmId ne a.firmId
group by a.portId, a.mYear, a.firmId, a.return;
select * from want;
quit;
PG
Hi PG,
I don't understand your code. How did you get the average meanEX? Thanks!
Hi Linlin,
OK, I retract the words simple solution. For each obs in RETURN, take all obs in RETURN with the same portId and mYear, except the one with the same formId. Take the average for each obs.
I guess the best way to understand is to look at the result from
proc sql;
select a.*, b.return as otherReturn
from return as a inner join return as b
on a.portId = b.portId and a.myear=b.myear and b.firmId ne a.firmId;
PG
Thank you PG!
Thank you so much for your help and clarifying question. These two solutions are really brilliant and practical.
Best,
SK
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.