Solved
Contributor
Posts: 57

# Average by Excluding One Observation at a Time

Dear Community Members,

I have a dataset that looks like the following:

DATA return;

INPUT portid firmid myear return;

DATALINES;

1 100 200301  2.11

1 120 200301  3.12

1 130 200301  2.13

1 140 200301  3.67

1 100 200302  5.23

1 120 200302  6.81

1 130 200302  2.55

1 140 200302  4.21

1 150 200302  3.21

2 300 200301  1.11

2 320 200301  0.12

2 330 200301 -2.13

2 340 200301 -1.67

2 300 200302  4.23

2 320 200302  3.81

2 330 200302  1.55

2 340 200302  2.21

2 350 200302  4.23

;

run;

My objective is to calculate average returns for each portid and myear (i.e. average "return" in each portid-myear pair). Next, match each firmid in a given portid and myear with  the corresponding average returns. Normally, I can use "proc means" by "portid myear", but there is a restriction

such that a company's return should be excluded from the corresponding average return. For example,

for portid=1, myear=200301, firmid=100, the corresponding average return=(3.12+2.13+3.67)/3=2.9733333

for portid=1, myear=200301, firmid=120, the corresponding average return=(2.11+2.13+3.67)/3=2.6366666

for portid=2, myear=200302, firmid=350, the corresponding average return=(4.23+3.81+1.55+2.21)/4=2.95

I would really appreciate your help with this problem.

Accepted Solutions
Solution
‎09-30-2012 05:42 PM
PROC Star
Posts: 8,165

## Re: Average by Excluding One Observation at a Time

Probably many ways to achieve what you want.  E.g.:

proc transpose data=return out=need (drop=_name_);

var return;

id firmid;

by portid myear;

run;

data need;

merge return need;

by portid myear;

run;

data want (drop=i _;

set need;

array returns(*) _100--_350;

do i=1 to dim(returns);

if input(substr(vname(returns(i)),2),8.) eq firmid then

call missing(returns(i));

end;

average=mean(of returns(*));

run;

All Replies
Solution
‎09-30-2012 05:42 PM
PROC Star
Posts: 8,165

## Re: Average by Excluding One Observation at a Time

Probably many ways to achieve what you want.  E.g.:

proc transpose data=return out=need (drop=_name_);

var return;

id firmid;

by portid myear;

run;

data need;

merge return need;

by portid myear;

run;

data want (drop=i _;

set need;

array returns(*) _100--_350;

do i=1 to dim(returns);

if input(substr(vname(returns(i)),2),8.) eq firmid then

call missing(returns(i));

end;

average=mean(of returns(*));

run;

Posts: 5,540

## Re: Average by Excluding One Observation at a Time

There is a simple SQL solution :

proc sql;

create table want as

select a.*, mean(b.return) as meanEx

from return as a inner join return as b

on a.portId = b.portId and a.myear=b.myear and b.firmId ne a.firmId

group by a.portId, a.mYear, a.firmId, a.return;

select * from want;

quit;

PG

PG
Super Contributor
Posts: 1,636

## Re: Average by Excluding One Observation at a Time

Hi PG,

I don't understand your code. How did you get the average meanEX?  Thanks!

Posts: 5,540

## Re: Average by Excluding One Observation at a Time

Hi Linlin,

OK, I retract the words simple solution. For each obs in RETURN, take all obs in RETURN with the same portId and mYear, except the one with the same formId. Take the average for each obs.

I guess the best way to understand is to look at the result from

proc sql;

select a.*, b.return as otherReturn

from return as a inner join return as b

on a.portId = b.portId and a.myear=b.myear and b.firmId ne a.firmId;

PG

PG
Super Contributor
Posts: 1,636

Thank you PG!

Contributor
Posts: 57

## Re: Average by Excluding One Observation at a Time

Thank you so much for your help and clarifying question. These two solutions are really brilliant and practical.

Best,

SK

🔒 This topic is solved and locked.