data want;
do until(last.year);
set firm;
by sic2 year;
if not first.year then do;
count+1;
sum+return;end;end;
avg=sum/count;
proc print;run;
i was just tryin to test my dow skills here.Newly created variables should always be set to missing before the 1st iteration of each by group in DOWs ,right?
but that s not happening for me
You used sum statement the rules are different. RTM
The following approach is largely copied from 's solution, only to use Hash hoping to avoid Array dimensions preset. It is supposedly more robust and supposedly overcome your memory limitation.
data firm;
input firm:$4. SIC:$4. Year Return;
length sic2 $3;
sic2 = sic;
cards;
1001 3100 1990 0.01
1001 3100 1991 0.2
1001 3100 1992 0.3
1001 3100 1993 0.7
1001 3100 1994 -0.5
1001 3100 1995 0.2
1001 3100 1996 0.4
1002 3101 1990 0.7
1002 3101 1991 -0.5
1002 3101 1992 0.2
1002 3101 1993 0.4
1002 3101 1994 0.4
1002 3101 1995 0.2
1002 3101 1996 0.2
1003 3101 1990 0.2
1003 3101 1991 0.4
1003 3101 1992 0.7
1003 3101 1993 -0.5
1003 3101 1994 0.2
1003 3101 1995 0.2
1003 3101 1996 0.2
1006 3300 1990 -0.5
1006 3300 1991 0.9
1006 3300 1992 0.2
1006 3300 1993 0.09
1006 3300 1994 0.09
1006 3300 1995 0.22
1006 3300 1996 0.1
1004 3303 1990 0.4
1004 3303 1991 0.2
1004 3303 1992 0.2
1004 3303 1993 -0.5
1004 3303 1994 0.9
1004 3303 1995 0.22
1004 3303 1996 0.1
;
run;
proc sort;
by sic2 year firm;
run;
data mean;
if _n_=1 then do;
if 0 then set firm (rename=(return=_r firm=_f));
declare hash h(multidata:'y');
h.definekey('sic2','year');
h.definedata('_r', '_f');
h.definedone();
end;
do until (last.year);
set firm;
by sic2 year;
_rc=h.add(key:sic2, key:year, data:return, data:firm);
end;
do until (last.year);
set firm;
by sic2 year;
do _rc=h.find() by 0 while (_rc=0);
if firm ne _f then do; _den+1;_sum+_r;end;
_rc=h.find_next();
end;
mean=ifn(_den<=0,.,_sum/_den);
output;
call missing(of _:);
end;
_rc=h.clear();
drop _:;
run;
Haikuo
data have;
input firm:$4. SIC:$4. Year Return firm_size;
length sic2 $3;
sic2 = sic;
cards;
1001 3100 1990 0.01 100
1001 3100 1991 0.2 200
1001 3100 1992 0.3 150
1001 3100 1993 0.7 120
1001 3100 1994 -0.5 130
1001 3100 1995 0.2 140
1001 3100 1996 0.4 200
1002 3101 1990 0.7 200
1002 3101 1991 -0.5 300
1002 3101 1992 0.2 300
1002 3101 1993 0.4 400
1002 3101 1994 0.4 400
1002 3101 1995 0.2 500
1002 3101 1996 0.2 600
1003 3101 1990 0.2 100
1003 3101 1991 0.4 200
1003 3101 1992 0.7 150
1003 3101 1993 -0.5 120
1003 3101 1994 0.2 130
1003 3101 1995 0.2 140
1003 3101 1996 0.2 200
1006 3300 1990 -0.5 200
1006 3300 1991 0.9 300
1006 3300 1992 0.2 300
1006 3300 1993 0.09 400
1006 3300 1994 0.09 400
1006 3300 1995 0.22 500
1006 3300 1996 0.1 600
1004 3303 1990 0.4 100
1004 3303 1991 0.2 200
1004 3303 1992 0.2 150
1004 3303 1993 -0.5 120
1004 3303 1994 0.9 130
1004 3303 1995 0.22 140
1004 3303 1996 0.1 200
;
run;
proc sql;
create table temp as
select *,firm_size/(select sum(firm_size) from have where sic2=a.sic2 and year=a.year ) as weight
from have as a;
create table want as
select *,(((select sum(weight*Return) from have where sic2=a.sic2 and year=a.year )-a.Return*a.weight)/(select sum(weight) from have where sic2=a.sic2 and year=a.year )) as mean_ret
from temp as a;
quit;
Xia Keshan
>Here is sub-query version code, but not as efficient as Reeza's :
data have; input firm_id SIC Year Return; id=substr(put(sic,4.),1,3); cards; 1001 3100 1990 0.01 1001 3100 1991 0.2 1001 3100 1992 0.3 1001 3100 1993 0.7 1001 3100 1994 -0.5 1001 3100 1995 0.2 1001 3100 1996 0.4 1002 3101 1990 0.7 1002 3101 1991 -0.5 1002 3101 1992 0.2 1002 3101 1993 0.4 1002 3101 1994 0.4 1002 3101 1995 0.2 1002 3101 1996 0.2 1003 3101 1990 0.2 1003 3101 1991 0.4 1003 3101 1992 0.7 1003 3101 1993 -0.5 1003 3101 1994 0.2 1003 3101 1995 0.2 1003 3101 1996 0.2 1006 3300 1990 -0.5 1006 3300 1991 0.9 1006 3300 1992 0.2 1006 3300 1993 0.09 1006 3300 1994 0.09 1006 3300 1995 0.22 1006 3300 1996 0.1 1004 3303 1990 0.4 1004 3303 1991 0.2 1004 3303 1992 0.2 1004 3303 1993 -0.5 1004 3303 1994 0.9 1004 3303 1995 0.22 1004 3303 1996 0.1 ; run; proc sql; create table want as select *,(((select sum(Return) from have where Year=a.Year and id=a.id )-a.Return)/((select count(*) from have where Year=a.Year and id=a.id )-1)) as mean_ret from have as a; quit;
Xia Keshan
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.