Hello SAS users! I am working on mutual fund holdings data and I want to calculate overlap of stock holdings across the funds in each fund families.
Suppose Fidelity fund family runs three funds A, B, and C.
A holds stock 1,2,3,4,5,6,7
B holds 1,2,3,7,8,10
C holds 1,2,3,7,10,11,12
So,
fund A and B holds 4 overlapped stocks (stock 1,2,3, and 7)
fund A and C also holds 4 overlapped stocks (stock 1,2,3, and 7)
fund B and A holds some overlapped stocks
fund B and C holds some overlapped stocks
fund C and A holds some overlapped stocks
fund C and B holds some overlapped stocks
and so on...
In this situation,
(1) I want to generate indicator variables for overlapped stocks as shown below. (fidelity_1and2, fidelity_1and3, fidelity_2and1, fidelity_2and3,
fidelity_3and1, fidelity_3and2... same for the dreyfus fund family)
(2) After generating pairwise indicator variables I want to create the market capitalization of overlapped stock relative to toal market cap of holdings within the funds.
for example,
fund 1 and 2 (fund_id=1, 2) in Fidelity fund family(mgmt_cd=fidelity) has four overlapped stock, so I sum market capitalization of these stocks (mcap_holdings) for fund_id=1 & mgmt_cd="fidelity" and generate fidelity_mcapoverlap_1and2. same happens for other funds.
at last, I calculate average of fidelity_mcap_overlap variables so that average=mean(fidelity_mcapoverlap_1and2/mcap_total, fidelity_mcapoverlap_1and3/mcap_total) .
same procedure will be adopted for each fund families (e.g. for fidelity, dreyfus respectively).
How do I calculate the variable "average"? I only need this variable at the end of the day.( I only want to keep fund_id, mgmt_cd, average).
please help me!
fund_id | mgmt_cd | stock | mcap_holdings | fidelity_1and2 | fidelity_1and3 | fidelity_2and1 | fidelity_2and3 | fidelity_3and1 | fidelity_3and2 | mcap_total | fidelity_mcapoverlap_1and2 | fidelity_mcapoverlap_1and3 | fidelity_mcapoverlap_2and1 | fidelity_mcapoverlap_2and3 | fidelity_mcapoverlap_3and1 | fidelity_mcapoverlap_3and2 | average |
1 | fidelity | 1 | 2.5 | 1 | 1 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 2 | 3.5 | 1 | 1 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 3 | 1.6 | 1 | 1 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 4 | 3.1 | 0 | 0 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 5 | 2.1 | 0 | 0 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 6 | 2 | 0 | 0 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 7 | 3.75 | 1 | 1 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
2 | fidelity | 1 | 1.9 | . | . | 1 | 1 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||
2 | fidelity | 2 | 1.8 | . | . | 1 | 1 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||
2 | fidelity | 3 | 1.7 | . | . | 1 | 1 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||
2 | fidelity | 7 | 1.5 | . | . | 1 | 1 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||
2 | fidelity | 8 | 1.6 | . | . | 0 | 0 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||
2 | fidelity | 10 | 1.5 | 0 | 1 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||||
3 | fidelity | 1 | 6.1 | . | . | . | . | 1 | 1 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 2 | 6.2 | . | . | . | . | 1 | 1 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 3 | 6.3 | . | . | . | . | 1 | 1 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 7 | 5.1 | . | . | . | . | 1 | 1 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 10 | 2 | . | . | . | . | 0 | 1 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 11 | 3 | . | . | . | . | 0 | 0 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 12 | 1 | . | . | . | . | 0 | 0 | 46 | 23.7 | 25.7 | 0.536957 | ||||
1 | dreyfus | 30 | 3.8 | ||||||||||||||
1 | dreyfus | 31 | 1.56 | ||||||||||||||
1 | dreyfus | 32 | 64.4 | ||||||||||||||
1 | dreyfus | 33 | 2 | ||||||||||||||
1 | dreyfus | 34 | 1.9 | ||||||||||||||
1 | dreyfus | 35 | 1.8 | ||||||||||||||
2 | dreyfus | 33 | 5.6 | ||||||||||||||
2 | dreyfus | 34 | 5.7 | ||||||||||||||
2 | dreyfus | 35 | 5.1 | ||||||||||||||
2 | dreyfus | 36 | 5.6 | ||||||||||||||
2 | dreyfus | 37 | 5.2 | ||||||||||||||
3 | dreyfus | 35 | 3.9 | ||||||||||||||
3 | dreyfus | 36 | 9.8 | ||||||||||||||
3 | dreyfus | 37 | 8.7 | ||||||||||||||
3 | dreyfus | 38 | 8.4 | ||||||||||||||
4 | dreyfus | 31 | 3.2 | ||||||||||||||
4 | dreyfus | 32 | 1.2 | ||||||||||||||
4 | dreyfus | 38 | 22 | ||||||||||||||
4 | dreyfus | 39 | 13 |
data have;
input fund_id mgmt_cd $ stock mcap_holdings;
cards;
1 fidelity 1 2.5
1 fidelity 2 3.5
1 fidelity 3 1.6
1 fidelity 4 3.1
1 fidelity 5 2.1
1 fidelity 6 2
1 fidelity 7 3.75
2 fidelity 1 1.9
2 fidelity 2 1.8
2 fidelity 3 1.7
2 fidelity 7 1.5
2 fidelity 8 1.6
2 fidelity 10 1.5
3 fidelity 1 6.1
3 fidelity 2 6.2
3 fidelity 3 6.3
3 fidelity 7 5.1
3 fidelity 10 2
3 fidelity 11 3
3 fidelity 12 1
1 dreyfus 30 3.8
1 dreyfus 31 1.56
1 dreyfus 32 64.4
1 dreyfus 33 2
1 dreyfus 34 1.9
1 dreyfus 35 1.8
2 dreyfus 33 5.6
2 dreyfus 34 5.7
2 dreyfus 35 5.1
2 dreyfus 36 5.6
2 dreyfus 37 5.2
3 dreyfus 35 3.9
3 dreyfus 36 9.8
3 dreyfus 37 8.7
3 dreyfus 38 8.4
4 dreyfus 31 3.2
4 dreyfus 32 1.2
4 dreyfus 38 22
4 dreyfus 39 13
;
run;
1.Having column names that contain values (such as FIDELITYxx) is probably a preliminary to disaster.
2.How is 46 derived?
3.Please simplify your example a bit so calculations are more obvious to follow at a glance
Here is an example that derives the flags and then the sums, to get you started.
data HAVE (index=(A=(MGMT_CD STOCK FUND_ID)));
input FUND_ID MGMT_CD $ STOCK MCAP_HOLDINGS;
cards;
1 fidelity 1 2.5
1 fidelity 2 3.5
1 fidelity 3 1.6
1 fidelity 4 3.1
1 fidelity 5 2.1
1 fidelity 6 2 0
1 fidelity 7 3.75
2 fidelity 1 1.9
2 fidelity 2 1.8
2 fidelity 3 1.7
2 fidelity 7 1.5
2 fidelity 8 1.6
2 fidelity 10 1.5
3 fidelity 1 6.1
3 fidelity 2 6.2
3 fidelity 3 6.3
3 fidelity 7 5.1
3 fidelity 10 2
3 fidelity 11 3
3 fidelity 12 1
run;
data CROSSINGS(index=(A=(MGMT_CD FUND_ID STOCK)));
merge HAVE(where=(FUND_ID=1) in=F1)
HAVE(where=(FUND_ID=2) in=F2)
HAVE(where=(FUND_ID=3) in=F3)
HAVE;
by MGMT_CD STOCK;
FUND_12 = F1 * F2 * (FUND_ID=1);
FUND_13 = F1 * F3 * (FUND_ID=1);
FUND_21 = F2 * F1 * (FUND_ID=2);
FUND_23 = F2 * F3 * (FUND_ID=2);
FUND_31 = F3 * F1 * (FUND_ID=3);
FUND_32 = F3 * F2 * (FUND_ID=3);
SUM_12=FUND_12 * MCAP_HOLDINGS;
SUM_13=FUND_13 * MCAP_HOLDINGS;
SUM_21=FUND_21 * MCAP_HOLDINGS;
SUM_23=FUND_23 * MCAP_HOLDINGS;
SUM_31=FUND_31 * MCAP_HOLDINGS;
SUM_32=FUND_32 * MCAP_HOLDINGS;
run;
proc summary data=CROSSINGS nway;
class MGMT_CD ;
var SUM: MCAP_HOLDINGS;
output out=SUMS sum=;
run;
MGMT_CD | _TYPE_ | _FREQ_ | SUM_12 | SUM_13 | SUM_21 | SUM_23 | SUM_31 | SUM_32 | MCAP_HOLDINGS |
---|---|---|---|---|---|---|---|---|---|
fidelity | 1 | 20 | 11.35 | 11.35 | 6.9 | 8.4 | 23.7 | 25.7 | 58.25 |
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.