Hello SAS users! I am working on mutual fund holdings data and I want to calculate overlap of stock holdings across the funds in each fund families.
Suppose Fidelity fund family runs three funds A, B, and C.
A holds stock 1,2,3,4,5,6,7
B holds 1,2,3,7,8,10
C holds 1,2,3,7,10,11,12
So,
fund A and B holds 4 overlapped stocks (stock 1,2,3, and 7)
fund A and C also holds 4 overlapped stocks (stock 1,2,3, and 7)
fund B and A holds some overlapped stocks
fund B and C holds some overlapped stocks
fund C and A holds some overlapped stocks
fund C and B holds some overlapped stocks
and so on...
In this situation,
(1) I want to generate indicator variables for overlapped stocks as shown below. (fidelity_1and2, fidelity_1and3, fidelity_2and1, fidelity_2and3,
fidelity_3and1, fidelity_3and2... same for the dreyfus fund family)
(2) After generating pairwise indicator variables I want to create the market capitalization of overlapped stock relative to toal market cap of holdings within the funds.
for example,
fund 1 and 2 (fund_id=1, 2) in Fidelity fund family(mgmt_cd=fidelity) has four overlapped stock, so I sum market capitalization of these stocks (mcap_holdings) for fund_id=1 & mgmt_cd="fidelity" and generate fidelity_mcapoverlap_1and2. same happens for other funds.
at last, I calculate average of fidelity_mcap_overlap variables so that average=mean(fidelity_mcapoverlap_1and2/mcap_total, fidelity_mcapoverlap_1and3/mcap_total) .
same procedure will be adopted for each fund families (e.g. for fidelity, dreyfus respectively).
How do I calculate the variable "average"? I only need this variable at the end of the day.( I only want to keep fund_id, mgmt_cd, average).
please help me!
fund_id | mgmt_cd | stock | mcap_holdings | fidelity_1and2 | fidelity_1and3 | fidelity_2and1 | fidelity_2and3 | fidelity_3and1 | fidelity_3and2 | mcap_total | fidelity_mcapoverlap_1and2 | fidelity_mcapoverlap_1and3 | fidelity_mcapoverlap_2and1 | fidelity_mcapoverlap_2and3 | fidelity_mcapoverlap_3and1 | fidelity_mcapoverlap_3and2 | average |
1 | fidelity | 1 | 2.5 | 1 | 1 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 2 | 3.5 | 1 | 1 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 3 | 1.6 | 1 | 1 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 4 | 3.1 | 0 | 0 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 5 | 2.1 | 0 | 0 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 6 | 2 | 0 | 0 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
1 | fidelity | 7 | 3.75 | 1 | 1 | . | . | . | . | 58.25 | 11.35 | 11.35 | 0.19485 | ||||
2 | fidelity | 1 | 1.9 | . | . | 1 | 1 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||
2 | fidelity | 2 | 1.8 | . | . | 1 | 1 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||
2 | fidelity | 3 | 1.7 | . | . | 1 | 1 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||
2 | fidelity | 7 | 1.5 | . | . | 1 | 1 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||
2 | fidelity | 8 | 1.6 | . | . | 0 | 0 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||
2 | fidelity | 10 | 1.5 | 0 | 1 | . | . | 10 | 6.9 | 8.4 | 0.765 | ||||||
3 | fidelity | 1 | 6.1 | . | . | . | . | 1 | 1 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 2 | 6.2 | . | . | . | . | 1 | 1 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 3 | 6.3 | . | . | . | . | 1 | 1 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 7 | 5.1 | . | . | . | . | 1 | 1 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 10 | 2 | . | . | . | . | 0 | 1 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 11 | 3 | . | . | . | . | 0 | 0 | 46 | 23.7 | 25.7 | 0.536957 | ||||
3 | fidelity | 12 | 1 | . | . | . | . | 0 | 0 | 46 | 23.7 | 25.7 | 0.536957 | ||||
1 | dreyfus | 30 | 3.8 | ||||||||||||||
1 | dreyfus | 31 | 1.56 | ||||||||||||||
1 | dreyfus | 32 | 64.4 | ||||||||||||||
1 | dreyfus | 33 | 2 | ||||||||||||||
1 | dreyfus | 34 | 1.9 | ||||||||||||||
1 | dreyfus | 35 | 1.8 | ||||||||||||||
2 | dreyfus | 33 | 5.6 | ||||||||||||||
2 | dreyfus | 34 | 5.7 | ||||||||||||||
2 | dreyfus | 35 | 5.1 | ||||||||||||||
2 | dreyfus | 36 | 5.6 | ||||||||||||||
2 | dreyfus | 37 | 5.2 | ||||||||||||||
3 | dreyfus | 35 | 3.9 | ||||||||||||||
3 | dreyfus | 36 | 9.8 | ||||||||||||||
3 | dreyfus | 37 | 8.7 | ||||||||||||||
3 | dreyfus | 38 | 8.4 | ||||||||||||||
4 | dreyfus | 31 | 3.2 | ||||||||||||||
4 | dreyfus | 32 | 1.2 | ||||||||||||||
4 | dreyfus | 38 | 22 | ||||||||||||||
4 | dreyfus | 39 | 13 |
data have;
input fund_id mgmt_cd $ stock mcap_holdings;
cards;
1 fidelity 1 2.5
1 fidelity 2 3.5
1 fidelity 3 1.6
1 fidelity 4 3.1
1 fidelity 5 2.1
1 fidelity 6 2
1 fidelity 7 3.75
2 fidelity 1 1.9
2 fidelity 2 1.8
2 fidelity 3 1.7
2 fidelity 7 1.5
2 fidelity 8 1.6
2 fidelity 10 1.5
3 fidelity 1 6.1
3 fidelity 2 6.2
3 fidelity 3 6.3
3 fidelity 7 5.1
3 fidelity 10 2
3 fidelity 11 3
3 fidelity 12 1
1 dreyfus 30 3.8
1 dreyfus 31 1.56
1 dreyfus 32 64.4
1 dreyfus 33 2
1 dreyfus 34 1.9
1 dreyfus 35 1.8
2 dreyfus 33 5.6
2 dreyfus 34 5.7
2 dreyfus 35 5.1
2 dreyfus 36 5.6
2 dreyfus 37 5.2
3 dreyfus 35 3.9
3 dreyfus 36 9.8
3 dreyfus 37 8.7
3 dreyfus 38 8.4
4 dreyfus 31 3.2
4 dreyfus 32 1.2
4 dreyfus 38 22
4 dreyfus 39 13
;
run;
1.Having column names that contain values (such as FIDELITYxx) is probably a preliminary to disaster.
2.How is 46 derived?
3.Please simplify your example a bit so calculations are more obvious to follow at a glance
Here is an example that derives the flags and then the sums, to get you started.
data HAVE (index=(A=(MGMT_CD STOCK FUND_ID)));
input FUND_ID MGMT_CD $ STOCK MCAP_HOLDINGS;
cards;
1 fidelity 1 2.5
1 fidelity 2 3.5
1 fidelity 3 1.6
1 fidelity 4 3.1
1 fidelity 5 2.1
1 fidelity 6 2 0
1 fidelity 7 3.75
2 fidelity 1 1.9
2 fidelity 2 1.8
2 fidelity 3 1.7
2 fidelity 7 1.5
2 fidelity 8 1.6
2 fidelity 10 1.5
3 fidelity 1 6.1
3 fidelity 2 6.2
3 fidelity 3 6.3
3 fidelity 7 5.1
3 fidelity 10 2
3 fidelity 11 3
3 fidelity 12 1
run;
data CROSSINGS(index=(A=(MGMT_CD FUND_ID STOCK)));
merge HAVE(where=(FUND_ID=1) in=F1)
HAVE(where=(FUND_ID=2) in=F2)
HAVE(where=(FUND_ID=3) in=F3)
HAVE;
by MGMT_CD STOCK;
FUND_12 = F1 * F2 * (FUND_ID=1);
FUND_13 = F1 * F3 * (FUND_ID=1);
FUND_21 = F2 * F1 * (FUND_ID=2);
FUND_23 = F2 * F3 * (FUND_ID=2);
FUND_31 = F3 * F1 * (FUND_ID=3);
FUND_32 = F3 * F2 * (FUND_ID=3);
SUM_12=FUND_12 * MCAP_HOLDINGS;
SUM_13=FUND_13 * MCAP_HOLDINGS;
SUM_21=FUND_21 * MCAP_HOLDINGS;
SUM_23=FUND_23 * MCAP_HOLDINGS;
SUM_31=FUND_31 * MCAP_HOLDINGS;
SUM_32=FUND_32 * MCAP_HOLDINGS;
run;
proc summary data=CROSSINGS nway;
class MGMT_CD ;
var SUM: MCAP_HOLDINGS;
output out=SUMS sum=;
run;
MGMT_CD | _TYPE_ | _FREQ_ | SUM_12 | SUM_13 | SUM_21 | SUM_23 | SUM_31 | SUM_32 | MCAP_HOLDINGS |
---|---|---|---|---|---|---|---|---|---|
fidelity | 1 | 20 | 11.35 | 11.35 | 6.9 | 8.4 | 23.7 | 25.7 | 58.25 |
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.