Obsidian | Level 7

## calculating pairwise overlap?

Hello SAS users! I am working on mutual fund holdings data and I want to calculate overlap of stock holdings across the funds in each fund families.

Suppose Fidelity fund family runs three funds A, B, and C.

A holds stock 1,2,3,4,5,6,7

B holds 1,2,3,7,8,10

C holds 1,2,3,7,10,11,12

So,

fund A and B holds 4 overlapped stocks (stock 1,2,3, and 7)

fund A and C also holds 4 overlapped stocks (stock 1,2,3, and 7)

fund B and A holds some overlapped stocks

fund B and C holds some overlapped stocks

fund C and A holds some overlapped stocks

fund C and B holds some overlapped stocks

and so on...

In this situation,

(1) I want to generate indicator variables for overlapped stocks as shown below. (fidelity_1and2, fidelity_1and3, fidelity_2and1, fidelity_2and3,

fidelity_3and1, fidelity_3and2... same for the dreyfus fund family)

(2) After generating pairwise indicator variables I want to create the market capitalization of overlapped stock relative to toal market cap of holdings within the funds.

for example,

fund 1 and 2 (fund_id=1, 2) in Fidelity fund family(mgmt_cd=fidelity) has four overlapped stock, so I sum market capitalization of these stocks (mcap_holdings) for fund_id=1 & mgmt_cd="fidelity" and generate fidelity_mcapoverlap_1and2. same happens for other funds.

at last, I calculate average of fidelity_mcap_overlap variables so that average=mean(fidelity_mcapoverlap_1and2/mcap_total, fidelity_mcapoverlap_1and3/mcap_total) .

same procedure will be adopted for each fund families (e.g. for fidelity, dreyfus respectively).

How do I calculate the variable "average"? I only need this variable at the end of the day.( I only want to keep fund_id, mgmt_cd, average).

 fund_id mgmt_cd stock mcap_holdings fidelity_1and2 fidelity_1and3 fidelity_2and1 fidelity_2and3 fidelity_3and1 fidelity_3and2 mcap_total fidelity_mcapoverlap_1and2 fidelity_mcapoverlap_1and3 fidelity_mcapoverlap_2and1 fidelity_mcapoverlap_2and3 fidelity_mcapoverlap_3and1 fidelity_mcapoverlap_3and2 average 1 fidelity 1 2.5 1 1 . . . . 58.25 11.35 11.35 0.19485 1 fidelity 2 3.5 1 1 . . . . 58.25 11.35 11.35 0.19485 1 fidelity 3 1.6 1 1 . . . . 58.25 11.35 11.35 0.19485 1 fidelity 4 3.1 0 0 . . . . 58.25 11.35 11.35 0.19485 1 fidelity 5 2.1 0 0 . . . . 58.25 11.35 11.35 0.19485 1 fidelity 6 2 0 0 . . . . 58.25 11.35 11.35 0.19485 1 fidelity 7 3.75 1 1 . . . . 58.25 11.35 11.35 0.19485 2 fidelity 1 1.9 . . 1 1 . . 10 6.9 8.4 0.765 2 fidelity 2 1.8 . . 1 1 . . 10 6.9 8.4 0.765 2 fidelity 3 1.7 . . 1 1 . . 10 6.9 8.4 0.765 2 fidelity 7 1.5 . . 1 1 . . 10 6.9 8.4 0.765 2 fidelity 8 1.6 . . 0 0 . . 10 6.9 8.4 0.765 2 fidelity 10 1.5 0 1 . . 10 6.9 8.4 0.765 3 fidelity 1 6.1 . . . . 1 1 46 23.7 25.7 0.536957 3 fidelity 2 6.2 . . . . 1 1 46 23.7 25.7 0.536957 3 fidelity 3 6.3 . . . . 1 1 46 23.7 25.7 0.536957 3 fidelity 7 5.1 . . . . 1 1 46 23.7 25.7 0.536957 3 fidelity 10 2 . . . . 0 1 46 23.7 25.7 0.536957 3 fidelity 11 3 . . . . 0 0 46 23.7 25.7 0.536957 3 fidelity 12 1 . . . . 0 0 46 23.7 25.7 0.536957 1 dreyfus 30 3.8 1 dreyfus 31 1.56 1 dreyfus 32 64.4 1 dreyfus 33 2 1 dreyfus 34 1.9 1 dreyfus 35 1.8 2 dreyfus 33 5.6 2 dreyfus 34 5.7 2 dreyfus 35 5.1 2 dreyfus 36 5.6 2 dreyfus 37 5.2 3 dreyfus 35 3.9 3 dreyfus 36 9.8 3 dreyfus 37 8.7 3 dreyfus 38 8.4 4 dreyfus 31 3.2 4 dreyfus 32 1.2 4 dreyfus 38 22 4 dreyfus 39 13

``````data have;
input fund_id mgmt_cd \$ stock mcap_holdings;
cards;
1	fidelity	1	2.5
1	fidelity	2	3.5
1	fidelity	3	1.6
1	fidelity	4	3.1
1	fidelity	5	2.1
1	fidelity	6	2
1	fidelity	7	3.75
2	fidelity	1	1.9
2	fidelity	2	1.8
2	fidelity	3	1.7
2	fidelity	7	1.5
2	fidelity	8	1.6
2	fidelity	10	1.5
3	fidelity	1	6.1
3	fidelity	2	6.2
3	fidelity	3	6.3
3	fidelity	7	5.1
3	fidelity	10	2
3	fidelity	11	3
3	fidelity	12	1
1	dreyfus	30	3.8
1	dreyfus	31	1.56
1	dreyfus	32	64.4
1	dreyfus	33	2
1	dreyfus	34	1.9
1	dreyfus	35	1.8
2	dreyfus	33	5.6
2	dreyfus	34	5.7
2	dreyfus	35	5.1
2	dreyfus	36	5.6
2	dreyfus	37	5.2
3	dreyfus	35	3.9
3	dreyfus	36	9.8
3	dreyfus	37	8.7
3	dreyfus	38	8.4
4	dreyfus	31	3.2
4	dreyfus	32	1.2
4	dreyfus	38	22
4	dreyfus	39	13
;
run;``````

2 REPLIES 2
Tourmaline | Level 20

## Re: calculating pairwise overlap?

1.Having column names that contain values (such as FIDELITYxx) is probably a preliminary to disaster.

2.How is 46 derived?

Tourmaline | Level 20

## Re: calculating pairwise overlap?

Here is an example that derives the flags and then the sums, to get you started.

``````data HAVE (index=(A=(MGMT_CD STOCK FUND_ID)));
input FUND_ID MGMT_CD \$ STOCK MCAP_HOLDINGS;
cards;
1 fidelity 1 2.5
1 fidelity 2 3.5
1 fidelity 3 1.6
1 fidelity 4 3.1
1 fidelity 5 2.1
1 fidelity 6 2 0
1 fidelity 7 3.75
2 fidelity 1 1.9
2 fidelity 2 1.8
2 fidelity 3 1.7
2 fidelity 7 1.5
2 fidelity 8 1.6
2 fidelity 10 1.5
3 fidelity 1 6.1
3 fidelity 2 6.2
3 fidelity 3 6.3
3 fidelity 7 5.1
3 fidelity 10 2
3 fidelity 11 3
3 fidelity 12 1
run;
data CROSSINGS(index=(A=(MGMT_CD  FUND_ID STOCK)));
merge HAVE(where=(FUND_ID=1) in=F1)
HAVE(where=(FUND_ID=2) in=F2)
HAVE(where=(FUND_ID=3) in=F3)
HAVE;
by MGMT_CD STOCK;

FUND_12 = F1 * F2 * (FUND_ID=1);
FUND_13 = F1 * F3 * (FUND_ID=1);
FUND_21 = F2 * F1 * (FUND_ID=2);
FUND_23 = F2 * F3 * (FUND_ID=2);
FUND_31 = F3 * F1 * (FUND_ID=3);
FUND_32 = F3 * F2 * (FUND_ID=3);

SUM_12=FUND_12 * MCAP_HOLDINGS;
SUM_13=FUND_13 * MCAP_HOLDINGS;
SUM_21=FUND_21 * MCAP_HOLDINGS;
SUM_23=FUND_23 * MCAP_HOLDINGS;
SUM_31=FUND_31 * MCAP_HOLDINGS;
SUM_32=FUND_32 * MCAP_HOLDINGS;
run;
proc summary data=CROSSINGS nway;
class MGMT_CD   ;
var SUM: MCAP_HOLDINGS;
output out=SUMS sum=;
run;

``````
MGMT_CD _TYPE_ _FREQ_ SUM_12 SUM_13 SUM_21 SUM_23 SUM_31 SUM_32 MCAP_HOLDINGS
fidelity 1 20 11.35 11.35 6.9 8.4 23.7 25.7 58.25

Discussion stats
• 2 replies
• 1013 views
• 0 likes
• 2 in conversation