DATA Step, Macro, Functions and more

calculating pairwise overlap?

Reply
Contributor
Posts: 28

calculating pairwise overlap?

 

 

 Hello SAS users! I am working on mutual fund holdings data and I want to calculate overlap of stock holdings across the funds in each fund families. 

 

 

Suppose Fidelity fund family runs three funds A, B, and C.

 

A holds stock 1,2,3,4,5,6,7

B holds 1,2,3,7,8,10

C holds 1,2,3,7,10,11,12

 

So,

fund A and B holds 4 overlapped stocks (stock 1,2,3, and 7)

fund A and C also holds 4 overlapped stocks (stock 1,2,3, and 7)

fund B and A holds some overlapped stocks  

fund B and C holds some overlapped stocks

fund C and A holds some overlapped stocks

fund C and B holds some overlapped stocks

and so on...

 

 

In this situation, 

 

(1) I want to generate indicator variables for overlapped stocks as shown below. (fidelity_1and2, fidelity_1and3, fidelity_2and1, fidelity_2and3, 

fidelity_3and1, fidelity_3and2... same for the dreyfus fund family)

 

 

(2) After generating pairwise indicator variables I want to create the market capitalization of overlapped stock relative to toal market cap of holdings within the funds. 

 

for example, 

fund 1 and 2 (fund_id=1, 2) in Fidelity fund family(mgmt_cd=fidelity) has four overlapped stock, so I sum market capitalization of these stocks (mcap_holdings) for fund_id=1 & mgmt_cd="fidelity" and generate fidelity_mcapoverlap_1and2. same happens for other funds.

at last, I calculate average of fidelity_mcap_overlap variables so that average=mean(fidelity_mcapoverlap_1and2/mcap_total, fidelity_mcapoverlap_1and3/mcap_total) .

 

same procedure will be adopted for each fund families (e.g. for fidelity, dreyfus respectively).

 

How do I calculate the variable "average"? I only need this variable at the end of the day.( I only want to keep fund_id, mgmt_cd, average).

 

please help me!

 

 

fund_idmgmt_cdstockmcap_holdingsfidelity_1and2fidelity_1and3fidelity_2and1fidelity_2and3fidelity_3and1fidelity_3and2mcap_totalfidelity_mcapoverlap_1and2fidelity_mcapoverlap_1and3fidelity_mcapoverlap_2and1fidelity_mcapoverlap_2and3fidelity_mcapoverlap_3and1fidelity_mcapoverlap_3and2average
1fidelity12.511....58.2511.3511.35    0.19485
1fidelity23.511....58.2511.3511.35    0.19485
1fidelity31.611....58.2511.3511.35    0.19485
1fidelity43.100....58.2511.3511.35    0.19485
1fidelity52.100....58.2511.3511.35    0.19485
1fidelity6200....58.2511.3511.35    0.19485
1fidelity73.7511....58.2511.3511.35    0.19485
2fidelity11.9..11..10  6.98.4  0.765
2fidelity21.8..11..10  6.98.4  0.765
2fidelity31.7..11..10  6.98.4  0.765
2fidelity71.5..11..10  6.98.4  0.765
2fidelity81.6..00..10  6.98.4  0.765
2fidelity101.5  01..10  6.98.4  0.765
3fidelity16.1....1146    23.725.70.536957
3fidelity26.2....1146    23.725.70.536957
3fidelity36.3....1146    23.725.70.536957
3fidelity75.1....1146    23.725.70.536957
3fidelity102....0146    23.725.70.536957
3fidelity113....0046    23.725.70.536957
3fidelity121....0046    23.725.70.536957
1dreyfus303.8              
1dreyfus311.56              
1dreyfus3264.4              
1dreyfus332              
1dreyfus341.9              
1dreyfus351.8              
2dreyfus335.6              
2dreyfus345.7              
2dreyfus355.1              
2dreyfus365.6              
2dreyfus375.2              
3dreyfus353.9              
3dreyfus369.8              
3dreyfus378.7              
3dreyfus388.4              
4dreyfus313.2              
4dreyfus321.2              
4dreyfus3822              
4dreyfus3913              

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

data have;
input fund_id mgmt_cd $ stock mcap_holdings;
cards;
1	fidelity	1	2.5
1	fidelity	2	3.5
1	fidelity	3	1.6
1	fidelity	4	3.1
1	fidelity	5	2.1
1	fidelity	6	2
1	fidelity	7	3.75
2	fidelity	1	1.9
2	fidelity	2	1.8
2	fidelity	3	1.7
2	fidelity	7	1.5
2	fidelity	8	1.6
2	fidelity	10	1.5
3	fidelity	1	6.1
3	fidelity	2	6.2
3	fidelity	3	6.3
3	fidelity	7	5.1
3	fidelity	10	2
3	fidelity	11	3
3	fidelity	12	1
1	dreyfus	30	3.8
1	dreyfus	31	1.56
1	dreyfus	32	64.4
1	dreyfus	33	2
1	dreyfus	34	1.9
1	dreyfus	35	1.8
2	dreyfus	33	5.6
2	dreyfus	34	5.7
2	dreyfus	35	5.1
2	dreyfus	36	5.6
2	dreyfus	37	5.2
3	dreyfus	35	3.9
3	dreyfus	36	9.8
3	dreyfus	37	8.7
3	dreyfus	38	8.4
4	dreyfus	31	3.2
4	dreyfus	32	1.2
4	dreyfus	38	22
4	dreyfus	39	13
;
run;

 

 

 

PROC Star
Posts: 2,329

Re: calculating pairwise overlap?

1.Having column names that contain values (such as FIDELITYxx) is probably a preliminary to disaster.

2.How is 46 derived?

3.Please simplify your example a bit so calculations are more obvious to follow at a glance

 

PROC Star
Posts: 2,329

Re: calculating pairwise overlap?

[ Edited ]

Here is an example that derives the flags and then the sums, to get you started.

data HAVE (index=(A=(MGMT_CD STOCK FUND_ID)));
  input FUND_ID MGMT_CD $ STOCK MCAP_HOLDINGS;
cards;
1 fidelity 1 2.5
1 fidelity 2 3.5
1 fidelity 3 1.6
1 fidelity 4 3.1
1 fidelity 5 2.1
1 fidelity 6 2 0
1 fidelity 7 3.75
2 fidelity 1 1.9 
2 fidelity 2 1.8 
2 fidelity 3 1.7 
2 fidelity 7 1.5 
2 fidelity 8 1.6 
2 fidelity 10 1.5
3 fidelity 1 6.1 
3 fidelity 2 6.2 
3 fidelity 3 6.3 
3 fidelity 7 5.1 
3 fidelity 10 2 
3 fidelity 11 3 
3 fidelity 12 1 
run;
data CROSSINGS(index=(A=(MGMT_CD  FUND_ID STOCK)));
  merge HAVE(where=(FUND_ID=1) in=F1)
        HAVE(where=(FUND_ID=2) in=F2)
        HAVE(where=(FUND_ID=3) in=F3)
        HAVE;
 by MGMT_CD STOCK;

 FUND_12 = F1 * F2 * (FUND_ID=1);
 FUND_13 = F1 * F3 * (FUND_ID=1);
 FUND_21 = F2 * F1 * (FUND_ID=2);
 FUND_23 = F2 * F3 * (FUND_ID=2);
 FUND_31 = F3 * F1 * (FUND_ID=3);
 FUND_32 = F3 * F2 * (FUND_ID=3);

 SUM_12=FUND_12 * MCAP_HOLDINGS;
 SUM_13=FUND_13 * MCAP_HOLDINGS;
 SUM_21=FUND_21 * MCAP_HOLDINGS;
 SUM_23=FUND_23 * MCAP_HOLDINGS;
 SUM_31=FUND_31 * MCAP_HOLDINGS;
 SUM_32=FUND_32 * MCAP_HOLDINGS;
run;
proc summary data=CROSSINGS nway;
  class MGMT_CD   ;
  var SUM: MCAP_HOLDINGS;
  output out=SUMS sum=; 
run;

MGMT_CD _TYPE_ _FREQ_ SUM_12 SUM_13 SUM_21 SUM_23 SUM_31 SUM_32 MCAP_HOLDINGS
fidelity 1 20 11.35 11.35 6.9 8.4 23.7 25.7 58.25

 

 

 

Ask a Question
Discussion stats
  • 2 replies
  • 182 views
  • 0 likes
  • 2 in conversation