Hi,
I think my problem is quite easy for someone who knows PROC IML, so I believe this is the right discussion forum. If someone could suggest a solution, I would be very appreciative. I am familiar with base SAS and its database programming aspects, but I am new to PROC IML and matrix/ vector multiplication.
My problem is this: Given the returns, variances, and weights of n individual assets, I want to combine these assets into a single weighted-average portfolio and then calculate the variance and return of the portfolio. The problem is I have thousands of portfolios to create and each portfolio has a different number of assets, many with only 1 or 2 assets, others with 10 assets. The portfolio variance calculation changes depending on n. Using normal base SAS and linear combinations makes the problem complicated, but using matrix/vector multiplication makes the solution to the problem very easy and flexible since:
E(Rp) = w' * R where E(Rp) = expected return on the portfolio; R is the column vector of asset returns; w is the col vector of weights and w' is the transpose of this vector
Var(Rp) = w' * ( V * V' ) * w where V(Rp) = the variance of the portfolio; V is col vector of asset variances and V' is the transpose.
(NOTE: I am assuming a correlation coeffecient of 1 between the returns on all asset)
My data to construct these vectors comes in the following form: I have three BY variables (dealnum, cusip6, date), I know the number of assets in each portfolio (wt_N), I am given the weight of each asset in the portfolio (wt -- the weights of the portfolio add up to 1), I have variance for the same asset (var10) and the return for this asset (ret10). So referring to the data below, Porfolio 1 is consists of Obs 1, 2; Portfolio 2 is obs 3 only; Portfolio 3 is of obs 4 to 17; Portfolio 4 is of obs 18, 19; Porfolio 5 is of obs 20; and Porfolio 6 is of obs 21, 22, etc (up to n=4000 portfolios). For each of the these portfolios the vectors w, R and V are different, but using matrix/vector multiplication should make this problem REALLY EASY!! :smileygrin:
Could someone show me how this could be done in PROC IML (or otherwise)? I attach a more complete sample of data in case you want it.
obs | dealnum | cusip6 | date | wt_N | wt | var10 | ret10 |
1 | 1673534045 | 459200 | 20/07/2005 | 2 | 0.3942807 | 6.3719E-05 | 0.001942508 |
2 | 1673534045 | 459200 | 20/07/2005 | 2 | 0.6057193 | 0.000168092 | 0.002905392 |
3 | 1673534045 | 459200 | 26/07/2005 | 1 | 1 | 6.3719E-05 | 0.000542947 |
4 | 1674759045 | 345370 | 15/06/2005 | 7 | 0.0176526 | 0.000496162 | -0.01220406 |
5 | 1674759045 | 345370 | 15/06/2005 | 7 | 0.046326 | 0.000384426 | 0.001794011 |
6 | 1674759045 | 345370 | 15/06/2005 | 7 | 0.0830054 | 0.000516729 | -0.03776804 |
7 | 1674759045 | 345370 | 15/06/2005 | 7 | 0.1176475 | 0.000297142 | -0.0457237 |
8 | 1674759045 | 345370 | 15/06/2005 | 7 | 0.1203526 | 0.000277292 | -0.02096441 |
9 | 1674759045 | 345370 | 15/06/2005 | 7 | 0.2062127 | 0.000157504 | -0.01184258 |
10 | 1674759045 | 345370 | 15/06/2005 | 7 | 0.408803 | 7.55415E-05 | -0.01552909 |
11 | 1674759045 | 345370 | 16/06/2005 | 7 | 0.0174636 | 0.000496162 | -0.02186809 |
12 | 1674759045 | 345370 | 16/06/2005 | 7 | 0.0454444 | 0.000384426 | -0.02943547 |
13 | 1674759045 | 345370 | 16/06/2005 | 7 | 0.0855512 | 0.000516729 | 0.018480605 |
14 | 1674759045 | 345370 | 16/06/2005 | 7 | 0.1204331 | 0.000297142 | 0.011245789 |
15 | 1674759045 | 345370 | 16/06/2005 | 7 | 0.1210426 | 0.000277292 | -0.006116 |
16 | 1674759045 | 345370 | 16/06/2005 | 7 | 0.2062787 | 0.000157504 | -0.01120836 |
17 | 1674759045 | 345370 | 16/06/2005 | 7 | 0.4037864 | 7.55415E-05 | -0.02312268 |
18 | 1675595045 | 459200 | 18/05/2005 | 2 | 0.3983457 | 5.96785E-05 | -0.00159973 |
19 | 1675595045 | 459200 | 18/05/2005 | 2 | 0.6016543 | 0.000176524 | -0.00643742 |
20 | 1675595045 | 459200 | 19/05/2005 | 1 | 1 | 5.96785E-05 | -0.00917604 |
21 | 1675595045 | 459200 | 23/05/2005 | 2 | 0.3993792 | 5.96785E-05 | 0.018426452 |
22 | 1675595045 | 459200 | 23/05/2005 | 2 | 0.6006208 | 0.000176524 | 0.006719993 |
Many thanks in advance,
Ian.
Here are a few tips.
1) Read about the UNIQUE-LOC technique, to get the main ideas of how to do something like this The UNIQUE-LOC trick: A real treat! - The DO Loop
2) Because you have three BY-group variables, I recommend implementing the efficiency scheme here: An efficient alternative to the UNIQUE-LOC technique - The DO Loop
3) If the data are too large to fit into memory, use a variation on this approach BY-group processing in SAS/IML - The DO Loop
4) Develop and debug on a small set of sample data. When it is working, scale up to the real data.
Hi Rick,
Thanks very much for your very helpful tips. They helped me focus my learning of PROC IML on my concrete problem, which has now been resolved thanks to the tips you gave. Reducing my three BY variables to just one by pre-sorting the data set before PROC IML helped resolve many problems.
Thanks again! I really appreciate your help.
Regards
Ian.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.