(Note: For people who have background knowledge, I am trying to follow Section 6 of Gurkaynak, Sack, and Wright (2007). If you aren't in this field it's fine, you may ignore this note)
I am trying to get bond prices given coupons and a given yield curve that is generated on a daily interval with the maturity going up to 360 months. In other words, I am trying to get present values given cash flows and discount rates. For bonds, [cash flows = coupon payments] and [discount rate = yield curve].
The first dataset that I have is the yield curve data. Below is the sample for two days for only certain maturities. So each day there are 360 different rates (ZBR) constructing the yield curve. These are all annulized rates. Since it is too long to show in the post I attached the sample data.
data yieldcurve;
input date maturity_m ZBR best32.;
format date yymmddn8.;
datalines;
20060228 1 4.016226538
20060228 2 4.0760471
20060228 8 4.324513459
20060228 14 4.33614917
20060228 20 4.294754965
20060228 206 4.623988821
20060228 358 4.483623368
20060228 359 4.482571516
20060228 360 4.481526714
20061130 1 4.020125424
20061130 2 4.081633352
20061130 3 4.140666573
20061130 4 4.194456616
20061130 358 4.482659361
20061130 359 4.48166473
20061130 360 4.480678104
;
run;
The next data I have is the bond data. Two different bonds each showing 5 different dates. Coupon rates are in percentages. But we shall assume those as dollar amounts in this context. Next coupon payment dates are shown in the last column. Again I attached the sample data.
data bond;
input CUSIP $9. DATE MATURITY COUPON nextcoup best32.;
format date maturity nextcoup yymmddn8.;
datalines;
00077TAA2 20060228 20230515 7.75 20060515
00077TAA2 20061130 20230515 7.75 20070515
00077TAA2 20070630 20230515 7.75 20071115
00077TAA2 20070731 20230515 7.75 20071115
00077TAA2 20071031 20230515 7.75 20071115
00077TAB0 20060531 20931015 7.125 20061015
00077TAB0 20060831 20931015 7.125 20061015
00077TAB0 20060930 20931015 7.125 20061015
00077TAB0 20061130 20931015 7.125 20070415
00077TAB0 20070331 20931015 7.125 20070415
;
run;
So each coupon (=cash flow) is half the coupon rate since they are paid twice a year, every 6 months. From the first observation, the cash flow should be 7.75/2 = 3.875, which is going to be paid on 20060515. These are all fixed coupon bonds so the cash flow for each bond is constant every 6 months.
Now let me show what I want to do.
For the first obseravtion, the next payment date is in approximately 2 months (=20060515 - 20060228). So I would find the proper discount rate from the first table, on 20060228 with maturity = 2 months. Hence I want to discount the first cash flow, 3.875, with the discount rate, 4.0760471 by 2 months.
cf_1 = 3.875 / ((1 + 4.0760471/100)^(2/12))
The next would be
cf_2 = 3.875 / ((1 + 4.324513459/100)^(8/12))
cf_3 = 3.875 / ((1 + 4.33614917/100)^(14/12))
...
cf_35 = 3.875 / ((1+ 4.623988821/100)^(206/12))
cf_face = 100 / ((1+ 4.623988821/100)^(206/12))
Note that cf_face is the discounted face value which occurs with the last coupon payment, hence the same discount rate and time(maturity). Then,
price = sum(cf_n)
is what I want for the first observation.
Then I should move on to the next observation at date = 20061130 and repeat the same process.
I think using hash and array should be the best way, but I couldn't figure where to start.
I think loading the first data into hash table and somehow use the number of months as the index to locate the ZBR should be the key.
Any help is appreciated!
First, your date values are useless as referred to in the INPUT statements. They are just being read as eight-digit numbers.
If you input them using a date informat, it will be far easier to count months. So start out with
data yieldcurve;
input date yymmdd8. maturity_m ZBR ;
format date yymmddn8.;
datalines;
20060228 1 4.016226538
20060228 2 4.0760471
20060228 8 4.324513459
20060228 14 4.33614917
20060228 20 4.294754965
20060228 206 4.623988821
20060228 358 4.483623368
20060228 359 4.482571516
20060228 360 4.481526714
20061130 1 4.020125424
20061130 2 4.081633352
20061130 3 4.140666573
20061130 4 4.194456616
20061130 358 4.482659361
20061130 359 4.48166473
20061130 360 4.480678104
run;
and
data bond;
input CUSIP $9. DATE :yymmdd8. MATURITY :yymmdd8. COUPON Nextcoup :yymmdd8.;
format date maturity nextcoup yymmddn8.;
datalines;
00077TAA2 20060228 20230515 7.75 20060515
00077TAA2 20061130 20230515 7.75 20070515
00077TAA2 20070630 20230515 7.75 20071115
00077TAA2 20070731 20230515 7.75 20071115
00077TAA2 20071031 20230515 7.75 20071115
00077TAB0 20060531 20931015 7.125 20061015
00077TAB0 20060831 20931015 7.125 20061015
00077TAB0 20060930 20931015 7.125 20061015
00077TAB0 20061130 20931015 7.125 20070415
00077TAB0 20070331 20931015 7.125 20070415
run;
Now the idea you have to use a hash lookup (in yield_curve) based on date and maturity_m makes sense. Here's a program (untested) that does what I understand you want:
data want (drop=maturity_m zbr _:);
set bond;
if _n_=1 then do;
if 0 then set yieldcurve;
declare hash yc (dataset:'yieldcurve');
yc.definekey('date','maturity_m');
yc.definedata('zbr');
yc.definedone();
end;
array cf {*} cf_1-cf_35 cf_face;
/* Initial conditions */
_cash_flow=coupon/2;
_months_to_first_cash_flow=intck('month',date,nextcoup,'continuous');
do i=1 to dim(cf)-1; /*Do all CF_ except CF_FACE*/
_months_to_cash_flow = (i-1)*6 + _months_to_first_cash_flow;
rc=yc.find(key:nextcoup,key:_months_to_cash_flow);
cf{i} = _cash_flow/((1+zbr/100)**(_months_to_cash_flow/12));
end;
/* Now do CF_fACE */
cf_face = 100/((1+zbr/100)**(_months_to_cash_flow/12));
price=sum(of cf_:);
run;
BUT ... you have to know for certain that every NEXTCOUP in BOND and DATE in BOND will be found as a DATE (for nextcoup), with an acceptable maturity_m (calculated from bond date) in YIELDCURVE.
BTW, probably there is a financial function in SAS that does this, as per @ballardw 's suggestion, but I think a good deal of data reshaping would be needed.
First I will say that I do not understand any of the finance specific terms in any detail.
With that in mind, have you looked in the SAS documentation in the section related to Financial functions?
It is likely that part of what you want is available with one or more of the functions to evaluate things at specific times or rates that may simplify any other coding. And look in the FINANCE function specifically. It does 40+ calculations and a significant number of them use/return discount values.
The way most of the finance functions work is that they have n number of parameters and some will take fairly long lists. Then you set the parameter of the one of interest as missing in the function call and provide the other parameters.
I did look up and gave up on those since none seems to provide a way to input different discount rates.
But thanks anyway!
First, your date values are useless as referred to in the INPUT statements. They are just being read as eight-digit numbers.
If you input them using a date informat, it will be far easier to count months. So start out with
data yieldcurve;
input date yymmdd8. maturity_m ZBR ;
format date yymmddn8.;
datalines;
20060228 1 4.016226538
20060228 2 4.0760471
20060228 8 4.324513459
20060228 14 4.33614917
20060228 20 4.294754965
20060228 206 4.623988821
20060228 358 4.483623368
20060228 359 4.482571516
20060228 360 4.481526714
20061130 1 4.020125424
20061130 2 4.081633352
20061130 3 4.140666573
20061130 4 4.194456616
20061130 358 4.482659361
20061130 359 4.48166473
20061130 360 4.480678104
run;
and
data bond;
input CUSIP $9. DATE :yymmdd8. MATURITY :yymmdd8. COUPON Nextcoup :yymmdd8.;
format date maturity nextcoup yymmddn8.;
datalines;
00077TAA2 20060228 20230515 7.75 20060515
00077TAA2 20061130 20230515 7.75 20070515
00077TAA2 20070630 20230515 7.75 20071115
00077TAA2 20070731 20230515 7.75 20071115
00077TAA2 20071031 20230515 7.75 20071115
00077TAB0 20060531 20931015 7.125 20061015
00077TAB0 20060831 20931015 7.125 20061015
00077TAB0 20060930 20931015 7.125 20061015
00077TAB0 20061130 20931015 7.125 20070415
00077TAB0 20070331 20931015 7.125 20070415
run;
Now the idea you have to use a hash lookup (in yield_curve) based on date and maturity_m makes sense. Here's a program (untested) that does what I understand you want:
data want (drop=maturity_m zbr _:);
set bond;
if _n_=1 then do;
if 0 then set yieldcurve;
declare hash yc (dataset:'yieldcurve');
yc.definekey('date','maturity_m');
yc.definedata('zbr');
yc.definedone();
end;
array cf {*} cf_1-cf_35 cf_face;
/* Initial conditions */
_cash_flow=coupon/2;
_months_to_first_cash_flow=intck('month',date,nextcoup,'continuous');
do i=1 to dim(cf)-1; /*Do all CF_ except CF_FACE*/
_months_to_cash_flow = (i-1)*6 + _months_to_first_cash_flow;
rc=yc.find(key:nextcoup,key:_months_to_cash_flow);
cf{i} = _cash_flow/((1+zbr/100)**(_months_to_cash_flow/12));
end;
/* Now do CF_fACE */
cf_face = 100/((1+zbr/100)**(_months_to_cash_flow/12));
price=sum(of cf_:);
run;
BUT ... you have to know for certain that every NEXTCOUP in BOND and DATE in BOND will be found as a DATE (for nextcoup), with an acceptable maturity_m (calculated from bond date) in YIELDCURVE.
BTW, probably there is a financial function in SAS that does this, as per @ballardw 's suggestion, but I think a good deal of data reshaping would be needed.
This is great!
First, thanks for the comment on informat, I totally overlooked that part last night. You're right.
Next, the hash code is beautiful! I think I only need a couple things to fix.
1. One of the key for lookup should be 'date', which is an easy fix even for me.
rc=yc.find(key:nextcoup,key:_months_to_cash_flow);
to
rc=yc.find(key:date,key:_months_to_cash_flow);
2. Number (or dimension) of 'cf' is not fixed. It varies for every observation. In fact, I have another variable named 'remcoups' telling me the number of remaining coupons (=cash flows). So in the example, for the first observation, that number is 35. But that changes for every observation. Below is the sample code that includes this.
data bond;
input CUSIP $9. DATE :yymmdd8. MATURITY :yymmdd8. COUPON Nextcoup :yymmdd8. remcoups;
format date maturity nextcoup yymmddn8.;
datalines;
00077TAA2 20060228 20230515 7.75 20060515 35
00077TAA2 20061130 20230515 7.75 20070515 33
00077TAA2 20070630 20230515 7.75 20071115 32
00077TAA2 20070731 20230515 7.75 20071115 32
00077TAA2 20071031 20230515 7.75 20071115 32
00077TAB0 20060531 20931015 7.125 20061015 175
00077TAB0 20060831 20931015 7.125 20061015 175
00077TAB0 20060930 20931015 7.125 20061015 175
00077TAB0 20061130 20931015 7.125 20070415 174
00077TAB0 20070331 20931015 7.125 20070415 174
run;
Now the problem is how do I incorporate this so that every observation goes through loops 'remcoups' times.
I think from the do loop using 'remcoups' instead of 'dim(cf)-1' should work. But how do I change the array statement accordingly?
Thanks again, much appreciated!
It seems that SAS array does not allow dynamic dimension. Hence I just chose a number that is slightly greater than the maximum possible 'remcoups' when defining the array and I think it's working great.
Thank you all for the help!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.