SAS Programming

MadQuant · Posted 05-07-2023 02:10 AM

(Note: For people who have background knowledge, I am trying to follow Section 6 of Gurkaynak, Sack, and Wright (2007). If you aren't in this field it's fine, you may ignore this note)

I am trying to get bond prices given coupons and a given yield curve that is generated on a daily interval with the maturity going up to 360 months. In other words, I am trying to get present values given cash flows and discount rates. For bonds, [cash flows = coupon payments] and [discount rate = yield curve].

The first dataset that I have is the yield curve data. Below is the sample for two days for only certain maturities. So each day there are 360 different rates (ZBR) constructing the yield curve. These are all annulized rates. Since it is too long to show in the post I attached the sample data.

data yieldcurve;
	input date maturity_m ZBR best32.;
	format date yymmddn8.;
	datalines;
20060228	1	4.016226538
20060228	2	4.0760471
20060228	8	4.324513459
20060228	14	4.33614917
20060228	20	4.294754965
20060228	206	4.623988821
20060228	358	4.483623368
20060228	359	4.482571516
20060228	360	4.481526714
20061130	1	4.020125424
20061130	2	4.081633352
20061130	3	4.140666573
20061130	4	4.194456616
20061130	358	4.482659361
20061130	359	4.48166473
20061130	360	4.480678104
;
run;

The next data I have is the bond data. Two different bonds each showing 5 different dates. Coupon rates are in percentages. But we shall assume those as dollar amounts in this context. Next coupon payment dates are shown in the last column. Again I attached the sample data.

data bond;
	input CUSIP $9. DATE MATURITY COUPON nextcoup best32.;
	format date maturity nextcoup yymmddn8.;
	datalines;
00077TAA2	20060228	20230515	7.75	20060515
00077TAA2	20061130	20230515	7.75	20070515
00077TAA2	20070630	20230515	7.75	20071115
00077TAA2	20070731	20230515	7.75	20071115
00077TAA2	20071031	20230515	7.75	20071115
00077TAB0	20060531	20931015	7.125	20061015
00077TAB0	20060831	20931015	7.125	20061015
00077TAB0	20060930	20931015	7.125	20061015
00077TAB0	20061130	20931015	7.125	20070415
00077TAB0	20070331	20931015	7.125	20070415
;
run;

So each coupon (=cash flow) is half the coupon rate since they are paid twice a year, every 6 months. From the first observation, the cash flow should be 7.75/2 = 3.875, which is going to be paid on 20060515. These are all fixed coupon bonds so the cash flow for each bond is constant every 6 months.

Now let me show what I want to do.
For the first obseravtion, the next payment date is in approximately 2 months (=20060515 - 20060228). So I would find the proper discount rate from the first table, on 20060228 with maturity = 2 months. Hence I want to discount the first cash flow, 3.875, with the discount rate, 4.0760471 by 2 months.

cf_1 = 3.875 / ((1 + 4.0760471/100)^(2/12))

The next would be

cf_2 = 3.875 / ((1 + 4.324513459/100)^(8/12))
cf_3 = 3.875 / ((1 + 4.33614917/100)^(14/12))
...
cf_35 = 3.875 / ((1+ 4.623988821/100)^(206/12))
cf_face = 100 / ((1+ 4.623988821/100)^(206/12))

Note that cf_face is the discounted face value which occurs with the last coupon payment, hence the same discount rate and time(maturity). Then,

price = sum(cf_n)

is what I want for the first observation.

Then I should move on to the next observation at date = 20061130 and repeat the same process.

I think using hash and array should be the best way, but I couldn't figure where to start.
I think loading the first data into hash table and somehow use the number of months as the index to locate the ZBR should be the key.

Any help is appreciated!

mkeintz · Posted 05-07-2023 01:53 PM

First, your date values are useless as referred to in the INPUT statements. They are just being read as eight-digit numbers.

If you input them using a date informat, it will be far easier to count months. So start out with

data yieldcurve;
  input date yymmdd8. maturity_m ZBR ;
  format date yymmddn8.;
datalines;
20060228          1     4.016226538
20060228          2     4.0760471
20060228          8     4.324513459
20060228         14     4.33614917
20060228         20     4.294754965
20060228        206     4.623988821
20060228        358     4.483623368
20060228        359     4.482571516
20060228        360     4.481526714
20061130          1     4.020125424
20061130          2     4.081633352
20061130          3     4.140666573
20061130          4     4.194456616
20061130        358     4.482659361
20061130        359     4.48166473
20061130        360     4.480678104
run;

and

data bond;
  input CUSIP $9. DATE :yymmdd8. MATURITY :yymmdd8. COUPON Nextcoup :yymmdd8.;
  format date maturity nextcoup yymmddn8.;
datalines;
00077TAA2       20060228        20230515        7.75    20060515
00077TAA2       20061130        20230515        7.75    20070515
00077TAA2       20070630        20230515        7.75    20071115
00077TAA2       20070731        20230515        7.75    20071115
00077TAA2       20071031        20230515        7.75    20071115
00077TAB0       20060531        20931015        7.125   20061015
00077TAB0       20060831        20931015        7.125   20061015
00077TAB0       20060930        20931015        7.125   20061015
00077TAB0       20061130        20931015        7.125   20070415
00077TAB0       20070331        20931015        7.125   20070415
run;

Now the idea you have to use a hash lookup (in yield_curve) based on date and maturity_m makes sense. Here's a program (untested) that does what I understand you want:


data want (drop=maturity_m zbr _:);
  set bond;
  if _n_=1 then do;
    if 0 then set yieldcurve;
    declare hash yc (dataset:'yieldcurve');
      yc.definekey('date','maturity_m');
      yc.definedata('zbr');
      yc.definedone();
  end;

  array cf {*} cf_1-cf_35 cf_face;

  /* Initial conditions */
  _cash_flow=coupon/2;    
  _months_to_first_cash_flow=intck('month',date,nextcoup,'continuous');

  do i=1 to dim(cf)-1;  /*Do all CF_ except CF_FACE*/
    _months_to_cash_flow = (i-1)*6 + _months_to_first_cash_flow;
    rc=yc.find(key:nextcoup,key:_months_to_cash_flow);
    cf{i} = _cash_flow/((1+zbr/100)**(_months_to_cash_flow/12));
  end;

  /* Now do CF_fACE */
  cf_face = 100/((1+zbr/100)**(_months_to_cash_flow/12));
  price=sum(of cf_:);

run;

BUT ... you have to know for certain that every NEXTCOUP in BOND and DATE in BOND will be found as a DATE (for nextcoup), with an acceptable maturity_m (calculated from bond date) in YIELDCURVE.

BTW, probably there is a financial function in SAS that does this, as per @ballardw 's suggestion, but I think a good deal of data reshaping would be needed.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

ballardw · Posted 05-07-2023 12:51 PM

First I will say that I do not understand any of the finance specific terms in any detail.

With that in mind, have you looked in the SAS documentation in the section related to Financial functions?

It is likely that part of what you want is available with one or more of the functions to evaluate things at specific times or rates that may simplify any other coding. And look in the FINANCE function specifically. It does 40+ calculations and a significant number of them use/return discount values.

The way most of the finance functions work is that they have n number of parameters and some will take fairly long lists. Then you set the parameter of the one of interest as missing in the function call and provide the other parameters.

MadQuant · Posted 05-07-2023 04:11 PM

I did look up and gave up on those since none seems to provide a way to input different discount rates.

But thanks anyway!

mkeintz · Posted 05-07-2023 01:53 PM

First, your date values are useless as referred to in the INPUT statements. They are just being read as eight-digit numbers.

If you input them using a date informat, it will be far easier to count months. So start out with

data yieldcurve;
  input date yymmdd8. maturity_m ZBR ;
  format date yymmddn8.;
datalines;
20060228          1     4.016226538
20060228          2     4.0760471
20060228          8     4.324513459
20060228         14     4.33614917
20060228         20     4.294754965
20060228        206     4.623988821
20060228        358     4.483623368
20060228        359     4.482571516
20060228        360     4.481526714
20061130          1     4.020125424
20061130          2     4.081633352
20061130          3     4.140666573
20061130          4     4.194456616
20061130        358     4.482659361
20061130        359     4.48166473
20061130        360     4.480678104
run;

and

data bond;
  input CUSIP $9. DATE :yymmdd8. MATURITY :yymmdd8. COUPON Nextcoup :yymmdd8.;
  format date maturity nextcoup yymmddn8.;
datalines;
00077TAA2       20060228        20230515        7.75    20060515
00077TAA2       20061130        20230515        7.75    20070515
00077TAA2       20070630        20230515        7.75    20071115
00077TAA2       20070731        20230515        7.75    20071115
00077TAA2       20071031        20230515        7.75    20071115
00077TAB0       20060531        20931015        7.125   20061015
00077TAB0       20060831        20931015        7.125   20061015
00077TAB0       20060930        20931015        7.125   20061015
00077TAB0       20061130        20931015        7.125   20070415
00077TAB0       20070331        20931015        7.125   20070415
run;

Now the idea you have to use a hash lookup (in yield_curve) based on date and maturity_m makes sense. Here's a program (untested) that does what I understand you want:


data want (drop=maturity_m zbr _:);
  set bond;
  if _n_=1 then do;
    if 0 then set yieldcurve;
    declare hash yc (dataset:'yieldcurve');
      yc.definekey('date','maturity_m');
      yc.definedata('zbr');
      yc.definedone();
  end;

  array cf {*} cf_1-cf_35 cf_face;

  /* Initial conditions */
  _cash_flow=coupon/2;    
  _months_to_first_cash_flow=intck('month',date,nextcoup,'continuous');

  do i=1 to dim(cf)-1;  /*Do all CF_ except CF_FACE*/
    _months_to_cash_flow = (i-1)*6 + _months_to_first_cash_flow;
    rc=yc.find(key:nextcoup,key:_months_to_cash_flow);
    cf{i} = _cash_flow/((1+zbr/100)**(_months_to_cash_flow/12));
  end;

  /* Now do CF_fACE */
  cf_face = 100/((1+zbr/100)**(_months_to_cash_flow/12));
  price=sum(of cf_:);

run;

BUT ... you have to know for certain that every NEXTCOUP in BOND and DATE in BOND will be found as a DATE (for nextcoup), with an acceptable maturity_m (calculated from bond date) in YIELDCURVE.

BTW, probably there is a financial function in SAS that does this, as per @ballardw 's suggestion, but I think a good deal of data reshaping would be needed.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

MadQuant · Posted 05-07-2023 06:47 PM

This is great!

First, thanks for the comment on informat, I totally overlooked that part last night. You're right.

Next, the hash code is beautiful! I think I only need a couple things to fix.

1. One of the key for lookup should be 'date', which is an easy fix even for me.

rc=yc.find(key:nextcoup,key:_months_to_cash_flow);

to

rc=yc.find(key:date,key:_months_to_cash_flow);

2. Number (or dimension) of 'cf' is not fixed. It varies for every observation. In fact, I have another variable named 'remcoups' telling me the number of remaining coupons (=cash flows). So in the example, for the first observation, that number is 35. But that changes for every observation. Below is the sample code that includes this.

data bond;
  input CUSIP $9. DATE :yymmdd8. MATURITY :yymmdd8. COUPON Nextcoup :yymmdd8. remcoups;
  format date maturity nextcoup yymmddn8.;
datalines;
00077TAA2       20060228        20230515        7.75    20060515	35
00077TAA2       20061130        20230515        7.75    20070515	33
00077TAA2       20070630        20230515        7.75    20071115	32
00077TAA2       20070731        20230515        7.75    20071115	32
00077TAA2       20071031        20230515        7.75    20071115	32
00077TAB0       20060531        20931015        7.125   20061015	175
00077TAB0       20060831        20931015        7.125   20061015	175
00077TAB0       20060930        20931015        7.125   20061015	175
00077TAB0       20061130        20931015        7.125   20070415	174
00077TAB0       20070331        20931015        7.125   20070415	174
run;

Now the problem is how do I incorporate this so that every observation goes through loops 'remcoups' times.

I think from the do loop using 'remcoups' instead of 'dim(cf)-1' should work. But how do I change the array statement accordingly?

Thanks again, much appreciated!

MadQuant · Posted 05-08-2023 01:39 PM

It seems that SAS array does not allow dynamic dimension. Hence I just chose a number that is slightly greater than the maximum possible 'remcoups' when defining the array and I think it's working great.

Thank you all for the help!

SAS Programming

Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Follow Us

What is...

SAS Programming

Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Re: Using yield curve to get bond prices (i.e., discount cash flow with time varying discount rates)

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...