Fluorite | Level 6

Hi everyone. I am quite stuck. I inherited a house price index model that is based on the Case-Shiller methodology and uses PROC IML matrix computations to calculate the betas of the model.


The problem is that we have worked through the code but cannot make sense of why the matrix approach was applied.


Is there anything I can use to replace this function? (I'm taking a shot in the dark here.) Also, note that a lambda of 5 is used in the HP filter.




proc iml;

/* Read the data into an IML matrix */
Use price3;
/*price = j(2000000,482,0);*/ /*(number of transactions, number of date points,fill with zeros) = creating the size of the matrix price3 */
read all into price;


/* Create the X and Y matrices.

   Y is a column vector containing, for every transaction, the price in
   the first month of the entire time period (the base time step); its
   size is (number of transactions) by 1.

   X is a matrix containing the prices from the second time period to
   the last time period for all transactions; its size is
   (number of transactions) by (number of time steps - 1). */

X = price[,2:ncol(price)];
Y = Price[,1];

/* To compute beta according to the Case-Shiller method, the first
   price in each row of the X matrix must be changed to a negative
   value while the second price is kept the same. The exception is a
   transaction whose first price falls within the base time period
   (and so sits in the Y matrix): the first and only price in that
   row of the X matrix stays unchanged. */

do i = 1 to nrow(x);
   one = 0;   /* running sum; stays 0 until the first nonzero price in row i */
   if y[i,1] > 0 then
      do j = 1 to ncol(x);
         x[i,j] = x[i,j];   /* first sale is in the base period: keep the row as is */
      end;
   else
      do j = 1 to ncol(x);
         if one = 0
            then x[i,j] = -x[i,j];   /* negate the first sale price */
            else x[i,j] = x[i,j];
         one = x[i,j] + one;
      end;
end;

/* Z is the X matrix with each price replaced by its sign: -1 where X
   holds a negative price, 1 where X holds a positive price, and 0
   everywhere else. */

Z = j(nrow(x),ncol(x),1);
do i = 1 to nrow(x);
   do j = 1 to ncol(x);
      if x[i,j] = 0 then z[i,j] = 0;
      if x[i,j] > 0 then z[i,j] = 1;
      if x[i,j] < 0 then z[i,j] = -1;
   end;
end;

/* Now compute beta with the Case-Shiller method using the matrices
   Y, X, and Z. */

B_inv_est = inv(Z`*X)*(Z`*Y);
B_est = 1/B_inv_est;

/* The following steps compute the weighted beta, where transactions
   further apart are weighted less than transactions close to each
   other. This calculation uses a lot of memory and thus cannot be
   run on my SAS server. */

/*q = Y-(X*B_inv_est);
w = j(nrow(q),nrow(q),0);
do i = 1 to nrow(q);
   do j = 1 to nrow(q);
      if i = j then w[i,j] = w[i,j]+q[i];
      else w[i,j]= 0 ;
   end;
end;
B_inv_weight = inv((X`)*(W**2)*X)*((X`)*(W**2)*Y);
B_weight = 1/B_inv_weight;*/

print B_inv_est B_est /*B_inv_weight B_weight*/;

create B_est var {B_est}; /** create data set with the single variable B_est **/
append; /** write data in vectors **/
close B_est; /** close the data set **/
quit;


data B_est;
   set work.B_EST;
   where B_EST > 0.0000001;
   N = _N_;
run;

proc expand data=B_est out=B_est_HP_T5 method=none;
   id N;
   /* by B_est;*/
   convert B_est = HP_B_est / transformout=(HP_T 5);
run;

Super User

It looks like a regression model. It would be better to post it in the IML forum.

Anyway, calling @Rick_SAS, since it is about IML.

Fluorite | Level 6
Thanks, will do.

To clarify, you want someone to explain what the program does? Is your main goal to understand it or to replace this program by calling a different SAS procedure?

Fluorite | Level 6
I'm trying to find a way of making the analytics behind this matrix easier to interpret. If there is another SAS procedure, I would gladly try it.

@Tzar wrote:
I'm trying to find a way of making the analytics behind this matrix easier to interpret. If there is another SAS procedure, I would gladly try it.

The short answer is that I do not know whether there is an alternative way to perform this analysis in SAS. Maybe or maybe not. It would presumably be part of some time series or financial risk computation, and I am not very knowledgeable about the SAS offerings in those areas. However, people typically use IML when they want a statistical analysis that is NOT available in another procedure.


The analysis reads in a design matrix, X, and then modifies it according to the method that is explained in the comments. The modified design matrix is Z. Whereas the usual OLS estimates are the solution (b) to the normal equations (X`X)b = X`y, this formulation requests a solution to the modified equation (Z`X)b = Z`y. That is, instead of using X` to project y into the column space of X, this analysis uses Z` to project b.
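
To make the two sets of equations concrete, here is a minimal PROC IML sketch on made-up data. The toy price matrix, the names rows, nz, b_ols, b_mod and hpi, and the use of SOLVE and SIGN (instead of INV and the element-by-element loops) are assumptions for illustration and are not taken from the original program.

```sas
proc iml;
/* Hypothetical toy data: 4 properties, 3 time periods.            */
/* Column 1 is the base period; 0 means no sale in that period.    */
price = {100 110   0,
         100   0 125,
           0 105 120,
         200 215   0};

Y = price[,1];
X = price[,2:ncol(price)];

/* For rows whose first sale is NOT in the base period (Y = 0),    */
/* negate the first nonzero price, as the original comments say.   */
rows = loc(Y = 0);
do k = 1 to ncol(rows);
   i  = rows[k];
   nz = loc(X[i,] ^= 0);
   if ncol(nz) > 0 then X[i, nz[1]] = -X[i, nz[1]];
end;

Z = sign(X);                 /* -1, 0 or 1, following the sign of X */

b_ols = solve(X`*X, X`*Y);   /* usual normal equations (X`X)b = X`y */
b_mod = solve(Z`*X, Z`*Y);   /* modified equations     (Z`X)b = Z`y */
hpi   = 1 / b_mod;           /* reciprocal, as B_est = 1/B_inv_est  */

print b_ols b_mod hpi;
quit;
```

SOLVE(A, b) returns the same solution as INV(A)*b without forming the inverse explicitly, which is generally the more efficient and numerically safer choice.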


If you can find a procedure that performs that computation, then the remainder of the program smooths the regression coefficients by using PROC EXPAND to apply a Hodrick-Prescott Filter trend component with 5 as the filter parameter.  
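
For readers unfamiliar with the filter: under the textbook definition, the Hodrick-Prescott trend tau minimizes sum((y - tau)##2) + lambda * sum((D*tau)##2), where D takes second differences, and the minimizer is the solution of (I + lambda*D`D) tau = y. The sketch below applies that closed form with lambda = 5 to a short, made-up coefficient series. The data and variable names are hypothetical, and it has not been checked that the result matches the HP_T output of PROC EXPAND digit for digit.

```sas
proc iml;
/* Hypothetical series of index coefficients, one value per period */
y = {1.00, 1.08, 1.05, 1.20, 1.24, 1.31, 1.28, 1.40};
n = nrow(y);
lambda = 5;                  /* the filter parameter used in the thread */

/* D is the (n-2) x n second-difference operator:                  */
/* row t of D*tau equals tau[t] - 2*tau[t+1] + tau[t+2]            */
D = j(n-2, n, 0);
do t = 1 to n-2;
   D[t, t]   =  1;
   D[t, t+1] = -2;
   D[t, t+2] =  1;
end;

/* Closed-form HP trend: solve (I + lambda*D`D) trend = y          */
trend = solve(I(n) + lambda * D`*D, y);
cycle = y - trend;           /* what the filter removes            */

print y trend cycle;
quit;
```

A small lambda such as 5 penalizes curvature only lightly, so the trend stays close to the raw coefficients; larger values give a smoother trend.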



Barite | Level 11

If you wish to keep the IML code, then the commented-out part that 'uses a lot of space' could be made much more efficient. There is no need to keep the weights in a diagonal matrix; the code could be rewritten to use a vector of weights instead. This would also save a lot of unnecessary multiplication.
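
A rough sketch of that idea follows, assuming the same algebra as the commented-out block and using small, made-up inputs. The names w2, XtWX and XtWY are hypothetical. The sketch only reproduces the original formula with a vector in place of the diagonal matrix; it does not assess whether squared residuals are the right weights.

```sas
proc iml;
/* Hypothetical toy inputs; in the real program X, Y and           */
/* B_inv_est already exist at this point.                          */
X = { 110   0,
        0 125,
     -105 120,
      215   0};
Y = {100, 100, 0, 200};
Z = sign(X);
B_inv_est = solve(Z`*X, Z`*Y);

q  = Y - X*B_inv_est;        /* residuals, as in the original code  */
w2 = q##2;                   /* diagonal of W**2 as an n x 1 vector */

/* (w2 # X) multiplies row i of X by w2[i], which is the same as   */
/* (W**2)*X but never builds the n x n matrix W.                   */
XtWX = X` * (w2 # X);
XtWY = X` * (w2 # Y);

B_inv_weight = solve(XtWX, XtWY);
B_weight     = 1 / B_inv_weight;

print B_inv_weight B_weight;
quit;
```

With about 2,000,000 transactions, the diagonal matrix W would hold 4 x 10^12 entries, whereas the vector w2 holds only 2,000,000, which is why the original version runs out of memory.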

