BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ho0421
Fluorite | Level 6

Hi,

 

Thanks to Wicklin's codes on his blog, I have learned how to write a code that can calculate the moving average by group. The following is the code that computes the moving average of horsepower for each "Make" using the SAS built-in dataset "Cars". However, I got in stuck in how to revise this code so I can do the rolling regression by each "Make", for example, regression of "Weight" on "Wheelbase" and "Length", including the intercept (using 10-rolling windows for each "Make"). I know some normal SAS macro that can do the work. But I want to learn how SAS/IML can achieve this.

 

Any help would be highly appreciated.

 


libname sample "E:\Data";

proc iml;

start MA(y, k);
   MA = j(nrow(y), 1, .);
   do i = 1 to nrow(y);
      idx = max(1,(i-k+1)):i;   /* rolling window of data points */
      MA[i] = mean( y[idx] );   /* compute average */
   end;
   return (MA);
finish MA;


use sample.cars;                   /* input data */
read all var {"Make"};              /* read BY var */
byGroup = unique( Make );           /* compute unique levels */
ByVal = BlankStr(nleng(Make));      /* set length of ByVal variable (12.3) */
OutVarNames = {"RollingHorsePower"};
MA = {.};                 /* output variables are numeric */
create RegOut from MA[r=ByVal c=OutVarNames];


setin sample.Cars;                 /* make current for reading */
setout RegOut;         /* make current for writing */

 

InVarNames = {"Horsepower"};        /* input variables */
index Make;                           /* index by BY group variable */
do i = 1 to ncol(byGroup);
   ByVal = byGroup[i];
   read all var InVarNames where(Make=(ByVal));  
      MA = MA(horsepower, 2);
   append from MA[r=ByVal];                
end;
close sample.Cars RegOut;
quit;

1 ACCEPTED SOLUTION

Accepted Solutions
Rick_SAS
SAS Super FREQ

While developing a program, you should always build it in pieces, starting with the simple and then adding complexities after the simple computations are working.  In this case, let's start with computing the rolling regression for a single firm.  You can add the loop over firms later.

 

If you have p explanatory variables and an intercept term, a linear regression requires at least k >= p+2 observations. 

 

Based on the code you posted, I think the following code will give you some direction. If it is what you want, you can add the BY-group loop. 

 

proc iml;
/* Assume no missing values and X`*X is nonsingular.
   Y is k x 1 vector of responses
   X is k x p matrix of explanatory variables 
     Send in X[,1]=1 for intercept models
   Return 1 x p vector of parameter estimates.
*/
start RegEst(Y, X);
   xpxi = ginv(X`*X);       /* generalized inverse of X'X */
   b = xpxi * (X`*Y);       /* parameter estimates */
   return b`;               /* return as row vector */
finish;

/* Assume X and Y as for RegEst.
   k specifies the length of the moving window.
*/
start OLSParamEst(y, x, k);
   p = ncol(x);
   PE = j(nrow(y), p, .);  /* parameter estimates */
   do i = k to nrow(y);
      s = i-k+1;          /* starting obs: 1, 2, ... */
      idx = s:i;          /* rolling window of data points */
      PE[i,] = RegEst( y[idx], X[idx,] );
   end;
   return (PE);
finish;

use sashelp.stocks where(stock="IBM"); /* input data */
read all var "Close" into Y;
read all var {"Open" "Volume"};
close;

/* add intercept term. Measure volumn in millions */
X = j(nrow(Y), 1, 1) || Open || (Volume/1e6); 
PE = OLSParamEst(y, X, 6);  /* k >= 4 */

OutVarNames = {"b_Intercept" "b_Open" "b_Volume"};
create RegOut from PE[c=OutVarNames];
append from PE;
close RegOut;
QUIT;

/* test the program by plotting predictions for data */
data Fit;
merge sashelp.stocks(where=(stock="IBM"))
      RegOut;
Pred = b_Intercept + b_Open*Open + b_Volume*Volume/1e6;
run;

proc sgplot data=Fit;
scatter x=Date y=Close;
series x=Date y=Pred;
run;

View solution in original post

4 REPLIES 4
Rick_SAS
SAS Super FREQ

What information do you want the regression module to return? The parameter estimates? For example, do you want the Intercept and estimates of coefficients for Wheelbase and Length? Or do you want the predicted values at the end of each time window?

ho0421
Fluorite | Level 6
 
********************************************
* IML Rolling Regression By Group.sas     **
* This code calculates Betas by "Cusip"   **
*******************************************;

libname sample "E:\Data";

proc iml;

start OLSParamEst(y,x, k);
  
   Estimate= j(nrow(y),2,.);
   do i = 1 to nrow(y);
      idx = max(1,(i-k+1)):i;   /* rolling window of data points */
      Estimate[i] = ginv(x[idx]`*x[idx])*x[idx]`*y[idx];
   end;
   return (Estimate);
finish;

use sample.threefirms;                    /* input data */
read all var {"Cusip"};              /* read BY var */
byGroup = unique( Cusip );           /* compute unique levels */
ByVal = BlankStr(nleng(Cusip));      /* set sprtrn of ByVal variable (12.3) */
OutVarNames = {"ret" "sprtrn" "shrout"}; 
Estimate = {..};                 /* output variables are numeric */
create RegOut from Estimate[r=ByVal c=OutVarNames];


setin sample.threefirms;                  /* Cusip current for reading */
setout RegOut;    					/* Cusip current for writing */

InVarNames = {"ret" "sprtrn" "shrout"};      /* input variables */
index Cusip;                         /* index by BY group variable */
do i = 1 to ncol(byGroup);
   ByVal = byGroup[i];
   read all var InVarNames where(Cusip=(ByVal));   
       Estimate = OLSParamEst(ret,sprtrn||shrout,36);
   append from Estimate[r=ByVal];                 
end;
close sample.threefirms RegOut;
quit;

Many thanks for your response to my inquiry, Rick. In fact, I am trying to learn how SAS/IML can run rolling window regressions with the similar results offered using PROC REG. I have never used SAS/IML before. After reading your book "Statistical Programming with SAS/IML Software", I found SAS/IML a power tool that is worth spending time/effort to learn.

The code shown above tries to run rolling regression of monthly stock returns on the S&P 500 index returns. It seems to work but needs to revise:

1. The "ByVal" cusip only displays for the first estimate of each firm. How can I revise to code to display for each row of the results?

2. How to display the "Date" of the end of each time window for each row?

3. How to revise the code in order to not show the invalid estimates "."?

 

In sum, I want to report all the results using SAS/IML similar to those offered by PROC REG

 

Thanks.

Rick_SAS
SAS Super FREQ

While developing a program, you should always build it in pieces, starting with the simple and then adding complexities after the simple computations are working.  In this case, let's start with computing the rolling regression for a single firm.  You can add the loop over firms later.

 

If you have p explanatory variables and an intercept term, a linear regression requires at least k >= p+2 observations. 

 

Based on the code you posted, I think the following code will give you some direction. If it is what you want, you can add the BY-group loop. 

 

proc iml;
/* Assume no missing values and X`*X is nonsingular.
   Y is k x 1 vector of responses
   X is k x p matrix of explanatory variables 
     Send in X[,1]=1 for intercept models
   Return 1 x p vector of parameter estimates.
*/
start RegEst(Y, X);
   xpxi = ginv(X`*X);       /* generalized inverse of X'X */
   b = xpxi * (X`*Y);       /* parameter estimates */
   return b`;               /* return as row vector */
finish;

/* Assume X and Y as for RegEst.
   k specifies the length of the moving window.
*/
start OLSParamEst(y, x, k);
   p = ncol(x);
   PE = j(nrow(y), p, .);  /* parameter estimates */
   do i = k to nrow(y);
      s = i-k+1;          /* starting obs: 1, 2, ... */
      idx = s:i;          /* rolling window of data points */
      PE[i,] = RegEst( y[idx], X[idx,] );
   end;
   return (PE);
finish;

use sashelp.stocks where(stock="IBM"); /* input data */
read all var "Close" into Y;
read all var {"Open" "Volume"};
close;

/* add intercept term. Measure volumn in millions */
X = j(nrow(Y), 1, 1) || Open || (Volume/1e6); 
PE = OLSParamEst(y, X, 6);  /* k >= 4 */

OutVarNames = {"b_Intercept" "b_Open" "b_Volume"};
create RegOut from PE[c=OutVarNames];
append from PE;
close RegOut;
QUIT;

/* test the program by plotting predictions for data */
data Fit;
merge sashelp.stocks(where=(stock="IBM"))
      RegOut;
Pred = b_Intercept + b_Open*Open + b_Volume*Volume/1e6;
run;

proc sgplot data=Fit;
scatter x=Date y=Close;
series x=Date y=Pred;
run;

ho0421
Fluorite | Level 6

Hi Rick,

 

I am so grateful for your prompt response to my post. The code you provided is so clear and make much more sense to me in terms of its structure and enhances my understanding of how SAS/IML works. I will try adding the loop by group feature to the code you suggested and see how it goes.

 

By the way, I just noted that the dataset SASHELP.STOCKS needs to be sorted by "Date" for each stock before the proc iml procedure can be implemented so the results obtained are for 'historical' rather than "forward-looking" rolling windows.

 

Thank you very much again.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

Multiple Linear Regression in SAS

Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.

Find more tutorials on the SAS Users YouTube channel.

From The DO Loop
Want more? Visit our blog for more articles like these.
Discussion stats
  • 4 replies
  • 1456 views
  • 2 likes
  • 2 in conversation