Not applicable
Posts: 1

# How to loop regressions?

Dear all,

I hope you would not mind my beginner question and could help me out of this problem or I will have to manually loop it myself. T_T

My sitution is that I have 10 independent (e.g. A,B,C,D,E,F,G,H,I,J, I will call them "AtoJ") variables and 1 dependent variable, my task is to run regressions of different combination of independent variables.

The combinations are, taking away one independent variable at a time, and run 10 regression, so the first one will be "AtoJ-A", then "AtoJ-B", "AtoJ-C" and so on.

Then I will have to take two indepentdent variable at a time, so "AtoJ-AB", "AtoJ-AC" and "AtoJ-AD" and so on. (How many regressions do I have to run for that?)

I have two questions,

Firstly, how to do that as in what command I should be using?

Secondly, I want to be able to see the result of the regressions, not only the estimators, but also t-stat, F-stat(anova), test of normailty and VIF. Is that possible?

Thanks in advance for your kind support, I hope I have expressed my situation clearly.

Bryan

Posts: 5,622

## Re: How to loop regressions?

You could write all the models to a temporary file and then read in the code into proc reg. All the stats will be gathered in the outest= dataset

filename regCode temp;

data _null_;

length list \$200;

set myData;

array vars a--j;

file regCode;

do ii = 1 to dim(vars);

list = cats("M_",ii,":model myDep =");

do kk = 1 to dim(vars);

if kk ne ii then list = catx(" ", list, vname(vars{kk}));

end;

list = cats(list, ";");

put list;

do jj = ii+1 to dim(vars);

list = cats("M_",ii,"_",jj,":model myDep =");

do kk = 1 to dim(vars);

if kk ne ii and kk ne jj then list = catx(" ", list, vname(vars{kk}));

end;

list = cats(list, ";");

put list;

end;

end;

stop;

run;

proc reg data=myData plots=none tableout outest=myStats;

var a--j;

%include regCode;

run;

quit;

(not fully tested)

PG

PG
Contributor
Posts: 27

## Re: How to loop regressions?

Dear all,

I hope you would not mind my beginner question and could help me out of this problem. It is very similar to the issue experienced by the original poster in this thread.

I need to create a loop that will run multivariate regressions on a table containing daily returns data for 90 variables (30 dependent and 60 independent variables) using PROC REG. All of the variables are tickers for various equity stocks.

So far, I have created two tables.

1. Table  A - This contains the input data for the PROC REG. See data in Table A below.

2. Table B - This contains the exact pairings to be used in the model statement. The model statement will be dependent = independent1 independent2 .See data contained in Table B below.

I am a beginner in SAS. Does anyone have a macro that they could share that will loop through all of the regressions by taking the the MODEL pairings in table B and performing the regressions on table A.

Table A

Date _NAME_ 8306 JP Equity 8308 JP Equity 8309 JP Equity
2005-01-31 Returns
2005-02-28 Returns -0.023822981 0.014117882 0.009082715
2005-03-31 Returns -0.025479085 0.004662013 -0.036836577
2005-04-29 Returns -0.021739987 -0.082370997 -0.022771382
2005-05-31 Returns -0.005509656 0.049271049 0.023709029
2005-06-30 Returns 0.039008196 -0.004819286 0.06617729
2005-07-29 Returns 0.003183027 -0.049515065 -0.001755927
2005-08-31 Returns 0.179846746 0.180628076 0.081798634
2005-09-30 Returns 0.276558487 0.216340804 0.241277724
2005-10-31 Returns -0.027212564 0.133957923 -0.120171382

Table B

dependent independent1 independent2
8306 JP Equity 8306 JP Equity 8306 JP Equity

All input is truly appreciated.  I hope I have explained the scenario clearly. If I can add any additional information or clarify any of the above, please do not hesitate to ask me.

Thank You,

John

Posts: 5,622

## Re: How to loop regressions?

This is actually simpler:

``````data A;
input Date :yymmdd10. _NAME_ \$ JP_Equity HK_Equity SA_Equity IE_Equity
GB_Equity HU_Equity;
datalines;
2005-02-28 Returns -0.023822981 0.014117882 0.009082715 -0.023822981 0.014117882 0.009082715
2005-03-31 Returns -0.025479085 0.004662013 -0.036836577 -0.023822981 0.014117882 0.009082715
2005-04-29 Returns -0.021739987 -0.082370997 -0.022771382 -0.023822981 0.014117882 0.009082715
2005-05-31 Returns -0.005509656 0.049271049 0.023709029 -0.023822981 0.014117882 0.009082715
2005-06-30 Returns 0.039008196 -0.004819286 0.06617729 -0.023822981 0.014117882 0.009082715
2005-07-29 Returns 0.003183027 -0.049515065 -0.001755927 -0.023822981 0.014117882 0.009082715
2005-08-31 Returns 0.179846746 0.180628076 0.081798634 -0.023822981 0.014117882 0.009082715
2005-09-30 Returns 0.276558487 0.216340804 0.241277724 -0.023822981 0.014117882 0.009082715
2005-10-31 Returns -0.027212564 0.133957923 -0.120171382 -0.023822981 0.014117882 0.009082715
;

data B;
input (Dependent Independent1 Independent2) (:\$32.);
datalines;
JP_Equity  HK_Equity SA_Equity
IE_Equity  GB_Equity HU_Equity
HK_Equity  SA_Equity IE_Equity
;

filename regMods temp;

data _null_;
set b;
line = catx(" ", cats("Mod_",_n_,": model"), Dependent, "=", Independent1, independent2, ";");
file regMods;
put line;
run;

proc reg data=A plots=none;
%include regMods;
ods output parameterestimates=pe;
run;quit;

proc print data=pe; run;``````
PG
Contributor
Posts: 27

## Re: How to loop regressions?

Thank You PG,

This is great detail.

Had spend quite a bit of time trying to figure this out. The explanation shown is clear and very helpful.

Thank you so much for your support!

Contributor
Posts: 27

## Re: How to loop regressions?

[ Edited ]

Hi PG,

I have one follow up question on the code that you might be able to help me with

I have hit a small problem with the line statement (original statement provided shown below). Thus far, I have made some modifications to this line to ensure that it works for Equity tickers which are separated by spaces (modified statement shown in green below). The changes I've made work well there are 2 independent variables but I run into issues when the second independent variable is missing (See example table below).

The error message will show as Cannot find variable " ".

I am trying to find a way to include quotation marks and "'n" when the second independent variable (independent2) is present but not include them when the second independent variable is missing (as shown in the example table below). Try as I might, I can't seem to find a solution. I have seen suggestions that CATX can be used to trim the variable name but I can't seem to get it working.

Is anyone aware of a solution or function that could be used to resolve this issue (i.e. Only include quotation marks when a variable is present and do not include quotation marks when the variable is missing)?

``````filename regMods temp;

data _null_;
set b;
line = catx(" ", cats("Mod_",_n_,": model"), Dependent, "=", Independent1, independent2, ";");***Modified Statement*** line = catx(" ", cats("Mod_",_n_,": model"),cats("'",Dependent,"'n"),"=",cats("'",Independent1,"'n"),cats("'",Independent2,"'n"), "/stb;");
file regMods;
put line;
run;
``````

 dependent independent1 independent2 AUST_5Y C9085Y Index BELG_5Y C9005Y Index DBR_5Y C9105Y Index DENK_5Y C2675Y Index FINL_5Y C9195Y Index FRTR_5Y C9275Y Index IRELND_5Y C9185Y Index ITALY_5Y C9055Y Index JAPAN_5Y C1055Y Index NETHRS_5Y C9205Y Index POLAND_5Y C1195Y Index SLOVAK_5Y C4865Y Index SPAIN_5Y C9025Y Index SWED_5Y C2595Y Index UKIN_5Y C1105Y Index USGB_5Y C0825Y Index

The code that you provided has been very helpful and all advice is truly appreciated

Thank You

M

Posts: 5,622

## Re: How to loop regressions?

SAS offers plenty of tools to handle such simple tasks - the missing() and quote() functions, for example:

``````
filename regMods temp;

data _null_;
set b;
if not missing(independent1) then independent1 = cats(quote(trim(independent1)),"n");
if not missing(independent2) then independent2 = cats(quote(trim(independent2)),"n");
line = catx(" ", cats("Mod_",_n_,": model"), Dependent, "=", independent1, independent2, ";");
file regMods;
put line;
run;``````
PG
Contributor
Posts: 27

## Re: How to loop regressions?

Wow! This is really helpful PG. You made that look very easy. :-)

Thank you so much for taking time to provide this. It is truly appreciated.

M

Contributor
Posts: 27

## Re: How to loop regressions?

[ Edited ]

Hi PG,

The advice that you have provided has been extremely helpful. I have one final ask that you may be able to help me with to complete the project.

I am struggling to create the loop that will go take each of the MODEL statements (Created using the code shown in "CODE USED TO CREATE TABLE 2" below) and bring them in as the MODEL statement in the PROC REG one at a time.

I need a loop that will go through each of the line statements (found in Table 2) and include them as the MODEL statement in the PROC Reg. Thus far, I have created the two tables required for the loop (both shown below). The Proc Reg procedure is also shown below.

Table 1: This table contains the raw returns data on which the regressions will be performed.

Table 2: This contains the MODEL statements that will be read into the PROC REG procedure (snippet of table shown below - the number of observations in the table will change over time). The final model statement is contained in the "line" column.

Table 1

Date USGB_5Y WBC AU Equity WFC US Equity
2005-01-07
2005-01-14 0 -0.023406186 -0.024093628
2005-01-21 0 -0.000529353 -0.009436375
2005-01-28 0 0.006302242 0.006631324
2005-02-04 0 0.002091952 0.007572053
2005-02-11 -0.02740243 0.038412188 -0.011213838
2005-02-18 0 -0.033723848 -0.008159233
2005-02-25 0 -0.014661668 0.001503634
2005-03-04 0 0.028596229 0.00996686
2005-03-11 0.027813171 -0.0186229 -0.011637704
2005-03-18 0 0.017596769 -0.007048192
2005-03-25 0 -0.030741188 -0.016812827
2005-04-01 0 0.006332156 0.017654507
2005-04-08 0 0.013056771 0.002185058
2005-04-15 0 -0.022564768 -0.007245798
2005-04-22 -0.014439093 -0.007809887
2005-04-29 0.046300697 0.021415539

***CODE USED TO CREATE TABLE 2****

data _null_;
set MODEL_LIST;
if not missing(dependent) then dependent = cats(quote(trim(dependent)),"n");
if not missing(independent1) then independent1 = cats(quote(trim(independent1)),"n");
if not missing(independent2) then independent2 = cats(quote(trim(independent2)),"n");
line = catx(" ", cats("Mod_",_n_,": model"), Dependent, "=", independent1, independent2, "/stb;");
file regMods;
put line;
run;

Table 2

line
Mod_1: model "AUST_5Y"n = "MSDUAT Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_2: model "BELG_5Y"n = "MSDUBE Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_3: model "DBR_5Y"n = "MSDUGR Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_4: model "DENK_5Y"n = "MSDUDE Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_5: model "FINL_5Y"n = "MSDUFI Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_6: model "FRTR_5Y"n = "MSDUFR Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_7: model "IRELND_5Y"n = "MSDUIE Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_8: model "ITALY_5Y"n = "MSDUIT Index"n "ITRXEBE CBIN Curncy"n /stb;

/* PROC REG STATEMENT USED TO PERFORM REGRESSIONS*/

ODS Graphics on;
ods select all;
proc reg data=EOW_DATA;

%include regMods;
ods output parameterestimates=pe;
ods output FitStatistics=fs;
print;
run;quit;

Posts: 5,622

## Re: How to loop regressions?

You've got everything that you need, except the statement

filename regMods temp;

before the data _null_ step.

The data _null_ step creates the required model statements in a text file called regMods (not a table). These are then added to the proc reg step with the %include statement. Datasets pe and fs will be created by the reg procedure, containing the regression analyses results.

PG
Contributor
Posts: 27

## Re: How to loop regressions?

[ Edited ]

Hi PG,

Thank you for explaining this. It is very helpful.

The issue I face is that there is a lot of missing values in the returns table that I am using to perform the regressions on.

This leads to issues with the regressions as  I can see from researching online that "If any variable needed for any regression is missing, the observation is excluded from all estimates" (Full link below). The result is that there are many observation dates being excluded from the regressions.

https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_reg_sect026...

Am I right in thinking that the code that I have is reading all of the Model statements from the file called regMods and feeding them into the PROC REG step (with the %include statement) together?

My aim is to find a solution that will feed each of the model statements (~70 in total) into the PROC REG procedure one at time, rather than as a group. This should overcome the issue that I have with the missing data.

Do you think it is possible to create a loop that will achieve this?

Thank you for taking time to read this.

M

Posts: 5,622

## Re: How to loop regressions?

OK, Here is another way to do multiple regressions with the full set of available observations for each model, using call execute:

``````data class;
set sashelp.class;
/* Set some weights as missing */
if mod(age,2) then call missing(weight);
run;

data model_list;
length dep ind1 ind2 \$32;
input dep ind1 ind2;
datalines;
height age .
height age weight
;

%macro reg(i, dep, ind);
proc reg data=class plots=none;
ods output parameterestimates=pe_&i.;
ods output FitStatistics=fs_&i.;
mod&i.: model &dep. = &ind.;
run;
%mend reg;

data _null_;
length line \$132;
set model_list;
if not missing(dep) then dep = cats(quote(trim(dep)),"n");
if not missing(ind1) then ind1 = cats(quote(trim(ind1)),"n");
if not missing(ind2) then ind2 = cats(quote(trim(ind2)),"n");
line = catx(" ", '%reg(', _n_, ",", dep, ",", ind1, ind2, ");");
call execute(line);
run;

data pe;
set pe_: ;
run;

data fs;
set fs_: ;
run;``````
PG
Contributor
Posts: 27

## Re: How to loop regressions?

Many thanks for taking time to look at this PG - It truly is appreciated.

I have made some modifications to the code this morning and it is working a treat.

Thank You for all of your input on this thread PG. I hope the code you have provided will be of assistance to many others who are attempting similar projects.

Best,

M

Posts: 5,622