BookmarkSubscribeRSS Feed
bryan0830
Calcite | Level 5

Dear all,

I hope you would not mind my beginner question and could help me out of this problem or I will have to manually loop it myself. T_T

My sitution is that I have 10 independent (e.g. A,B,C,D,E,F,G,H,I,J, I will call them "AtoJ") variables and 1 dependent variable, my task is to run regressions of different combination of independent variables.

The combinations are, taking away one independent variable at a time, and run 10 regression, so the first one will be "AtoJ-A", then "AtoJ-B", "AtoJ-C" and so on.

Then I will have to take two indepentdent variable at a time, so "AtoJ-AB", "AtoJ-AC" and "AtoJ-AD" and so on. (How many regressions do I have to run for that?)

I have two questions,

Firstly, how to do that as in what command I should be using?

Secondly, I want to be able to see the result of the regressions, not only the estimators, but also t-stat, F-stat(anova), test of normailty and VIF. Is that possible?

Thanks in advance for your kind support, I hope I have expressed my situation clearly.

Bryan

13 REPLIES 13
PGStats
Opal | Level 21

You could write all the models to a temporary file and then read in the code into proc reg. All the stats will be gathered in the outest= dataset

filename regCode temp;

data _null_;

length list $200;

set myData;

array vars a--j;

file regCode;

do ii = 1 to dim(vars);

    list = cats("M_",ii,":model myDep =");

    do kk = 1 to dim(vars);

        if kk ne ii then list = catx(" ", list, vname(vars{kk}));

        end;

    list = cats(list, ";");

    put list;

    do jj = ii+1 to dim(vars);

        list = cats("M_",ii,"_",jj,":model myDep =");

        do kk = 1 to dim(vars);

            if kk ne ii and kk ne jj then list = catx(" ", list, vname(vars{kk}));

            end;

        list = cats(list, ";");

        put list;

        end;

    end;

stop;

run;

proc reg data=myData plots=none tableout outest=myStats;

var a--j;

%include regCode;

run;

quit;

(not fully tested)

PG

PG
MACRO_LOOP
Obsidian | Level 7

Dear all,

 

I hope you would not mind my beginner question and could help me out of this problem. It is very similar to the issue experienced by the original poster in this thread.

 

I need to create a loop that will run multivariate regressions on a table containing daily returns data for 90 variables (30 dependent and 60 independent variables) using PROC REG. All of the variables are tickers for various equity stocks.

 

So far, I have created two tables.

1. Table  A - This contains the input data for the PROC REG. See data in Table A below.

2. Table B - This contains the exact pairings to be used in the model statement. The model statement will be dependent = independent1 independent2 .See data contained in Table B below.

 

I am a beginner in SAS. Does anyone have a macro that they could share that will loop through all of the regressions by taking the the MODEL pairings in table B and performing the regressions on table A.

 

Table A

Date _NAME_ 8306 JP Equity 8308 JP Equity 8309 JP Equity
2005-01-31 Returns
2005-02-28 Returns -0.023822981 0.014117882 0.009082715
2005-03-31 Returns -0.025479085 0.004662013 -0.036836577
2005-04-29 Returns -0.021739987 -0.082370997 -0.022771382
2005-05-31 Returns -0.005509656 0.049271049 0.023709029
2005-06-30 Returns 0.039008196 -0.004819286 0.06617729
2005-07-29 Returns 0.003183027 -0.049515065 -0.001755927
2005-08-31 Returns 0.179846746 0.180628076 0.081798634
2005-09-30 Returns 0.276558487 0.216340804 0.241277724
2005-10-31 Returns -0.027212564 0.133957923 -0.120171382

 

 

Table B

dependent independent1 independent2
8306 JP Equity 8306 JP Equity 8306 JP Equity

 

 

 

 

 

All input is truly appreciated.  I hope I have explained the scenario clearly. If I can add any additional information or clarify any of the above, please do not hesitate to ask me.

 

Thank You,

John

PGStats
Opal | Level 21

This is actually simpler:

 

data A;
input Date :yymmdd10. _NAME_ $ JP_Equity HK_Equity SA_Equity IE_Equity 
    GB_Equity HU_Equity;      
datalines;
2005-02-28 Returns -0.023822981 0.014117882 0.009082715 -0.023822981 0.014117882 0.009082715
2005-03-31 Returns -0.025479085 0.004662013 -0.036836577 -0.023822981 0.014117882 0.009082715
2005-04-29 Returns -0.021739987 -0.082370997 -0.022771382 -0.023822981 0.014117882 0.009082715
2005-05-31 Returns -0.005509656 0.049271049 0.023709029 -0.023822981 0.014117882 0.009082715
2005-06-30 Returns 0.039008196 -0.004819286 0.06617729 -0.023822981 0.014117882 0.009082715
2005-07-29 Returns 0.003183027 -0.049515065 -0.001755927 -0.023822981 0.014117882 0.009082715
2005-08-31 Returns 0.179846746 0.180628076 0.081798634 -0.023822981 0.014117882 0.009082715
2005-09-30 Returns 0.276558487 0.216340804 0.241277724 -0.023822981 0.014117882 0.009082715
2005-10-31 Returns -0.027212564 0.133957923 -0.120171382 -0.023822981 0.014117882 0.009082715
;
 
data B;
input (Dependent Independent1 Independent2) (:$32.);
datalines;
JP_Equity  HK_Equity SA_Equity
IE_Equity  GB_Equity HU_Equity
HK_Equity  SA_Equity IE_Equity
;

filename regMods temp;

data _null_;
set b;
line = catx(" ", cats("Mod_",_n_,": model"), Dependent, "=", Independent1, independent2, ";");
file regMods;
put line;
run;


proc reg data=A plots=none;
%include regMods;
ods output parameterestimates=pe;
run;
quit; proc print data=pe; run;
PG
MACRO_LOOP
Obsidian | Level 7

Thank You PG,

 

This is great detail.

 

Had spend quite a bit of time trying to figure this out. The explanation shown is clear and very helpful. 

 

Thank you so much for your support!

 

MACRO_LOOP
Obsidian | Level 7

Hi PG,

 

I have one follow up question on the code that you might be able to help me with

 

I have hit a small problem with the line statement (original statement provided shown below). Thus far, I have made some modifications to this line to ensure that it works for Equity tickers which are separated by spaces (modified statement shown in green below). The changes I've made work well there are 2 independent variables but I run into issues when the second independent variable is missing (See example table below). 

 

The error message will show as Cannot find variable " ".

 

I am trying to find a way to include quotation marks and "'n" when the second independent variable (independent2) is present but not include them when the second independent variable is missing (as shown in the example table below). Try as I might, I can't seem to find a solution. I have seen suggestions that CATX can be used to trim the variable name but I can't seem to get it working.

 

Is anyone aware of a solution or function that could be used to resolve this issue (i.e. Only include quotation marks when a variable is present and do not include quotation marks when the variable is missing)?

 

 

filename regMods temp;

data _null_;
set b;
line = catx(" ", cats("Mod_",_n_,": model"), Dependent, "=", Independent1, independent2, ";");

***Modified Statement*** line = catx(" ", cats("Mod_",_n_,": model"),cats("'",Dependent,"'n"),"=",cats("'",Independent1,"'n"),cats("'",Independent2,"'n"), "/stb;");
file regMods; put line;
run;

 

 

dependentindependent1independent2
AUST_5YC9085Y Index 
BELG_5YC9005Y Index 
DBR_5YC9105Y Index 
DENK_5YC2675Y Index 
FINL_5YC9195Y Index 
FRTR_5YC9275Y Index 
IRELND_5YC9185Y Index 
ITALY_5YC9055Y Index 
JAPAN_5YC1055Y Index 
NETHRS_5YC9205Y Index 
POLAND_5YC1195Y Index 
SLOVAK_5YC4865Y Index 
SPAIN_5YC9025Y Index 
SWED_5YC2595Y Index 
UKIN_5YC1105Y Index 
USGB_5YC0825Y Index 

 

The code that you provided has been very helpful and all advice is truly appreciated

 

Thank You

 

M

PGStats
Opal | Level 21

SAS offers plenty of tools to handle such simple tasks - the missing() and quote() functions, for example:

 


filename regMods temp;

data _null_;
set b;
if not missing(independent1) then independent1 = cats(quote(trim(independent1)),"n");
if not missing(independent2) then independent2 = cats(quote(trim(independent2)),"n");
line = catx(" ", cats("Mod_",_n_,": model"), Dependent, "=", independent1, independent2, ";");
file regMods;
put line;
run;
PG
MACRO_LOOP
Obsidian | Level 7

Wow! This is really helpful PG. You made that look very easy. 🙂

 

Thank you so much for taking time to provide this. It is truly appreciated.

 

M

MACRO_LOOP
Obsidian | Level 7

Hi PG,

 

The advice that you have provided has been extremely helpful. I have one final ask that you may be able to help me with to complete the project.

 

I am struggling to create the loop that will go take each of the MODEL statements (Created using the code shown in "CODE USED TO CREATE TABLE 2" below) and bring them in as the MODEL statement in the PROC REG one at a time.

 

 

I need a loop that will go through each of the line statements (found in Table 2) and include them as the MODEL statement in the PROC Reg. Thus far, I have created the two tables required for the loop (both shown below). The Proc Reg procedure is also shown below.

 

Table 1: This table contains the raw returns data on which the regressions will be performed.

Table 2: This contains the MODEL statements that will be read into the PROC REG procedure (snippet of table shown below - the number of observations in the table will change over time). The final model statement is contained in the "line" column. 

 

 

Table 1 

Date USGB_5Y WBC AU Equity WFC US Equity
2005-01-07
2005-01-14 0 -0.023406186 -0.024093628
2005-01-21 0 -0.000529353 -0.009436375
2005-01-28 0 0.006302242 0.006631324
2005-02-04 0 0.002091952 0.007572053
2005-02-11 -0.02740243 0.038412188 -0.011213838
2005-02-18 0 -0.033723848 -0.008159233
2005-02-25 0 -0.014661668 0.001503634
2005-03-04 0 0.028596229 0.00996686
2005-03-11 0.027813171 -0.0186229 -0.011637704
2005-03-18 0 0.017596769 -0.007048192
2005-03-25 0 -0.030741188 -0.016812827
2005-04-01 0 0.006332156 0.017654507
2005-04-08 0 0.013056771 0.002185058
2005-04-15 0 -0.022564768 -0.007245798
2005-04-22 -0.014439093 -0.007809887
2005-04-29 0.046300697 0.021415539

 

***CODE USED TO CREATE TABLE 2****

data _null_;
set MODEL_LIST;
if not missing(dependent) then dependent = cats(quote(trim(dependent)),"n");
if not missing(independent1) then independent1 = cats(quote(trim(independent1)),"n");
if not missing(independent2) then independent2 = cats(quote(trim(independent2)),"n");
line = catx(" ", cats("Mod_",_n_,": model"), Dependent, "=", independent1, independent2, "/stb;");
file regMods;
put line;
run;

 

 

Table 2

line
Mod_1: model "AUST_5Y"n = "MSDUAT Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_2: model "BELG_5Y"n = "MSDUBE Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_3: model "DBR_5Y"n = "MSDUGR Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_4: model "DENK_5Y"n = "MSDUDE Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_5: model "FINL_5Y"n = "MSDUFI Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_6: model "FRTR_5Y"n = "MSDUFR Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_7: model "IRELND_5Y"n = "MSDUIE Index"n "ITRXEBE CBIN Curncy"n /stb;
Mod_8: model "ITALY_5Y"n = "MSDUIT Index"n "ITRXEBE CBIN Curncy"n /stb;

 

 

/* PROC REG STATEMENT USED TO PERFORM REGRESSIONS*/

ODS Graphics on;
ods select all;
proc reg data=EOW_DATA;

%include regMods;
ods output parameterestimates=pe;
ods output FitStatistics=fs;
print;
run;quit;

 

PGStats
Opal | Level 21

You've got everything that you need, except the statement

 

filename regMods temp;

 

before the data _null_ step.

 

The data _null_ step creates the required model statements in a text file called regMods (not a table). These are then added to the proc reg step with the %include statement. Datasets pe and fs will be created by the reg procedure, containing the regression analyses results.

PG
MACRO_LOOP
Obsidian | Level 7

Hi PG,

 

Thank you for explaining this. It is very helpful.

 

The issue I face is that there is a lot of missing values in the returns table that I am using to perform the regressions on.

 

This leads to issues with the regressions as  I can see from researching online that "If any variable needed for any regression is missing, the observation is excluded from all estimates" (Full link below). The result is that there are many observation dates being excluded from the regressions.

 

https://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_reg_sect026...

 

 

Am I right in thinking that the code that I have is reading all of the Model statements from the file called regMods and feeding them into the PROC REG step (with the %include statement) together?

 

My aim is to find a solution that will feed each of the model statements (~70 in total) into the PROC REG procedure one at time, rather than as a group. This should overcome the issue that I have with the missing data.

 

Do you think it is possible to create a loop that will achieve this?

 

Thank you for taking time to read this.

 

M

PGStats
Opal | Level 21

OK, Here is another way to do multiple regressions with the full set of available observations for each model, using call execute:

 

data class;
set sashelp.class;
/* Set some weights as missing */
if mod(age,2) then call missing(weight);
run;

data model_list;
length dep ind1 ind2 $32;
input dep ind1 ind2;
datalines;
height age .
height age weight
;

%macro reg(i, dep, ind);
proc reg data=class plots=none;
ods output parameterestimates=pe_&i.;
ods output FitStatistics=fs_&i.;
mod&i.: model &dep. = &ind.;
run;
%mend reg;

data _null_;
length line $132;
set model_list;
if not missing(dep) then dep = cats(quote(trim(dep)),"n");
if not missing(ind1) then ind1 = cats(quote(trim(ind1)),"n");
if not missing(ind2) then ind2 = cats(quote(trim(ind2)),"n");
line = catx(" ", '%reg(', _n_, ",", dep, ",", ind1, ind2, ");");
call execute(line);
run;

data pe;
set pe_: ;
run;

data fs;
set fs_: ;
run;
PG
MACRO_LOOP
Obsidian | Level 7

Many thanks for taking time to look at this PG - It truly is appreciated.

 

I have made some modifications to the code this morning and it is working a treat.

 

Thank You for all of your input on this thread PG. I hope the code you have provided will be of assistance to many others who are attempting similar projects.

 

Best,

 

M

 

 

PGStats
Opal | Level 21

Best of luck to you and a great year 2018!

PG

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 2301 views
  • 10 likes
  • 3 in conversation