SAS: Proc TransReg

Posted 04-10-2018 09:01 AM
(1712 views)

I am working with a dataset that contains 24 dependent variables and 27 independent variables. The constraint is that I am to try each dependent variable with 4 independent variables at a time. I've been trying to figure out how to do this iteratively with Proc Transreg, but I'm not having much luck so far.

The dataset looks like this:

```
data have;
input Year $ Ind1-Ind24 Dep1-Dep27;
datalines;
Year Ind1 ind2 ind3 ind4 ... dep1 dep2 dep3 ...
2014Q1 105 210 305 405 ... 10 20 40 30
2014Q2 10 20 30 40 ... 40 5 2 1
2014Q3 15 210 305 405 ... 10 20 40 30
2014Q4 10 20 05 405 ... 10 20 40 30
2015Q1 105 10 35 45 ... 30 20 40 50
2015Q2 105 21 3 405 ... 10 20 40 30
2015Q3 15 21 5 40 ... 50 20 70 30
2015Q4 5 210 35 5 ... 10 90 40 30
;
run;
```

the regressions I'm looking to trial would look like this:

```
Ind1 = Dep1 + Dep2 + Dep3 + Dep4
Ind1 = Dep1 + Dep3 + Dep4 + Dep5
...
Ind1 = Dep23 + Dep24 + Dep25 + Dep26
...
Ind5 = Dep4 + Dep5 + Dep10 + Dep17
...
```

essentially all the permutations possible

TIA.

1 ACCEPTED SOLUTION

I've managed to create a solution using the following macro:

```
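%macro Regression;
  /* Loop over each variable named in the space-separated list &Var2 */
  %let index = 1;
  %do %until (%scan(&Var2, &index, %str( )) = );
    %let Ind = %scan(&Var2, &index, %str( ));
    /* Stepwise selection from the candidate variables listed in &var */
    proc reg data = quarterly;
      model &Ind = &var / selection = stepwise;
    run;
    quit;
    %let index = %eval(&index + 1);
  %end;
%mend Regression;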

where Var1 and Var2 are the lists of 24 and 27 variables respectively

10 REPLIES 10

Can you show the transreg code for ONE of your models that works as desired?

There are likely several approaches but I think it would help to show a single case before attempting to generate 24* (27 choose 4) cases (looks like 421,200 model runs).

Running that many models is likely to take a moderate amount of time, and it requires some decisions about which output to keep and how to identify it.

How do you want to keep the output? Which specific output to keep for each model? Show this in your transreg example.
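For reference, the run count quoted above can be checked by hand, taking 24 response variables and every 4-variable subset of 27 candidate predictors:

$$
24 \times \binom{27}{4} = 24 \times \frac{27 \cdot 26 \cdot 25 \cdot 24}{4 \cdot 3 \cdot 2 \cdot 1} = 24 \times 17{,}550 = 421{,}200
$$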

Please explain how that code is getting the groups of 4 independent variables from your original post:

the regressions I'm looking to trial would look like this:

```
Ind1 = Dep1 + Dep2 + Dep3 + Dep4
Ind1 = Dep1 + Dep3 + Dep4 + Dep5
...
Ind1 = Dep23 + Dep24 + Dep25 + Dep26
...
Ind5 = Dep4 + Dep5 + Dep10 + Dep17
...
```

essentially all the permutations possible

Your posted code uses macro variables &var2 and &var, no &var1

If you are using Proc Reg you can actually have different MODEL statements though it is a good idea to include a Label for the model to key the output to the correct model.
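As a rough illustration of the labeled-MODEL idea, a sketch might look like the following. It assumes the dataset is called `quarterly` (as in the posted macro) and uses the variable names from the original question. The labels `m_1_2_3_4` and `m_1_3_4_5` are made up for this example.

```
proc reg data=quarterly;
  /* The text before each colon is a label that tags that model's output */
  m_1_2_3_4: model Ind1 = Dep1 Dep2 Dep3 Dep4;
  m_1_3_4_5: model Ind1 = Dep1 Dep3 Dep4 Dep5;
run;
quit;
```

Encoding the predictor numbers in the label is only one possible convention; any valid label that lets you trace output back to its model would do.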

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

This seems like it would be terribly time consuming to run as it will have to compute 24 * comb(27,4) = 421,200 regressions, and furthermore it seems like it is a very poor idea to begin with. And even if you wrote the code (a rather time consuming task itself) and completed this series of regressions, what do you do with the results?

Better would be some method that evaluates models using all 27 independent variables and their predictive ability on the 24 dependent variables, and is designed to handle any possible collinearity between the independent variables (which ordinary least squares regression does not do), and is designed to handle any collinearity between the dependent variables (which ordinary least squares regression does not do). What method is that? Drumroll please! That method is Partial Least Squares regression, or PROC PLS in SAS. In the ideal case, you fit ONE model (that's right, one) and then interpret and use the results. Even if you have to iterate and remove outliers or remove variables and run the model again, I'm sure the number of regressions will be less than 421,200, in fact I would be willing to guess fewer than 10 iterations ( << 421,200) would get you to the final result. In addition, PLS will find 5 or 6 or 7 variable models that predict better (if they exist) than any of your 4 variable models. Seems like a no-brainer to me.
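A minimal sketch of what a single PLS fit could look like, assuming the dataset is named `quarterly` and that, following the naming in the original question, `Ind1-Ind24` are the responses and `Dep1-Dep27` are the candidate predictors. The `cv=one` option (leave-one-out cross validation for choosing the number of factors) is a choice made for this example, not something taken from the thread.

```
proc pls data=quarterly method=pls cv=one;
  /* All responses on the left, all candidate predictors on the right */
  model Ind1-Ind24 = Dep1-Dep27;
run;
```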

--

Paige Miller


Explanation for the code above:

Var is a list of all the independent variables

Var2 is a list of all the dependent variables

For each dependent variable, the macro runs a stepwise regression. It first selects the independent variable with the most explanatory power. Then, if the next most explanatory independent variable is statistically significant by an F-test, it adds that variable and continues; otherwise it stops.

This means I'm not fitting all 27 independent variables; the procedure selects the best ones, and I'm looking to add a cap of at most 4 independent variables.
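One possible way to limit model size is sketched below. Note that it swaps the stepwise method for PROC REG's all-subsets `selection=rsquare` method, where `start=` and `stop=` bound the number of predictors and `best=` limits how many models are reported per size. The dataset name `quarterly` and the variable names are taken from earlier in the thread; the value `best=3` is an arbitrary choice for illustration.

```
proc reg data=quarterly;
  /* All-subsets search limited to models with 1 to 4 predictors;
     best=3 reports up to three highest R-square models per size */
  model Ind1 = Dep1-Dep27 / selection=rsquare start=1 stop=4 best=3;
run;
quit;
```

Setting `start=4 stop=4` instead would restrict the report to four-predictor models only.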

As I said, I would not advise this. I think better solutions exist.

--

Paige Miller


The code runs in <5 seconds

The resulting models, each with three variables, are then discussed to decide which to carry forward for analysis, based on R^2 and on whether the variables make economic sense (human input).

only 27 regressions are run in my code so far, each with 3 independent variables

but I will try out proc pls


We have put a cap of 3, possibly 4, on the number of independent variables.
