changing the list of independent variables in step...

07-04-2014 11:13 AM

I am trying to run two stepwise regressions sequentially. The first includes a set of independent variables X1, X2, and X3. Once I know which of those are meaningful, I want to estimate a second regression that includes the meaningful ones plus a number of other independent variables (X4, X5, X6).

For example, the first stage regression is:

proc reg outest=model_coef; model Y=X1 X2 X3 / selection=stepwise; by firm;

For firm A, only X1 is meaningful

For firm B, X1 and X3 are meaningful

Thus, I'd like the second regression to read:

For firm A: proc reg; model Y=X1 X4 X5 X6 / selection=stepwise include=1;

For firm B: proc reg; model Y=X1 X3 X4 X5 X6 / selection=stepwise include=2;

Note that the regression for B has two changes from the regression for A - the list of independent variables changes, and the "include" number changes from 1 to 2.

I can, of course, see which variables enter the model (from outest=model_coef), but I can't seem to figure out how to move from that information to the second stage regression.

Any suggestions? Thanks!

All Replies

Posted in reply to coug914

07-04-2014 11:32 AM

How do you determine that variables X1 and X3 are meaningful based on the first regression?

Posted in reply to stat_sas

07-04-2014 11:38 AM

With " / selection=stepwise ", the default requires a variable to be statistically significant at the 0.15 level for entry into the model.
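
For reference, the entry threshold can be written out explicitly. The sketch below assumes a hypothetical input data set named mydata, sorted by firm; slentry= and slstay= are the PROC REG MODEL options for the entry and stay significance levels, and 0.15 is the stepwise default for both.

```sas
/* Sketch only: mydata is a hypothetical data set name, sorted by firm. */
proc reg data=mydata outest=model_coef;
   by firm;
   /* slentry= : significance level required to enter the model       */
   /* slstay=  : significance level required to remain in the model   */
   model Y = X1 X2 X3 / selection=stepwise slentry=0.15 slstay=0.15;
run;
quit;
```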

Posted in reply to coug914

07-04-2014 11:51 AM

Right, this is the internal processing proc reg uses to include an independent variable in the model. My question was about X1 and X3, which are being considered for inclusion in the second-step regression. How will you decide that these variables should go into the second-step regression? That will give a baseline for filtering these variables from the first-step regression, which is your requirement, right?

Posted in reply to stat_sas

07-04-2014 12:00 PM

Perhaps I am misunderstanding your question. The results from the first step might look like this:

Obs  _MODEL_  _TYPE_  _DEPVAR_  _RMSE_    Intercept  X1       x2  X3        y   _IN_  _P_  _EDF_  _RSQ_

1    MODEL1   PARMS   y         0.094107  0.010854   0.83273  .   -0.44081  -1  2     3    33     0.47322

In this case, SAS tells me that variables X1 and X3 are included in the first step regression and I would now like to include them in the second step regression. I was thinking I might be able to create macro variables based on that output (a list of the meaningful variables and the number of meaningful variables) and then feed that list back into the second step regression. I can create those variables, but I can't figure out how to feed it back in. I'm wide open to any other ideas.

Posted in reply to coug914

07-04-2014 12:42 PM

Now this is clearer. The idea is right: just create a macro variable that holds the variables X1 and X3 and put them back into the second-step regression, like

proc reg; model Y=&vars X4 X5 X6 / selection=stepwise include=2;

Posted in reply to stat_sas

07-04-2014 01:53 PM

That's the part I'm having an issue with. For example:

data test1; set test0;

proc reg outest=outtest1; model y=x1 x2 x3 / Selection=Stepwise rsquare; by firm_id;

run;

data test2; set outtest1;

if X1 ne . then X1_in_model='X1 ';

if X2 ne . then X2_in_model='X2 ';

if X3 ne . then X3_in_model='X3 ';

vars=cat(X1_in_model,X2_in_model,X3_in_model);

numvars=n(X1,X2,X3);

%let macrovar=vars;

%let macronum=numvars;

keep firm_id &macrovar &macronum;

proc print; run;

/*************

the output here looks like this:

Obs firm_id vars numvars

1 70740 X1 X3 2

2 75160 X1 1

3 76695 X1 X2 2

************/

*now how do i get those variables back into the next regression? This does not work (since it sees "vars" as data rather than variable names). Any suggestions? ;

data test4; merge test1 test2; by firm_id;

proc reg outest=outest4; model y=&macrovar X4 X5 X6 / selection=stepwise include=&macronum;

run;

Posted in reply to coug914

07-04-2014 02:39 PM

Try this way.

data test4; merge test1 test2; by firm_id;

proc reg data=test4 outest=outest4; model y=&macrovar X4 X5 X6 / selection=stepwise include=&macronum;

run;

Posted in reply to stat_sas

07-04-2014 02:42 PM

Isn't that the same as I had?

Results in same error:

593 proc reg data=test4 outest=outest4; model y=&macrovar X4 X5 X6 / selection=stepwise include=&macronum;

ERROR: Variable vars in list does not match type prescribed for this list.

NOTE: Line generated by the macro variable "MACRONUM".

1 numvars

-------

22

202

ERROR 22-322: Expecting an integer constant.

ERROR 202-322: The option or parameter is not recognized and will be ignored.

594 run;

NOTE: The previous statement has been deleted.

WARNING: The variable _NAME_ or _TYPE_ exists in a data set that is not TYPE=CORR, COV, SSCP, etc.

WARNING: No variables specified for an SSCP matrix. Execution terminating.

NOTE: PROCEDURE REG used (Total process time):

real time 0.01 seconds

cpu time 0.01 seconds

NOTE: The data set WORK.OUTEST4 has 0 observations and 4 variables.

Solution

Posted in reply to coug914

07-04-2014 03:15 PM

The problem is with your macro variables. I've made some changes in the data step (highlighted); run it to populate your macro variables, then proceed to proc reg.

data test2; set outtest1;

if X1 ne . then X1_in_model='X1 ';

if X2 ne . then X2_in_model='X2 ';

if X3 ne . then X3_in_model='X3 ';

vars=cat(X1_in_model,X2_in_model,X3_in_model);

numvars=n(X1,X2,X3);

**call symputx('macrovar',vars);**

**call symputx('macronum',numvars);**

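keep firm_id vars numvars; /* was &macrovar &macronum; those references are resolved when the step is compiled, before call symputx runs */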

proc print; run;
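
One caveat: call symputx runs once per observation, so after this step the macro variables hold only the values from the last firm in outtest1. If each firm needs its own variable list and its own include= value, one possible approach is to let a data step generate a separate proc reg per firm with call execute. The sketch below is not part of the original answer. It assumes test2 has one row per firm with a numeric, positive-integer firm_id plus vars and numvars, and that test0 is sorted by firm_id and contains y and X1-X6. The output names stage2_<firm_id> and all_stage2 are hypothetical.

```sas
/* Sketch: build and queue one second-stage PROC REG per firm.        */
/* Assumes firm_id is a numeric, positive integer, so it can be used  */
/* in a data set name and in an unquoted WHERE comparison.            */
data _null_;
   set test2;
   length code $400;
   code = catx(' ',
      'proc reg data=test0 noprint outest=stage2_'
         || strip(put(firm_id, best12.)) || ';',
      /* WHERE limits the run to this firm; BY keeps firm_id in OUTEST= */
      'where firm_id =', put(firm_id, best12.), ';',
      'by firm_id;',
      /* First-stage survivors go first so that INCLUDE= forces them in */
      'model y =', vars, 'X4 X5 X6 / selection=stepwise include=',
         put(numvars, best12.), ';',
      'run; quit;');
   /* The generated statements run after this DATA step finishes */
   call execute(code);
run;

/* Stack the per-firm estimates; stage2_: matches every data set whose */
/* name starts with stage2_                                            */
data all_stage2;
   set stage2_:;
run;
```

Because the model statement is built from each row of test2, both the variable list and the include= count change from firm to firm, which is what the original question asks for.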

Posted in reply to stat_sas

07-04-2014 03:35 PM

Thanks. Very helpful. I've been reading about call symputx, although I'm still not sure I fully understand what's going on.

What's the difference between

%let macrovar=vars;

and

**call symputx('macrovar',vars);**

Reading about call symputx it says it creates a macro variable called macrovar from vars. How is that different from the %let?

Thanks again for your help on this!

Posted in reply to coug914

07-04-2014 03:52 PM

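call symputx assigns a value produced in a DATA step to a macro variable while the step is executing. %LET is meant for open code, not inside a data step or proc: if you place it inside a step, the macro processor handles it before the step runs, so %let macrovar=vars; stores the literal text vars rather than the value of the variable vars.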
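
The difference can be seen in the log with a small experiment. This is a minimal sketch using a hypothetical one-row data set named demo; the macro variable names match the ones used in the thread.

```sas
/* Hypothetical one-row data set, for illustration only */
data demo;
   vars = 'X1 X3';
   numvars = 2;
run;

/* %LET is handled by the macro processor, which knows nothing about  */
/* data set variables: it stores the four characters "vars".          */
%let macrovar = vars;
%put NOTE: after LET, macrovar=&macrovar;

/* CALL SYMPUTX executes inside the DATA step, so it copies the       */
/* current values of VARS and NUMVARS into the macro variables.       */
data _null_;
   set demo;
   call symputx('macrovar', vars);
   call symputx('macronum', numvars);
run;

%put NOTE: after SYMPUTX, macrovar=&macrovar macronum=&macronum;
```

The first %put should show macrovar resolving to the text vars, and the second should show it resolving to X1 X3, with macronum resolving to 2. That is why model y=&macrovar ... failed with the %let version: proc reg received the name of the character variable vars rather than a list of regressors.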