coug914
Calcite | Level 5

I am trying to run two stepwise regressions sequentially. The first includes a set of independent variables X1, X2, and X3. Once I know which of those are meaningful, I want to estimate a second regression that includes the meaningful ones plus a number of other independent variables (X4, X5, X6).

For example, the first stage regression is:

proc reg outest=model_coef; model Y=X1 X2 X3 / selection=stepwise; by firm;

For firm A, only X1 is meaningful

For firm B, X1 and X3 are meaningful

Thus, I'd like the second regression to read:

For firm A: proc reg; model Y=X1 X4 X5 X6 / selection=stepwise include=1;

For firm B: proc reg; model Y=X1 X3 X4 X5 X6 / selection=stepwise include=2;

Note that the regression for B has two changes from the regression for A - the list of independent variables changes, and the "include" number changes from 1 to 2.

I can, of course, see which variables enter the model (from outest=model_coef), but I can't seem to figure out how to move from that information to the second stage regression.

Any suggestions? Thanks!


11 REPLIES
stat_sas
Ammonite | Level 13

How do you determine that variables X1 and X3 are meaningful based on the first regression?

coug914
Calcite | Level 5

The default " /selection=stepwise " requires a variable to be statistically significant at the 0.15 level for entry into the model.
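
For reference, the same thresholds can be written out explicitly instead of relying on the defaults. This is only a sketch: the data set name test0 and the BY variable firm_id are taken from the code posted later in this thread, and the data are assumed to be sorted by firm_id.

```sas
/* Sketch: stepwise selection with the default 0.15 thresholds spelled out. */
/* Assumes test0 is sorted by firm_id and contains y, x1, x2 and x3.        */
proc reg data=test0 outest=model_coef;
   by firm_id;
   /* slentry= is the significance level to enter, slstay= the level to stay */
   model y = x1 x2 x3 / selection=stepwise slentry=0.15 slstay=0.15;
run;
quit;
```

Changing either value tightens or loosens the selection; 0.15 for both matches what selection=stepwise uses when the options are omitted.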

stat_sas
Ammonite | Level 13

Right, that is how proc reg decides internally whether to include an independent variable in the model. My question was about X1 and X3, which are being considered for the second-step regression. How will you decide that these variables should go into the second step? That gives a baseline for filtering the variables from the first-step regression, which is your requirement, right?

coug914
Calcite | Level 5

Perhaps I am misunderstanding your question. The results from the first step might look like this:

Obs  _MODEL_  _TYPE_  _DEPVAR_  _RMSE_    Intercept  X1       x2  X3        y   _IN_  _P_  _EDF_  _RSQ_
1    MODEL1   PARMS   y         0.094107  0.010854   0.83273  .   -0.44081  -1  2     3    33     0.47322

In this case, SAS tells me that variables X1 and X3 are included in the first step regression and I would now like to include them in the second step regression. I was thinking I might be able to create macro variables based on that output (a list of the meaningful variables and the number of meaningful variables) and then feed that list back into the second step regression. I can create those variables, but I can't figure out how to feed it back in. I'm wide open to any other ideas.

stat_sas
Ammonite | Level 13

Now this is clearer. The idea is right: create a macro variable that holds X1 and X3, then put it back into the second-step regression, like this:

proc reg; model Y=&vars X4 X5 X6 / selection=stepwise include=2;

coug914
Calcite | Level 5

That's the part I'm having an issue with. For example:

data test1; set test0;

proc reg outest=outtest1; model y=x1 x2 x3 / Selection=Stepwise rsquare;  by firm_id;

run;

data test2; set outtest1;

if X1 ne . then X1_in_model='X1 ';

if X2 ne . then X2_in_model='X2 ';

if X3 ne . then X3_in_model='X3 ';

vars=cat(X1_in_model,X2_in_model,X3_in_model);

numvars=n(X1,X2,X3);

%let macrovar=vars;

%let macronum=numvars;

keep firm_id &macrovar &macronum;

proc print; run;

/*************

the output here looks like this:

Obs    firm_id    vars        numvars

1      70740     X1    X3       2

2      75160     X1             1

3      76695     X1 X2          2

************/

*now how do i get those variables back into the next regression? This does not work (since it sees "vars" as data rather than variable names). Any suggestions? ;

data test4; merge test1 test2; by firm_id;

proc reg outest=outest4; model y=&macrovar X4 X5 X6 / selection=stepwise include=&macronum;

run;

stat_sas
Ammonite | Level 13

Try this way.

data test4; merge test1 test2; by firm_id;

proc reg data=test4 outest=outest4; model y=&macrovar X4 X5 X6 / selection=stepwise include=&macronum;

run;

coug914
Calcite | Level 5

Isn't that the same as I had?

Results in same error:

593  proc reg data=test4 outest=outest4; model y=&macrovar X4 X5 X6 / selection=stepwise include=&macronum;

ERROR: Variable vars in list does not match type prescribed for this list.

NOTE: Line generated by the macro variable "MACRONUM".

1    numvars

     -------

     22

     202

ERROR 22-322: Expecting an integer constant.

ERROR 202-322: The option or parameter is not recognized and will be ignored.

594  run;

NOTE: The previous statement has been deleted.

WARNING: The variable _NAME_ or _TYPE_ exists in a data set that is not TYPE=CORR, COV, SSCP, etc.

WARNING: No variables specified for an SSCP matrix. Execution terminating.

NOTE: PROCEDURE REG used (Total process time):

      real time           0.01 seconds

      cpu time            0.01 seconds

NOTE: The data set WORK.OUTEST4 has 0 observations and 4 variables.

stat_sas
Ammonite | Level 13

The problem is with your macro variables. I've changed the data step to use call symputx in place of the two %let statements. Run this to populate the macro variables, then proceed to proc reg.

data test2; set outtest1;

if X1 ne . then X1_in_model='X1 ';

if X2 ne . then X2_in_model='X2 ';

if X3 ne . then X3_in_model='X3 ';

vars=cat(X1_in_model,X2_in_model,X3_in_model);

numvars=n(X1,X2,X3);

call symputx('macrovar',vars);

call symputx('macronum',numvars);

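keep firm_id vars numvars; /* name the columns directly: &macrovar and &macronum resolve when the step compiles, before call symputx has run */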

proc print; run;
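
One caveat with several firms: call symputx runs once per row, so after the data step macrovar and macronum hold only the values from the last firm in test2. A possible way to get a separate second-stage regression for each firm is to generate the proc reg code row by row with call execute. The following is a minimal sketch, not a tested solution. It assumes test2 has one row per firm with a numeric firm_id plus vars and numvars, that test0 contains y, x1-x6 and firm_id, and that the output names outest4_<firm_id> (hypothetical) are acceptable.

```sas
/* Sketch: build and run one second-stage PROC REG per firm.              */
/* Assumptions: firm_id is numeric; test2 has vars and numvars per firm;  */
/* test0 holds y, x1-x6 and firm_id. Output names are hypothetical.       */
data _null_;
   set test2;
   length include_opt $20 code $400;

   /* add include=n only when the first stage selected at least one variable */
   include_opt = ifc(numvars > 0, cats('include=', numvars), ' ');

   /* the selected variables come first in the MODEL list, so include=n forces them in */
   code = catx(' ',
      'proc reg data=test0 outest=', cats('outest4_', firm_id), ';',
      'where firm_id =', firm_id, ';',
      'model y =', vars, 'x4 x5 x6 / selection=stepwise', include_opt, ';',
      'run; quit;');

   /* the generated step is queued and runs after this data step ends */
   call execute(code);
run;
```

For the firm with vars = "X1 X3" and numvars = 2, the generated model statement reads model y = X1 X3 x4 x5 x6 / selection=stepwise include=2, which is the form asked for in the original question. If firm_id were a character variable, the where clause would need quotes around the value.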

coug914
Calcite | Level 5

Thanks. Very helpful. I've been reading about call symputx, although I'm still not sure I fully understand what's going on.

what's the difference between

%let macrovar=vars;

and

call symputx('macrovar',vars);

Reading about call symputx it says it creates a macro variable called macrovar from vars. How is that different from the %let?

Thanks again for your help on this!

stat_sas
Ammonite | Level 13

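call symputx assigns a value produced while the DATA step executes to a macro variable. %let is a macro statement that is processed before the DATA step runs, so %let macrovar=vars; stores the literal text vars, not the contents of the variable vars. That is why %let is normally used in open code rather than inside a data step or proc.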
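
A small side-by-side sketch may make the difference concrete. The macro variable names from_let and from_symputx are made up for this illustration.

```sas
/* Sketch: %LET stores text; CALL SYMPUTX stores a DATA step value. */
%let from_let = vars;                    /* stores the four letters: vars */

data _null_;
   vars = 'X1 X3';
   call symputx('from_symputx', vars);   /* stores the value: X1 X3 */
run;

/* Reference the new macro variable only after the data step has finished. */
%put from_let=&from_let;
%put from_symputx=&from_symputx;
```

The log shows from_let=vars and from_symputx=X1 X3. This also explains the earlier error: with %let, proc reg received the text vars and numvars instead of a variable list and an integer.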
