Geoff1
Fluorite | Level 6
Rick_SAS, thanks. Is there any way to get R-square, R-square change, F change, etc., as SPSS does, to determine whether, and by how much, the model improves as new variables are entered?
Geoff
Ksharp
Super User
You are looking for a variable selection method.
Check the SELECT= option of the MODEL statement.

Geoff1
Fluorite | Level 6

I don't see a "Select=" option. There is a Selection= option, which specifies the selection method to use (e.g. None, Forward, Backward), but I see no option for a sequential or hierarchical regression that would let me enter the variables in a specific order. Rick_SAS suggested the SEQB option, which produced parameter estimates for each variable as it was entered in the regression, but I'm not convinced that it actually conducted a sequential regression. Changing the order of the variables in the variable list did not change the parameter estimates in the SEQB output, which makes me think PROC REG is selecting the variable order based on R or a t-value and not necessarily using the order I want.

Geoff

Ksharp
Super User

If you want changing the order of the variables in the variable list to change the results, you could try PROC GLM.
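A minimal sketch of what PROC GLM can show here, using the same sashelp.cars variables that appear later in this thread (the choice of variables is only illustrative). The SS1 option requests Type I (sequential) sums of squares, which depend on the order of the terms in the MODEL statement; SS3 requests Type III sums of squares, which do not. The final parameter estimates should be the same in both runs; what differs between the two orderings is the Type I table.

```sas
/* Order 1: cylinders entered first */
proc glm data=sashelp.cars;
  model enginesize = cylinders horsepower invoice length / ss1 ss3;
run;
quit;

/* Order 2: same terms, reversed order. Compare the Type I SS tables. */
proc glm data=sashelp.cars;
  model enginesize = length invoice horsepower cylinders / ss1 ss3;
run;
quit;
```

Each Type I line tests the term given only the terms listed before it, which is the order-dependent part of a sequential analysis.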
Geoff1
Fluorite | Level 6
Hi, initially I was trying to analyze my data using ANCOVA, but the covariate was dependent on the level of the IV (a violation of the ANOVA assumption of independence). To overcome this, sequential regression was suggested to me as a way to understand the influence of the covariate rather than trying to "correct" for it. I want to be sure that using the SEQB option does in fact produce a sequential analysis and not a standard or full regression. I don't see how SEQB is doing this if the parameter estimates are not dependent on the order in which the IVs are entered into the regression equation.
Geoff
Ksharp
Super User
Sorry. Maybe I misunderstood your question.
But from what you described, it looks like you need a mixed model: PROC MIXED.

Geoff1
Fluorite | Level 6
Hi, basically I want to do a sequential regression. I'm just not convinced that any of the PROC REG models or options actually does the regression sequentially, as opposed to entering all the IVs together regardless of how I list them in the MODEL statement. If you think the mixed model would work, I would appreciate hearing additional details on how to do that. I currently use SAS Enterprise Guide, but I can modify and write code.
Thanks
Geoff
Ksharp
Super User
Or you could try 

proc glmselect data=sashelp.cars;
model enginesize= cylinders horsepower invoice length/selection=none;
run;


Geoff1
Fluorite | Level 6

Hi, thanks for the suggestion, but it produced exactly the same parameters as the PROC REG Selection=none model: basically a full regression model, and the order in which the variables were entered into the model didn't change the parameter estimates. Am I wrong to think that if a true sequential regression were being carried out, then the order in which the variables are entered into the model should affect the parameter estimates? If not, can someone provide an example where the parameter estimates change with the order of the variables?

Thanks

Geoff

Geoff1
Fluorite | Level 6
Hi, after consulting with Barbara Tabachnick (co-author of Using Multivariate Statistics), it turns out I was wrong about my expectations of what a sequential regression would look like. The final model will always have the same R2 and the same regression coefficients regardless of the order in which the IVs are entered. In SAS, the easiest way to conduct a sequential regression is to run a series of regressions, with each successive regression adding the IV or IVs of interest. The change in R2 is simply the difference in R2 between the two models, and the F-change is calculated the same way as F except that deltaR2 is used in the first part of the equation instead of R2.
Thanks for all the help.
Geoff
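For reference, the F-change described above can be written out as follows. This is the standard partial F test for nested models as I understand it, not a formula quoted from the thread. The symbols are my own notation: n is the number of observations, k the number of predictors in the larger model, m the number of predictors added at this step, and R² with subscripts "small" and "large" the R-square values of the two nested models.

```latex
\Delta R^2 = R^2_{\text{large}} - R^2_{\text{small}}

F_{\text{change}} = \frac{\Delta R^2 / m}{\left(1 - R^2_{\text{large}}\right) / (n - k - 1)},
\qquad \text{df} = (m,\; n - k - 1)

% For comparison, the overall F of the larger model:
F = \frac{R^2_{\text{large}} / k}{\left(1 - R^2_{\text{large}}\right) / (n - k - 1)},
\qquad \text{df} = (k,\; n - k - 1)
```

Note that the numerator degrees of freedom for the change test are m (the number of added predictors), not k, and that both models need to be fit to the same observations for the difference in R2 to be meaningful.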
psyscience
Calcite | Level 5

Hi Geoff,

 

Do you happen to have syntax for the solution you found here in your last reply? I am also looking to do this type of model (except in proc surveylogistic) so I am taking your suggestion of running separate regressions, but I am struggling to obtain the R2 change, F, and p values for each separate part. As you know SPSS gives a p value for the change in R2 when you add your new variable(s), so this is what I am hoping to get. If you are able to help in any way please let me know, thank you!

mkeintz
PROC Star

@Geoff1    I don't know how you have chosen to code the sequential regressions, but here is a sample using sashelp.cars.  For nominal vars like TYPE and ORIGIN it makes groups of dummy vars.  Dummies for a nominal var are added as a group in the regression model (minus a dummy for the reference value).  

 

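/* Build 0/1 dummy variables for the nominal variables ORIGIN and TYPE */
data have;
  set sashelp.cars;
  orig_asia=(origin='Asia');
  orig_europe=(origin='Europe');
  orig_usa=(origin='USA');

  type_hybrid=(type='Hybrid');
  type_suv=(type='SUV');
  type_sedan=(type='Sedan');
  type_sports=(type='Sports');
  type_truck=(type='Truck');
  type_wagon=(type='Wagon');
run;

/* Drop the reference-level dummies (USA, Sedan) so that the prefix lists */
/* orig_: and type_: pick up only the remaining dummies.                  */
/* RSQUARE adds _IN_, _P_, _EDF_ and _RSQ_ to the OUTEST= data set.       */
proc reg data=have (drop=orig_usa type_sedan) 
       plots=none noprint outest=rsq_data rsquare;

  /* List every variable used by any of the models below */
  var mpg_highway weight horsepower cylinders orig_: type_: ;

  /* Each MODEL/RUN pair adds one block of variables and writes one row to rsq_data */
  model mpg_highway=weight;
  run ;
  model mpg_highway=weight orig_: ;
  run ;
  model mpg_highway=weight orig_: horsepower ;
  run ;
  model mpg_highway=weight orig_: horsepower type_: ;
  run ;
quit;

/* Change in R-square and in number of regressors from one model to the next */
data rsq_extended;
  set rsq_data;
  delta_rsq=dif(_rsq_);
  delta_indvars=dif(_in_);
run;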
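One possible way to turn those R-square differences into an F-change and p-value, offered as a sketch rather than a tested solution. It assumes rsq_data was produced by the PROC REG step above with the RSQUARE option, so that _RSQ_, _IN_ (number of regressors) and _EDF_ (error degrees of freedom) are present, and that the models are nested and fit to the same observations. The data set name rsq_change and the variables f_change and p_change are my own illustrative names.

```sas
data rsq_change;
  set rsq_data;
  /* DIF() must be called on every row, so compute it outside the IF block */
  delta_rsq     = dif(_rsq_);   /* R-square change from the previous model */
  delta_indvars = dif(_in_);    /* number of regressors added at this step */
  if _n_ > 1 and delta_indvars > 0 then do;
    /* Partial F: (delta R2 / added df) / ((1 - R2 of larger model) / error df) */
    f_change = (delta_rsq / delta_indvars) / ((1 - _rsq_) / _edf_);
    /* Upper-tail probability of the F distribution */
    p_change = 1 - probf(f_change, delta_indvars, _edf_);
  end;
run;

proc print data=rsq_change noobs;
  var _in_ _edf_ _rsq_ delta_rsq delta_indvars f_change p_change;
run;
```

The first row has no previous model to compare against, so its change statistics are left missing.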
lizzdream12
Calcite | Level 5

@mkeintz Thanks for the code. I am also working on an analysis using sequential regression, and I am trying to obtain the significance of the change in R-square and of the ANOVA tests between models. Is there specific code that you use to obtain the significance of the sequential models?
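One option that avoids hand calculation is the TEST statement in PROC REG, sketched here under the assumption that the data set have from the earlier post exists. At each step, testing that the coefficients of the newly added variables are all zero gives the partial F test for that block, which should correspond to the F-change and its p-value for that step. The labels step2 and step3 are arbitrary names chosen for this example.

```sas
proc reg data=have plots=none;
  /* List all variables used by any model so later MODEL statements can use them */
  var mpg_highway weight horsepower orig_asia orig_europe;

  /* Step 2: add the origin dummies to the weight-only model */
  model mpg_highway = weight orig_asia orig_europe;
  step2: test orig_asia = 0, orig_europe = 0;
  run;

  /* Step 3: add horsepower */
  model mpg_highway = weight orig_asia orig_europe horsepower;
  step3: test horsepower = 0;
  run;
quit;
```

Each TEST statement applies to the MODEL statement immediately before it, so the block being tested is always the one entered last.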
