BMI_kid
Fluorite | Level 6

proc glmselect data=imputed PLOTS=ALL;
*class NoEvalBus NoEvalComp;
model Responce=&cluster
/ selection=stepwise(select=sl) hierarchy=single
stats=all showpvalues orderselect stb;
score out=predStepwise predicted residual;
run;

 

If I run this code, how do I interpret the output? I am interested in the equation for the model, and I wonder whether it fits the same model as the default model fitted by PROC GENMOD, since they both fit "Generalised linear models."

1 ACCEPTED SOLUTION

StatDave
SAS Super FREQ

GLMSELECT fits the "general linear model," which assumes a normally distributed response and models the response mean directly. GENMOD fits the "generalized linear model," which allows the response distribution to come from a family of distributions (normal, binomial, Poisson, and others) and models a function of the response mean (the "link" function). If you don't use model selection in GLMSELECT, it fits the same model as GENMOD with the normal response distribution and the identity link function, although the estimation methods differ: GLMSELECT uses least squares, whereas GENMOD uses maximum likelihood.
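As a rough illustration of that equivalence, the sketch below fits one fixed model both ways. The Sashelp.cars data set and the two predictors are assumptions chosen only so that the code runs anywhere; they are not the original poster's data. The coefficient estimates in the two "Parameter Estimates" tables should agree, while standard errors may differ slightly because the two procedures estimate the error variance differently.

```sas
/* Assumption: Sashelp.cars with two predictors, for illustration only */

/* SELECTION=NONE: GLMSELECT fits every listed effect by ordinary least squares */
proc glmselect data=sashelp.cars;
   model MPG_City = Weight Horsepower / selection=none;
run;

/* Same linear predictor, fit by maximum likelihood.                  */
/* DIST=NORMAL and LINK=IDENTITY are the GENMOD defaults; they are    */
/* written out here to make the correspondence explicit.              */
proc genmod data=sashelp.cars;
   model MPG_City = Weight Horsepower / dist=normal link=identity;
run;
```

Once stepwise selection is turned on in GLMSELECT, the set of effects in the final model can change, so the comparison only holds if GENMOD is given the same selected effects.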

 

The estimated parameters (some call them "weights") of the model are in the "Parameter Estimates" table from GLMSELECT. The model is then mean(y) = b0 + b1*x1 + ..., similar to the quoted text you show.


6 REPLIES
PaigeMiller
Diamond | Level 26

I wonder if it fits the same model as the default model fitted by proc genmod since they both fit "Generalised linear models."

 

I think it would be very unlikely (but not impossible) that the stepwise model created by GLMSELECT would match the (non-stepwise) model created by GENMOD. Nevertheless, you can look at the model coefficient estimates yourself and see if it is the same model.

 

If I run this code, how do I interpret the output?

 

You would probably want to take a look at the examples for PROC GLMSELECT in the SAS documentation to see how to interpret the various outputs. If you have a specific question about the output, show us a screen capture of the output and ask your specific question.

--
Paige Miller
BMI_kid
Fluorite | Level 6

I am looking for the equation of the model, for instance:

 

[Attached image: BMI_kid_0-1698236813975.png]

 

I want to be able to explain the parameter estimates/coefficients and maybe give weights to the effect of the predictors.

PaigeMiller
Diamond | Level 26

@BMI_kid wrote:

I am looking for the equation of the model, for instance:

 

[Attached image: BMI_kid_0-1698236813975.png]

 

I want to be able to explain the parameter estimates/coefficients and maybe give weights to the effect of the predictors.


Please be specific. These beta values are slopes. What else about the parameter estimates do you need to explain? What do you mean by "give weights" to the effect of the predictors?

--
Paige Miller
sbxkoenk
SAS Super FREQ

Hello,

 

To be clear:
You don't need that explicit equation

  • to understand your model (full interpretation)
  • nor to score (in the future) new data with your model.

With respect to the former point:

I'm so used to it by now ... the way SAS presents parameter estimates.

I see the equation (or equations) in front of me like this ... in my head.
It is a matter of getting used to it.

 

With respect to the latter point:

Some people ask this question (your question) so that, with the fully written-out equation, they can build a scoring app in Excel. I (and I am not alone) would strongly advise against that.

>> Just do the scoring in SAS. There are plenty of ways to do this, and you can be sure the results will be entirely correct.
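One way to score in SAS, sketched here under assumptions: the SCORE statement in PROC GLMSELECT accepts a DATA= option, so new observations can be scored in the same step that selects the model. The `newcars` and `scoredNew` data set names are hypothetical, and Sashelp.cars stands in for the real training data.

```sas
/* Hypothetical new data: a few rows standing in for future observations */
data newcars;
   set sashelp.cars(obs=10 keep=Make Model EngineSize Horsepower Weight Wheelbase Length);
run;

/* Select the model on the training data and score the new data in one step */
proc glmselect data=sashelp.cars plots=none;
   model MPG_City = EngineSize Horsepower Weight Wheelbase Length
      / selection=stepwise(select=sl) hierarchy=single;
   score data=newcars out=scoredNew predicted;
run;

/* By default the predicted values are named p_<response>, here p_MPG_City */
proc print data=scoredNew;
run;
```

No equation is typed by hand here, so there is no risk of transcription or rounding errors in the coefficients.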

 

BR, Koen

Rick_SAS
SAS Super FREQ

It's always best to use example data that everyone can run. Let's use the numerical variables in the Sashelp.cars data set. For the response variable, I will choose MPG_City. For the independent variables, I will choose five numerical variables:

 

%let cluster = EngineSize Horsepower Weight Wheelbase Length;

proc glmselect data=sashelp.cars plots=none;
   model MPG_City =&cluster
      / selection=stepwise(select=sl) hierarchy=single
      stats=all showpvalues orderselect stb;
   score out=predStepwise predicted residual;
   ODS SELECT  SelectedEffects ParameterEstimates;
run;

[Attached image: Rick_SAS_0-1698324190577.png]

For this example, two of the five candidate effects were selected for the final model, along with the Intercept term. The OLS model that best predicts the mean response (conditional on the two selected independent variables) is
MPG_City = 38.34 - 0.00357*Weight - 0.0256*Horsepower
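A quick sanity check of the written-out equation, as a sketch: it assumes the PROC GLMSELECT step above has already run and created `predStepwise`, and that the predicted values carry the default name `p_MPG_City`. The `check`, `manual`, and `diff` names are made up for illustration. The coefficients below are the rounded values quoted in this post, so small nonzero differences are expected.

```sas
/* Apply the rounded equation by hand and compare with the SCORE output */
data check;
   set predStepwise;
   manual = 38.34 - 0.00357*Weight - 0.0256*Horsepower;  /* rounded coefficients */
   diff   = manual - p_MPG_City;                         /* rounding error only  */
run;

/* The range of DIFF shows how far the hand-typed equation drifts */
proc means data=check n min max;
   var diff;
run;
```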

 
