In PROC GLM, you can use the SOLUTION option and the CLPARM option in the MODEL statement to obtain the coefficient estimates and their confidence intervals.
Thank you so much for your quick reply.
I now tried the following:
PROC STDIZE DATA = dataset OUT = std_dataset;
VAR X;
RUN;
ODS OUTPUT ParameterEstimates= ParameterEstimates;
PROC GLM data = std_dataset;
CLASS y1 y2;
MODEL X = y1 y2 y3 / SOLUTION CLPARM;
ODS SELECT ParameterEstimates;
QUIT;
...and it works!
Typically, the independent variables in a regression model are named X1 X2 ... and the response is named Y. You have reversed these. This is not a problem for SAS as long as you remain consistent, but having Y predict X may make it hard to communicate what you are doing, because people expect X to predict Y.
However, your mistake is in PROC STDIZE, which should not be standardizing the response. It should be standardizing the continuous independent variables (not the dummy variables). So perhaps your naming scheme has confused you as well.
Hi Paige: Please look at "Standardized regression coefficients," which shows that you must standardize the response variable if you want to reproduce the results of the STB option in PROC REG.
Maybe splitting hairs here, but the OP did not specifically request to match the STB option in PROC REG. So in my mind, standardizing just the continuous independent variables and not the Y variables satisfies the original request, and allows comparisons of standardized regression coefficients. But the most recent code from @LB1993 doesn't do either your method or my method.
So sorry for the confusion caused and thanks again for your help!
Please read the article "Standardized regression coefficients" for an explanation of standardized regression coefficients and how to interpret them. Specifically, the article states, "the standardized coefficients predict the number of standard deviations that the response will change for one STANDARD DEVIATION of change in an explanatory variable." The concept of a "standard deviation" is generally applied to CONTINUOUS variables, not discrete classification variables. For example, if you include Sex = "Male" | "Female" as a classification variable in a model, it doesn't make sense to ask how the response changes for "one standard deviation of change in sex."
Consequently, the GLM procedure does not support the STB option that PROC REG uses to display standardized regression estimates. It is possible to perform the computation manually by storing the response variable and the design matrix, using PROC STDIZE as shown in the article, and then using the standardized variables in PROC REG. However, I don't think the result will be meaningful.
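For anyone who wants to try the manual computation anyway, here is a rough sketch of how it might look. It is only an illustration under several assumptions: it uses Sashelp.Class rather than your data, the data set names Design and DesignStd are made up, and it relies on the OUTDESIGN= option and the _GLSMOD macro variable in PROC GLMSELECT to write out the design matrix and the names of its columns. Because the dummy columns for a CLASS effect are linearly dependent with the intercept, PROC REG is expected to report that the model is not full rank.

```sas
/* Sketch only: Sashelp.Class stands in for the real data. */
/* Step 1: write the design matrix (dummy columns for Sex) to a data set. */
proc glmselect data=Sashelp.Class outdesign(addinputvars)=Design noprint;
   class Sex;
   model Weight = Height Age Sex / selection=none;
run;

/* Step 2: standardize the response and every design column. */
/* &_GLSMOD is assumed to hold the design column names.      */
proc stdize data=Design out=DesignStd method=std;
   var Weight &_GLSMOD;
run;

/* Step 3: fit the standardized variables; CLB requests confidence limits. */
proc reg data=DesignStd plots=none;
   model Weight = &_GLSMOD / clb;
quit;
```

The estimates for Height and Age can be read as standardized coefficients. The estimates for the standardized dummy columns are the part that is hard to interpret.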
Thank you for sharing your thoughts on this. That is, you would refrain from calculating the standardised betas altogether and rather calculate the unstandardised betas and respective CIs?
Yes, that is what I meant.
But here's another idea that you might consider. If your explanatory variables are on vastly different scales, it makes sense to compute standardized coefficient estimates, but ONLY for the continuous variables. Maybe that's a reasonable compromise? To do that, follow the instructions in the article: use PROC STDIZE to standardize the response and the continuous regressors. Then specify the standardized variables and the (unstandardized) classification variables on the MODEL statement in PROC GLM. That will enable you to compare the size of the betas for the (standardized) continuous regressors. The coefficients for the classification variables will have their usual interpretations.
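A minimal sketch of that compromise, using Sashelp.Class as a stand-in for your data (the variable names and the output data set name ClassStd are just for illustration):

```sas
/* Sketch only: standardize the response and the continuous regressors. */
/* The classification variable Sex is left untouched.                   */
proc stdize data=Sashelp.Class out=ClassStd method=std;
   var Weight Height Age;
run;

/* Fit the model; SOLUTION prints the estimates, CLPARM adds their CIs. */
proc glm data=ClassStd;
   class Sex;
   model Weight = Height Age Sex / solution clparm;
   ods select ParameterEstimates;
quit;
```

With this setup, the estimates for Height and Age are in standard-deviation units and can be compared to each other, while the estimate for Sex is a difference between levels, measured in standard deviations of the response.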
I had forgotten about PROC GLMSELECT. Nice find, @Ksharp!
Thanks so much for your help!
Is there a way to calculate CIs for the standardized betas via PROC GLMSELECT (so far I could not find any)?
What is the advantage of PROC GLMSELECT over the SOLUTION option and the CLPARM option in PROC GLM as mentioned by PaigeMiller?
Sorry, I have no idea about that.
I think the advantage of PROC GLMSELECT is that you can get STB directly, especially when you have a CLASS variable, which PROC REG does not support.
data DrugTest; input Drug $ Gender $ X Y @@;
datalines;
A F 9 25 A F 3 19 A F 4 18 A F 11 28 A F 7 23 A M 11 27 A M 9 24 A M 9 25 A M 10 28 A M 10 26 D F 4 37 D F 12 54 D F 3 33 D F 6 41 D F 9 47 D M 5 36 D M 4 36 D M 7 40 D M 10 46 D M 8 42 G F 10 70 G F 11 75 G F 7 60 G F 9 69 G F 10 71 G M 3 47 G M 8 60 G M 11 70 G M 4 49 G M 4 50
;
ods show;
ods select ParameterEstimates;
ods show;
proc glm data=DrugTest;
class Drug Gender;
model Y = Drug Gender Drug*Gender /clparm solution;
quit;
proc glmselect data=DrugTest;
class Drug Gender;
model Y = Drug Gender Drug*Gender/ selection=none stb showpvalues;
run;
Thanks so much for your explanation and the screenshots!
That helped a lot!