Hi,
I have a robust regression that I am trying to run with multiple predictors. I'm using SAS University edition.
I see that proc glm produces a model summary table with an F-test and p-value for the overall model. However, proc robustreg does not produce such a table, and only seems to produce goodness-of-fit measures with no p-values. Currently, I am using this script:
Output from this script is attached. How do I get an F-test with a p-value, or something similar with a p-value, for the overall model in proc robustreg when using multiple predictors? I don't see instructions for how to do this in any of the online content or in the PDF on the procedure.
Thanks,
Alain
Computing a test statistic and p-value requires knowing the sampling distribution of the statistic in question. The sampling distributions of the robust R-squared and robust deviance are not known, which is why you do not see standard errors or p-values for them.
If necessary, you can use bootstrapping to approximate the sampling distribution and standard errors for these statistics. See the article "How to compute p-values for a bootstrap distribution."
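For illustration, here is a minimal sketch of how that bootstrap could be set up. It is not taken from the article mentioned above. It assumes the sashelp.cars data and the two-predictor model used elsewhere in this thread, and the data set names BootSamp and BootFit, the seed, and the number of replicates are arbitrary choices. GoodFit is, as far as I know, the ODS table name for the goodness-of-fit table in ROBUSTREG, but the column names in the resulting data set should be checked rather than assumed.

```sas
/* Sketch only: data set names, seed, and reps are illustrative choices */

/* 1. Draw bootstrap resamples (sampling with replacement) */
proc surveyselect data=sashelp.cars out=BootSamp noprint
     seed=12345          /* arbitrary seed, for reproducibility */
     method=urs          /* unrestricted random sampling = with replacement */
     samprate=1          /* each resample has the same size as the data */
     reps=1000           /* number of bootstrap resamples */
     outhits;            /* write one row per selection */
run;

/* 2. Fit the robust model to each resample and capture the fit statistics */
ods exclude all;         /* suppress printed output for the 1000 fits */
proc robustreg data=BootSamp method=MM;
   by Replicate;         /* Replicate is created by SURVEYSELECT */
   model mpg_city = horsepower weight;
   ods output GoodFit=BootFit;
run;
ods exclude none;

/* 3. Inspect the structure of the collected statistics before summarizing */
proc contents data=BootFit;
run;
proc print data=BootFit(obs=8);
run;
```

Once the variable names in BootFit are confirmed, the bootstrap distribution of the robust R-square can be summarized, for example with the PCTLPTS= option in the OUTPUT statement of PROC UNIVARIATE to get percentile-based limits.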
Many thanks for the prompt reply. I understand now why there are no p-values associated with the goodness of fit measures provided by proc robustreg. If I understand correctly, other than this bootstrapping approach, there is no way to generate a test statistic with a p-value for the overall fit of the model when using a model with multiple regressors in proc robustreg?
I am not an expert on using ROBUSTREG, so I won't swear that there is not a way. However, I do not know of a way. If you searched the documentation and didn't find it, then I'd guess that ROBUSTREG can't produce it.
However, a related question is "what is the final weighted least squares model" after you remove outliers and downweight observations.
For that, you can use the OUTPUT statement with the WEIGHT= option to output the final weights, then use PROC GLM to run a weighted OLS regression. That will give you a Type 3 F test for the final weighted OLS model, as follows:
proc robustreg data=sashelp.cars method=MM FWLS;
model mpg_city = horsepower;
output out=RROut weight=w;
run;
proc glm data=RROut;
weight w;
model mpg_City = horsepower;
run;
I am not suggesting this approach (to be honest, I'm not sure what it means...) but I am just letting you know that it is possible. Maybe a regression expert will chime in.
You can get an overall model test (against the null hypothesis that all parameters are zero) in ROBUSTREG with the TEST statement:
... But I prefer to keep it simple: Use ROBUSTREG to identify outliers, take them out (report on them) and do a standard regression to fit the data:
proc robustreg data=sashelp.cars method=MM;
model mpg_city = horsepower weight / cutoff=4;
Overall: test horsepower weight;
output out=RROut outlier=outlier;
run;
proc glm data=RROut;
where not outlier;
model mpg_City = horsepower weight;
run;
Many thanks to both of you. PG Stats, in terms of interpreting the results in the Robust Linear Test output table, which value should I use, Rho or Rn2 (attached)? Rn2 is significant while Rho very clearly is not. What is the rationale for trusting one value over the other?
If I use your alternative approach of identifying the outliers and running a standard regression, do I simply substitute the names of my dataset and variables? Do I go with a cutoff of 4? What other things do I need to change in the script?
The first test (Rho) is a robust version of the F test. So I guess it is the one you want.
The default cutoff value is 3. I put 4 just to show that it can be changed. What constitutes an outlier is to a great extent a matter of opinion.
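To see how much the choice of cutoff matters for a given data set, one option is to refit with a different value and count the flagged observations. The sketch below reuses the sashelp.cars example with the default cutoff of 3. The output data set name RROut3 is just an illustrative choice.

```sas
/* Sketch only: refit with cutoff=3 and count how many points are flagged */
proc robustreg data=sashelp.cars method=MM;
   model mpg_city = horsepower weight / cutoff=3;
   output out=RROut3 outlier=outlier;   /* outlier = 1 for flagged observations */
run;

/* Frequency of flagged (1) versus retained (0) observations */
proc freq data=RROut3;
   tables outlier;
run;
```

Running the same two steps with cutoff=4 and comparing the counts shows which observations change status between the two definitions.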