12-12-2016 06:42 PM
I have a robust regression that I am trying to run with multiple predictors. I'm using SAS University edition.
I see that proc glm produces a model summary table with an F-test and p-value for the overall model. However, proc robustreg does not produce such a table, and only seems to produce goodness-of-fit measures with no p-values. Currently, I am using this script:
Output from this script is attached. How do I get an F-test with a p-value, or something similar with a p-value, for the overall model in proc robustreg when using multiple predictors? I don't see instructions for how to do this in any of the online content or the PDF on the procedure.
12-12-2016 07:33 PM
A test statistic and p-value require knowing the sampling distribution of the statistic. The sampling distributions for the robust R-squared and robust deviance are not known, which is why you do not see standard errors or p-values for them.
If necessary, you can use bootstrapping to approximate the sampling distribution and standard errors for these statistics. See the article "How to compute p-values for a bootstrap distribution."
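For what it's worth, here is a minimal sketch of the resampling step, assuming sashelp.cars and a two-predictor model as stand-ins for your own data. The names BootSamp and BootFit, the seed, and the number of replicates are arbitrary choices for illustration. This only produces the bootstrap distribution of the goodness-of-fit statistics; turning that into a p-value needs the extra steps described in the article.

```sas
/* Step 1: draw 1000 bootstrap resamples, sampling with replacement */
proc surveyselect data=sashelp.cars out=BootSamp noprint
     seed=12345 method=urs samprate=1 reps=1000 outhits;
run;

/* Step 2: refit the robust model on each resample and save the fit table */
ods exclude all;                      /* suppress printed output for 1000 fits */
proc robustreg data=BootSamp method=MM;
   by Replicate;                      /* Replicate is created by SURVEYSELECT */
   model mpg_city = horsepower weight;
   ods output GoodFit=BootFit;        /* goodness-of-fit table, one set per replicate */
run;
ods exclude none;

/* Step 3: check the layout of the saved table before summarizing it */
proc contents data=BootFit;
run;
```

I have not listed the column names of the GoodFit table here; look at the PROC CONTENTS output before you compute percentiles or standard deviations from BootFit.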
12-12-2016 08:03 PM
Many thanks for the prompt reply. I understand now why there are no p-values associated with the goodness of fit measures provided by proc robustreg. If I understand correctly, other than this bootstrapping approach, there is no way to generate a test statistic with a p-value for the overall fit of the model when using a model with multiple regressors in proc robustreg?
12-12-2016 09:19 PM
I am not an expert on using ROBUSTREG, so I won't swear that there is not a way. However, I do not know of a way. If you searched the documentation and didn't find it, then I'd guess that ROBUSTREG can't produce it.
However, a related question is "what is the final weighted least squares model" after you remove outliers and downweight observations.
For that you can use the OUTPUT statement with the WEIGHT= option to output the final weights. Then use PROC GLM to run a weighted OLS regression. That will give you a Type 3 F test for the final weighted OLS model, as follows:
proc robustreg data=sashelp.cars method=MM FWLS;
model mpg_city = horsepower;
output out=RROut weight=w;
run;
proc glm data=RROut;
weight w;
model mpg_City = horsepower;
run;
I am not suggesting this approach (to be honest, I'm not sure what it means...) but I am just letting you know that it is possible. Maybe a regression expert will chime in.
12-12-2016 10:07 PM
You can get an overall model test (against the null hypothesis that the coefficients of all listed predictors are zero) in ROBUSTREG with the TEST statement:
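A minimal sketch, assuming sashelp.cars with horsepower and weight as the predictors (substitute your own dataset and variables). The label Overall is just a name for the test. As far as I know, the TEST statement is only available for M and MM estimation.

```sas
proc robustreg data=sashelp.cars method=MM;
   model mpg_city = horsepower weight;
   /* joint test that the horsepower and weight coefficients are both zero */
   Overall: test horsepower weight;
run;
```

The result appears in the Robust Linear Tests table, which reports a Rho test and an Rn2 test.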
12-12-2016 10:16 PM
... But I prefer to keep it simple: Use ROBUSTREG to identify outliers, take them out (report on them) and do a standard regression to fit the data:
proc robustreg data=sashelp.cars method=MM;
model mpg_city = horsepower weight / cutoff=4;
Overall: test horsepower weight;
output out=RROut outlier=outlier;
run;
proc glm data=RROut;
where not outlier;
model mpg_City = horsepower weight;
run;
12-13-2016 12:33 AM
Many thanks to both of you. PG Stats, in terms of interpreting the results of the Robust Linear Tests output table, which value should I use: Rho or Rn2 (attached)? Rn2 is significant while Rho very clearly is not. What is the rationale for trusting one value over the other?
If I use your alternative approach of identifying the outliers and running a standard regression, do I simply substitute the names of my dataset and variables? Do I go with a cutoff of 4? What other things do I need to change in the script?
12-13-2016 01:00 AM
The first test (Rho) is a robust version of the F test. So I guess it is the one you want.
The default cutoff value is 3. I put 4 just to show that it can be changed. What constitutes an outlier is to a great extent a matter of opinion.