Re: How do I find overall model fit for proc robustreg?

alainc99 · Posted 12-12-2016 06:42 PM

Hi,

I have a robust regression that I am trying to run with multiple predictors. I'm using SAS University edition.

I see that proc glm produces a model summary table with an F-test and p-value for the overall model. However, proc robustreg does not produce such a table, and only seems to produce goodness-of-fit measures with no p-values. Currently, I am using this script:

PROC ROBUSTREG DATA=SASDATA.ROBUST_FINAL;

MODEL Mean1_age_residual = MAP hbA1c LDL;

run;

Output from this script is attached. How do I get an F-test with a p-value, or something similar with a p-value, for the overall model in proc robustreg when using multiple predictors? I don't see instructions for how to do this in any of the online content or the PDF on the procedure.

Thanks,
Alain

Rick_SAS · Posted 12-12-2016 07:33 PM

A test statistic and p-value require knowing the sampling distribution of the statistic. The sampling distributions for the robust R-squared and robust deviance are not known, which is why you do not see standard errors or p-values.

If necessary, you can use bootstrapping to approximate the sampling distribution and standard errors for these statistics. See the article "How to compute p-values for a bootstrap distribution."
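For illustration, here is a rough sketch of what a case-resampling bootstrap might look like for the model in the question. The data set and variable names are taken from the original post. The seed, the number of replicates, and the output data set names are arbitrary choices, and the ODS table name GoodFit is an assumption that should be checked against the ROBUSTREG documentation (or with ODS TRACE ON) before relying on it.

```sas
/* Sketch only: draw 1000 bootstrap samples (with replacement) of the rows.
   seed= and reps= are arbitrary illustrative values. */
proc surveyselect data=SASDATA.ROBUST_FINAL out=BootSamp noprint
     seed=12345 method=urs samprate=1 reps=1000 outhits;
run;

/* Refit the robust model on every bootstrap sample.
   GoodFit is the assumed ODS name of the goodness-of-fit table. */
ods exclude all;               /* suppress printed output for each fit */
proc robustreg data=BootSamp;
   by Replicate;
   model Mean1_age_residual = MAP hbA1c LDL;
   ods output GoodFit=BootFit;
run;
ods exclude none;

/* Inspect the collected statistics before summarizing them */
proc print data=BootFit(obs=10);
run;
```

The spread of each statistic across the replicates in BootFit approximates its sampling distribution, from which a standard error or percentile interval can be computed.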

alainc99 · Posted 12-12-2016 08:03 PM

Many thanks for the prompt reply. I understand now why there are no p-values associated with the goodness of fit measures provided by proc robustreg. If I understand correctly, other than this bootstrapping approach, there is no way to generate a test statistic with a p-value for the overall fit of the model when using a model with multiple regressors in proc robustreg?

Rick_SAS · Posted 12-12-2016 09:19 PM

I am not an expert on using ROBUSTREG, so I won't swear that there is not a way. However, I do not know of a way. If you searched the documentation and didn't find it, then I'd guess that ROBUSTREG can't produce it.

However, a related question is: what is the final weighted least squares model after you remove outliers and downweight observations?

For that you can use the OUTPUT statement with the WEIGHT= option to output the final weights. Then use PROC GLM to run a weighted OLS regression. That will give you a Type 3 F test for the final weighted OLS model, as follows:

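/* Robust MM fit; save the final observation weights in variable w */
proc robustreg data=sashelp.cars method=MM FWLS;
   model mpg_city = horsepower;
   output out=RROut weight=w;
run;

/* Weighted OLS using those weights; reports the usual F test */
proc glm data=RROut;
   weight w;
   model mpg_city = horsepower;
run;
quit;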

I am not suggesting this approach (to be honest, I'm not sure what it means...) but I am just letting you know that it is possible. Maybe a regression expert will chime in.

PGStats · Posted 12-12-2016 10:07 PM

You can get an overall model test (against the hypothesis that all the listed parameters are zero) in ROBUSTREG with the TEST statement:

PROC ROBUSTREG DATA=SASDATA.ROBUST_FINAL;

MODEL Mean1_age_residual = MAP hbA1c LDL;

overall: TEST MAP hbA1c LDL;

run;

PG

PGStats · Posted 12-12-2016 10:16 PM

... But I prefer to keep it simple: Use ROBUSTREG to identify outliers, take them out (report on them) and do a standard regression to fit the data:

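/* Robust fit; flag observations beyond the cutoff as outliers */
proc robustreg data=sashelp.cars method=MM;
   model mpg_city = horsepower weight / cutoff=4;
   Overall: test horsepower weight;
   output out=RROut outlier=outlier;
run;

/* Standard regression on the observations not flagged as outliers */
proc glm data=RROut;
   where not outlier;
   model mpg_city = horsepower weight;
run;
quit;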

PG

alainc99 · Posted 12-13-2016 12:33 AM

Many thanks to both of you. PGStats, in terms of interpreting the results of the Robust Linear Test output table, which value should I use: Rho or Rn2 (attached)? Rn2 is significant while Rho very clearly is not. What is the rationale for trusting one value over the other?

If I use your alternative approach of identifying the outliers and running a standard regression, do I simply substitute the names of my dataset and variables? Do I go with a cutoff of 4? What other things do I need to change in the script?

PGStats · Posted 12-13-2016 01:00 AM

See http://support.sas.com/documentation/cdl/en/statug/68162/HTML/default/statug_rreg_details01.htm#stat...

The first test (Rho) is a robust version of the F test. So I guess it is the one you want.

The default cutoff value is 3. I put 4 just to show it could be changed. What constitutes an outlier is to a great extent a matter of opinion.
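As a sketch, the outlier-removal approach with the data set and variable names from the original question substituted in might look like the following. It assumes method=MM is wanted, as in the sashelp.cars example (the original script did not specify a method), and it omits cutoff= so that the default of 3 applies. RROut and outlier are just illustrative names.

```sas
/* Sketch only: robust fit on the poster's data, flagging outliers.
   No cutoff= option is given, so the default cutoff of 3 is used. */
proc robustreg data=SASDATA.ROBUST_FINAL method=MM;
   model Mean1_age_residual = MAP hbA1c LDL;
   Overall: test MAP hbA1c LDL;
   output out=RROut outlier=outlier;
run;

/* Ordinary regression on the observations not flagged as outliers */
proc glm data=RROut;
   where not outlier;
   model Mean1_age_residual = MAP hbA1c LDL;
run;
quit;
```

Printing the rows of RROut where outlier=1 before dropping them makes it easier to report which observations were excluded.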

PG
