alainc99
Calcite | Level 5

Hi,

 

I am trying to run a robust regression with multiple predictors. I'm using SAS University Edition.

 

I see that proc glm produces a model summary table with an F-test and p-value for the overall model. However, proc robustreg does not produce such a table, and only seems to produce goodness-of-fit measures with no p-values. Currently, I am using this script:

PROC ROBUSTREG DATA=SASDATA.ROBUST_FINAL;
MODEL Mean1_age_residual = MAP hbA1c LDL;
run;

 

Output from this script is attached. How do I get an F-test with a p-value, or something similar with a p-value, for the overall model in proc robustreg when using multiple predictors? I don't see instructions for how to do this in any of the online content or the PDF on the procedure.

 

Thanks,
Alain

7 REPLIES 7
Rick_SAS
SAS Super FREQ

A test statistic and p-value require knowing the sampling distribution of the parameter estimates. The sampling distributions for the robust R-squared and robust deviance are not known, which is why you do not see standard errors or p-values.

 

If necessary, you can use bootstrapping to approximate the sampling distribution and standard errors for these statistics. See the article "How to compute p-values for a bootstrap distribution."
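For readers who want to try the bootstrap idea, here is a minimal sketch, not a definitive implementation. It resamples cases with replacement, refits the robust model on each resample, and summarizes the bootstrap distribution of one estimate with a percentile interval. The data set names BootSamp, BootPE, and BootCI, the seed, and the number of resamples are all assumptions chosen for illustration, and sashelp.cars stands in for your own data. The same pattern could be applied to the goodness-of-fit statistics by capturing that ODS table instead.

```sas
/* Sketch only: BootSamp, BootPE, BootCI, the seed, and reps=1000 are illustrative choices */

/* 1. Draw 1000 bootstrap resamples (sampling with replacement, same size as the data) */
proc surveyselect data=sashelp.cars out=BootSamp noprint
                  seed=12345 method=urs samprate=1 outhits reps=1000;
run;

/* 2. Fit the robust model to each resample; suppress the printed output */
ods exclude all;
proc robustreg data=BootSamp method=MM;
   by Replicate;
   model mpg_city = horsepower;
   ods output ParameterEstimates=BootPE;   /* one set of estimates per replicate */
run;
ods exclude none;

/* 3. Summarize the bootstrap distribution of the slope with a 95% percentile interval */
proc univariate data=BootPE noprint;
   where upcase(Parameter) = "HORSEPOWER";
   var Estimate;
   output out=BootCI pctlpts=2.5 97.5 pctlpre=P_;
run;

proc print data=BootCI;
run;
```

Fitting an MM model 1000 times can take a while, so it may be worth starting with a smaller REPS= value to check that everything runs.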

alainc99
Calcite | Level 5

Many thanks for the prompt reply. I understand now why there are no p-values associated with the goodness of fit measures provided by proc robustreg. If I understand correctly, other than this bootstrapping approach, there is no way to generate a test statistic with a p-value for the overall fit of the model when using a model with multiple regressors in proc robustreg?

Rick_SAS
SAS Super FREQ

I am not an expert on using ROBUSTREG, so I won't swear that there is not a way.  However, I do not know of a way. If you searched the documentation and didn't find it, then I'd guess that ROBUSTREG can't produce it.

 

However, a related question is "what is the final weighted least squares model" after you remove outliers and downweight observations.

For that you can use the OUTPUT statement with the WEIGHT= option to output the final weights. Then use PROC GLM to run a weighted OLS regression. That will give you a Type 3 F test for the final weighted OLS model, as follows:

 

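/* Robust MM fit; FWLS also requests the final weighted least squares fit */
proc robustreg data=sashelp.cars method=MM FWLS;
   model mpg_city = horsepower;
   output out=RROut weight=w;   /* w = final robust weight for each observation */
run;

/* Weighted OLS regression using the robust weights */
proc glm data=RROut;
   weight w;
   model mpg_city = horsepower;
run;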

I am not suggesting this approach (to be honest, I'm not sure what it means...) but I am just letting you know that it is possible. Maybe a regression expert will chime in.

PGStats
Opal | Level 21

You can get an overall model test (null hypothesis: the coefficients of all the listed effects are zero) in ROBUSTREG with the TEST statement:

 

PROC ROBUSTREG DATA=SASDATA.ROBUST_FINAL;
MODEL Mean1_age_residual = MAP hbA1c LDL;
overall: TEST MAP hbA1c LDL;
run;
PG
PGStats
Opal | Level 21

...  But I prefer to keep it simple: Use ROBUSTREG to identify outliers, take them out (report on them) and do a standard regression to fit the data:

 

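/* Flag outliers with a robust MM fit */
proc robustreg data=sashelp.cars method=MM;
   model mpg_city = horsepower weight / cutoff=4;
   Overall: test horsepower weight;
   output out=RROut outlier=outlier;   /* outlier = 1 for flagged observations */
run;

/* Standard regression on the observations not flagged as outliers */
proc glm data=RROut;
   where not outlier;
   model mpg_city = horsepower weight;
run;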
PG
alainc99
Calcite | Level 5

Many thanks to both of you. PGStats, in terms of interpreting the results of the Robust Linear Test output table, which value should I use, Rho or Rn2 (attached)? Rn2 is significant while Rho very clearly is not. What is the rationale for trusting one value over the other?

If I use your alternative approach of identifying the outliers and running a standard regression, do I simply substitute the names of my dataset and variables? Do I go with a cutoff of 4? What other things do I need to change in the script?

PGStats
Opal | Level 21

See http://support.sas.com/documentation/cdl/en/statug/68162/HTML/default/statug_rreg_details01.htm#stat...

The first test (Rho) is a robust version of the F test. So I guess it is the one you want.

 

The default cutoff value is 3. I put 4 just to show that it can be changed. What constitutes an outlier is to a great extent a matter of opinion.
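As a rough illustration of the substitution asked about above, here is a sketch that puts the data set and variable names from the original post into the two-step approach. It is untested against that data. The names RROut and outlier are carried over from the earlier example, cutoff=3 simply spells out the default, and the PROC FREQ step is an addition to show how many observations were flagged before they are dropped.

```sas
/* Sketch only: names taken from the original post; not run against that data */

/* Step 1: robust MM fit, overall robust test, and an outlier flag */
proc robustreg data=SASDATA.ROBUST_FINAL method=MM;
   model Mean1_age_residual = MAP hbA1c LDL / cutoff=3;   /* 3 is the default multiplier */
   Overall: test MAP hbA1c LDL;
   output out=RROut outlier=outlier;                      /* outlier = 1 for flagged rows */
run;

/* Step 2: count the flagged observations so they can be reported */
proc freq data=RROut;
   tables outlier;
run;

/* Step 3: standard regression without the flagged observations */
proc glm data=RROut;
   where not outlier;
   model Mean1_age_residual = MAP hbA1c LDL;
run;
```

Rerunning step 1 with a different CUTOFF= value and comparing the PROC FREQ counts is one way to see how sensitive the outlier list is to that choice.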

PG

