trungdungtran
Obsidian | Level 7

Hi all,

 

Using PROC REG, we can easily obtain confidence intervals (CIs) for the beta parameters. I would now like to obtain a CI for another quantity in the output: the Root MSE. Root MSE is an estimate of the standard deviation of the measurement errors.

 

Proc reg data=lm1 plots=none;
Model y=x/clb;
Run;

 

Could you tell me how to obtain the CI for it?

 

[Attached screenshot: MSE.png]

 

Kind regards,

Trung Dung.

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

I'm not aware of a way to compute the confidence interval for root MSE in SAS. Apparently, Google is not aware of a way to do this either.

 

You could always use bootstrap or jackknife methods to obtain confidence intervals for any estimate.

--
Paige Miller
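
The bootstrap suggestion above can be sketched as follows. This is a minimal percentile-bootstrap sketch, not code from the reply: it assumes case resampling with PROC SURVEYSELECT, uses SASHELP.CLASS with weight regressed on height as a stand-in dataset, and the seed, the number of resamples, and the dataset names boot, bootfs, and bootci are arbitrary choices.

```sas
%let nboot=2000;  /* assumed number of bootstrap resamples */

/* Draw &nboot resamples of the rows, with replacement, same size as the data */
proc surveyselect data=sashelp.class out=boot seed=12345
                  method=urs samprate=1 outhits reps=&nboot noprint;
run;

/* Fit the regression in every resample and keep only Root MSE */
ods exclude all;
ods noresults;
ods output fitstatistics=bootfs(keep=replicate label1 nvalue1
                                where=(label1='Root MSE')
                                rename=(nvalue1=Root_MSE));
proc reg data=boot plots=none;
by replicate;
model weight=height;
quit;
ods exclude none;
ods results;

/* 2.5th and 97.5th percentiles of the bootstrap Root MSE values
   give an approximate 95% percentile interval */
proc univariate data=bootfs noprint;
var Root_MSE;
output out=bootci pctlpts=2.5 97.5 pctlpre=p_;
run;

proc print data=bootci noobs;
run;
```

A percentile interval of this kind is only approximate, and with a sample as small as this one its coverage may fall below the nominal level.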

trungdungtran
Obsidian | Level 7

Yes, I searched Google and could not find anything like that, but at least I now know that it is not an available option in SAS.

 

Thank you for your reply @PaigeMiller.

FreelanceReinh
Jade | Level 19

Hi @trungdungtran,

 

As you say correctly, Root MSE is an estimate for the standard deviation σ of the error term in the linear regression model. So, given this point estimate, it's a reasonable question to ask for a confidence interval estimate -- for σ, not for Root MSE. (Edit: Note that this is perfectly analogous to the CIs you mentioned. They are for the true parameters, not for the estimates.)

Under the usual normality assumption, i.e., the error term has a normal distribution with mean 0 and standard deviation σ, an exact confidence interval for σ can be computed easily from the error sum of squares, the corresponding degrees of freedom (see PROC REG output) and quantiles of the corresponding chi-square distribution.
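
For reference, the interval described above can be written out as follows. This is a sketch in standard notation, not a quotation from any reference: σ denotes the error standard deviation, SSE the error sum of squares, ν its degrees of freedom, and χ²(p, ν) the p-quantile of the chi-square distribution with ν degrees of freedom (what CINV(p, ν) returns in SAS).

```latex
\frac{\mathrm{SSE}}{\sigma^2} \sim \chi^2_{\nu}
\;\Longrightarrow\;
P\!\left( \chi^2_{\alpha/2,\,\nu} \le \frac{\mathrm{SSE}}{\sigma^2} \le \chi^2_{1-\alpha/2,\,\nu} \right) = 1-\alpha
\;\Longrightarrow\;
\sqrt{\frac{\mathrm{SSE}}{\chi^2_{1-\alpha/2,\,\nu}}} \;\le\; \sigma \;\le\; \sqrt{\frac{\mathrm{SSE}}{\chi^2_{\alpha/2,\,\nu}}}
```

Because Root MSE equals the square root of SSE/ν, the same limits can also be obtained by multiplying Root MSE by the square root of ν divided by the respective chi-square quantile.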

 

Example:

/* Save the Error row of the ANOVA table (DF, SS) and the Root MSE
   from the fit statistics as datasets */
ods output anova=anova(keep=source df ss where=(source='Error'))
           fitstatistics=fs(keep=label1 nvalue1 where=(label1='Root MSE') rename=(nvalue1=Root_MSE));

proc reg data=sashelp.class plots=none;
model weight=height / clb;
quit;

%let alpha=0.05;

data _null_;
set anova;
set fs;
lcl=sqrt(ss/cinv(1-&alpha/2,df)); /* lower (1-&alpha)*100% confidence limit for sigma */
ucl=sqrt(ss/cinv(&alpha/2,df));   /* upper (1-&alpha)*100% confidence limit for sigma */
put (lcl Root_MSE ucl) (=6.3);
run;

Result:

lcl=8.424 Root_MSE=11.226 ucl=16.830
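
As a rough hand check of these limits (a sketch: SSE is reconstructed here from the rounded Root MSE with 17 error degrees of freedom, i.e. 19 observations minus 2 regression parameters, and the chi-square quantiles are rounded table values, so only the first decimals are meaningful):

```latex
\mathrm{SSE} \approx 17 \times 11.226^2 \approx 2142.4, \qquad
\chi^2_{0.975,\,17} \approx 30.191, \qquad
\chi^2_{0.025,\,17} \approx 7.564
\\[4pt]
\mathrm{lcl} \approx \sqrt{2142.4 / 30.191} \approx 8.42, \qquad
\mathrm{ucl} \approx \sqrt{2142.4 / 7.564} \approx 16.83
```

Note that the interval is not symmetric around Root_MSE, because the chi-square distribution is skewed.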

 

Reference: S.R. Searle, Linear Models (cf. page 414, formula 59).

 

Still not convinced? Perform a simulation:

%let alpha=0.05;
%let b0=-143;     /* (arbitrary) intercept */
%let b1=3.9;      /* (arbitrary) slope */
%let sigma=12.34; /* (arbitrary) error standard deviation */

/* Simulate 100000 datasets with WGT values calculated from HEIGHT 
   using a linear regression model */

data sim;
call streaminit(27182818);
set sashelp.class(keep=height);
do i=1 to 100000;
  wgt=&b0+&b1*height+rand('norm',0,&sigma);
  output;
end;
run;

proc sort data=sim;
by i;
run;

/* Perform regression analyses */

ods exclude all;
ods noresults;
ods output anova=anova(keep=i source df ss where=(source='Error'));
proc reg data=sim plots=none;
by i;
model wgt=height;
quit;
ods exclude none;
ods results;

/* Compute (1-&alpha)*100% confidence intervals for &sigma 
   and determine if true value is covered or not (c=1 | c=0) */

data chk;
set anova;
lcl=sqrt(ss/cinv(1-&alpha/2,df));
ucl=sqrt(ss/cinv(&alpha/2,df));
c=(lcl<=&sigma<=ucl);
run;

/* Estimate coverage probability */

ods exclude BinomialTest;
proc freq data=chk;
tables c / binomial(level='1');
run;

Result:

                              Cumulative    Cumulative
c    Frequency     Percent     Frequency      Percent
------------------------------------------------------
0        5025        5.03          5025         5.03
1       94975       94.98        100000       100.00


      Binomial Proportion
             c = 1

Proportion                0.9498
ASE                       0.0007
95% Lower Conf Limit      0.9484
95% Upper Conf Limit      0.9511

Exact Conf Limits
95% Lower Conf Limit      0.9484
95% Upper Conf Limit      0.9511

Sample Size = 100000

The result is what you'd expect with a true 95% coverage probability. 

trungdungtran
Obsidian | Level 7

Thanks, @FreelanceReinh. You understood my question better than I managed to express it.

 

Actually, I am learning macro to simulate data and assess the performance of the model. I started with a linear regression model with one covariate. Three parameters are involved: intercept, slope, and sigma. I could do this for the betas, but for sigma I only knew that Root MSE is an estimate of it.

 

Now your answer helps me obtain the CI for sigma as well. With that, I can compute the coverage probability, as I had already done for the betas.

 

I appreciate your help!

FreelanceReinh
Jade | Level 19

You're welcome, @trungdungtran.


@trungdungtran wrote:

Actually, I am learning macro to simulate data and assess the performance of the model.


Macros and simulation? Make sure you read Rick Wicklin's blog post "Simulation in SAS: The slow way or the BY way" the sooner the better. (Quote: "Never use a macro loop to create a simulation.") There's also a more specific article in Rick's blog: "Simulate data for a linear regression model".

trungdungtran
Obsidian | Level 7

Thank you, @FreelanceReinh, for the suggestion about the blog post.

 

I am learning the macro language, so I am treating this as a practice exercise.
