jlp2ba
Fluorite | Level 6

Hello, 

 

Release: 3.7 (Enterprise Edition)

Java Version: 1.7.0_151

 

I want to compare the change in F-value, R-squared, t-value, etc. between two regression models (one predictor and one criterion in each). Right now I'm running two separate regression models as my way of comparing them, but I'd like to see output that details what changes in the second model, which removes one observation that had a high Cook's D.

 

My code looks like this right now:

title2 'Model1 with DMUS_pre only';
proc reg data = wombat.popedataset2 plots(label)=(RStudentByLeverage CooksD);
  model dmus_post = dmus_pre / ss1 ss2 stb clb corrb influence r cli clm;
run;

*REMOVE HIGH COOK'S D STUDENT;
title2 'Model2 with DMUS_pre only MINUS ONE PARTICIPANT';
proc reg data = wombat.popedataset2 plots(label)=(RStudentByLeverage CooksD);
  model dmus_post = dmus_pre / ss1 ss2 stb clb corrb influence r cli clm;
  where participant_ ^= 804; 
run;

Thank you!

1 ACCEPTED SOLUTION

PaigeMiller
Diamond | Level 26

@jlp2ba wrote:

PaigeMiller, 

 

I was wondering if there was a way to identify the change in r-squared, or if I can do model 2 r-sq minus model 1 r-sq.


I would recommend actually making a plot of the data with the two different regression lines shown, and see what the difference is.

--
Paige Miller
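
The plot suggested above could be sketched roughly as follows. This is only a minimal illustration, not code from the thread: the data set and variable names come from the original post, while `work.plotdata` and `y_without` are hypothetical names introduced here. It assumes `participant_` is numeric, as the original WHERE statement implies.

```sas
/* Copy of the response that is set to missing for participant 804,  */
/* so a fit on y_without excludes that one observation.              */
/* work.plotdata and y_without are illustrative names only.          */
data work.plotdata;
  set wombat.popedataset2;
  if participant_ ne 804 then y_without = dmus_post;
run;

proc sgplot data=work.plotdata;
  /* Fit on all observations; this statement also draws the scatter points */
  reg x=dmus_pre y=dmus_post / legendlabel="All participants";
  /* Fit without participant 804; markers suppressed to avoid overplotting */
  reg x=dmus_pre y=y_without / nomarkers lineattrs=(pattern=dash)
      legendlabel="Without participant 804";
  xaxis label="DMUS_pre";
  yaxis label="DMUS_post";
run;
```

If the two lines nearly coincide, the high Cook's D observation has little practical effect on the fit; if they visibly diverge, the plot shows where and by how much.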


6 REPLIES
PaigeMiller
Diamond | Level 26

I'm not aware of any statistical test that compares a regression on n observations with the same regression run on n-1 of those observations. In fact, doing a statistical test in this situation seems to me unnecessary and improper. Except in some trivial situations, the R-squared and t-values from the two fits will differ, not in a hypothesis-testing sense (which doesn't make sense here), but simply because they are different numbers.

--
Paige Miller
jlp2ba
Fluorite | Level 6

PaigeMiller, 

 

Thank you for your reply.  I'm a novice at this so it may be a poor question.  I've just noticed that removing the high-influence observation improved the fit and increased the r-squared value and F-value.  I was wondering if there was a way to identify the change in r-squared, or if I can do model 2 r-sq minus model 1 r-sq.
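
One way to get the simple difference described here (model 2 R-squared minus model 1 R-squared) is to save each fit to a data set and subtract. The sketch below is an assumption-laden illustration rather than anything posted in the thread: the `work.est_all`, `work.est_drop`, and `work.compare` data sets and the renamed variables are hypothetical. The result is a descriptive difference only, not a statistical test.

```sas
/* Model 1: all observations. RSQUARE adds _RSQ_ to the OUTEST= data set. */
proc reg data=wombat.popedataset2 outest=work.est_all rsquare noprint;
  model dmus_post = dmus_pre;
run;
quit;

/* Model 2: same model without participant 804. */
proc reg data=wombat.popedataset2 outest=work.est_drop rsquare noprint;
  where participant_ ^= 804;
  model dmus_post = dmus_pre;
run;
quit;

/* Each OUTEST= data set has one row; a one-to-one merge puts both fits */
/* side by side. The dmus_pre column holds the estimated slope.         */
data work.compare;
  merge work.est_all (keep=_RSQ_ dmus_pre
                      rename=(_RSQ_=rsq_all dmus_pre=slope_all))
        work.est_drop(keep=_RSQ_ dmus_pre
                      rename=(_RSQ_=rsq_drop dmus_pre=slope_drop));
  rsq_change   = rsq_drop - rsq_all;     /* model 2 minus model 1 */
  slope_change = slope_drop - slope_all;
run;

proc print data=work.compare noobs;
run;
```

The INFLUENCE option already used in the original MODEL statement also reports deletion diagnostics such as DFFITS and DFBETAS, which describe how the fit changes when each observation is left out.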

PGStats
Opal | Level 21

Try running

 

proc robustreg data = wombat.popedataset2;
  model dmus_post = dmus_pre / diagnostics;
run;

to see if your suspected outlier is identified by the procedure.

 

PG
