Shivi82
Quartz | Level 8

Dear All,

 

I have two linear regression models and I am comparing their results. In both cases the independent variable is weight and the dependent variable is price.

 

MODEL 1:

The slope for weight in model 1 is 1.392 with a significance value of .000 & standard error of .009

 

MODEL 2:

The slope for weight in model 2 is 1.375 with a significance value of .000 & standard error of .008

 

 

From the above output, weight is a good predictor variable in both models at an alpha of 0.05. However, the two slope values differ very slightly. Does this signify anything? Please note that the first model has no other predictor variable, whereas model 2 has two additional variables.

 

I expected the inclusion of additional independent variables (IVs) to affect my R-square, not the slope value. Kindly suggest.

 

Thanks, Shivi

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

Including additional independent variables can change the coefficient of weight, unless the additional independent variables are orthogonal to weight (unlikely, unless it is a designed experiment).

 

I suppose you could still do a statistical comparison of the slopes, but I think that's sort of meaningless here: you already know the slopes will be different because of the additional two independent variables.
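To see the effect described above, here is a minimal simulated sketch. The data set sim, the seed, and all coefficients are made up for illustration; only the variable names price, weight, var1 and var2 follow the thread. var1 is built to be correlated with weight, while var2 is generated independently of it.

```sas
/* Hypothetical simulated data, not the original poster's data */
data sim;
  call streaminit(2024);                        /* fixed seed for reproducibility */
  do i = 1 to 500;
    weight = rand('normal', 50, 10);
    var1   = 0.5*weight + rand('normal', 0, 5); /* correlated with weight */
    var2   = rand('normal', 0, 5);              /* generated independently of weight */
    price  = 10 + 1.3*weight + 0.2*var1 + 0.5*var2 + rand('normal', 0, 3);
    output;
  end;
run;

/* Fit the one-predictor and three-predictor models to the same data */
proc reg data=sim;
  m1: model price = weight;
  m2: model price = weight var1 var2;
run;
quit;
```

Under these assumptions the weight slope in m1 should come out near 1.3 + 0.2*0.5 = 1.4, because it absorbs the part of var1 that moves with weight, while in m2 it should be near 1.3. Leaving var2 out changes the slope very little, since var2 is nearly orthogonal to weight.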

--
Paige Miller


10 REPLIES 10
Reeza
Super User
You should look at the adjusted R square as well, as it attempts to account for the number of variables in your model.
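For reference, the usual definition of adjusted R-square is shown below. The symbols n (number of observations) and p (number of predictors, excluding the intercept) are introduced here and do not come from the thread.

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Unlike R-square, this quantity can go down when an added variable contributes too little to offset the lost degree of freedom.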
M_Maldonado
Barite | Level 11

Shivi,

Are you using Enterprise Miner?

What is the ASE (average squared error), and what does the distribution of the predicted price look like for each model? Fit statistics are really important, but actually looking at the predicted dependent variable helps you assess whether you need extra work (transformations, other methods, ensemble methods, etc.).

 

Also, why do you only have one independent variable? This is the big data world 🙂 and you and SAS can handle more. Really curious here....

[Edit: Scratch this question. I just noticed you mentioned you have two other variables.]

 

Thanks!

Shivi82
Quartz | Level 8

Hi,

I was running stepwise regression, so in the first step the model had one significant variable. Further significant variables were added in later steps.

 

Regards, Shivi

PGStats
Opal | Level 21

The simplest way to test the equality of the slopes is to stack the two data sets with

 

 

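proc sql;
/* Stack set1 and set2; var1 and var2 are set to 0 in set1, which lacks them. */
/* OUTER UNION CORR aligns columns by name, so column order does not matter. */
create table both as
select 1 as set, *, 0 as var1, 0 as var2 from set1
outer union corr
select 2 as set, * from set2;
quit;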

and test the equality of the slopes with 

 

proc glm data=both;
class set;
/* weight*set lets the weight slope differ between the two data sets */
model price = weight var1 var2 weight*set / solution;
run;
quit;

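The p-value for weight*set tests the null hypothesis that the two slopes are equal; a small p-value is evidence that the slopes differ. It is not the probability that the slopes are the same.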

 

(untested)

 

 

PG
Shivi82
Quartz | Level 8

Thank you for the solution. I will certainly give it a try, and if required I will seek your guidance.

 

Regards, Shivi

PGStats
Opal | Level 21

It had not occurred to me that set1 and set2 could be the same data. If so, then the slope comparison is pointless, as @PaigeMiller said. If you are comparing different models on the same data, it is the significance of var1 and var2 that matters, not the change in slopes. My proposed analysis is for comparing the slopes on two independent data sets, one of which has two extra variables.
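For the same-data case, one way to check the joint significance of var1 and var2 is a TEST statement in PROC REG. This is only a sketch: mydata is a hypothetical name for a single data set containing price, weight, var1 and var2, and the labels full and extra are arbitrary.

```sas
/* mydata is a hypothetical data set name */
proc reg data=mydata;
  full:  model price = weight var1 var2;
  /* Joint F test of H0: both extra coefficients are zero */
  extra: test var1 = 0, var2 = 0;
run;
quit;
```

A small p-value for this test suggests that the two extra variables improve the fit beyond what weight alone provides.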

PG
PaigeMiller
Diamond | Level 26

Perhaps I'm belaboring the point, however, let's consider this situation

 

Data set 1: fit model Y=b0 + b1X1 + e

 

Data set 2: fit model Y=b0 + b1X1 + b2X2 + b3X3 + e

 

I would say it still doesn't make any sense to compare b1 from data set 1 to b1 from data set 2. Yes, you can probably get SAS to perform this comparison, but what does it mean? I'd say it means nothing. The fact that the models are different may by itself affect the slope b1, so you can't assume that the b1 values differ only because of random variability in the data sets.

--
Paige Miller
PaigeMiller
Diamond | Level 26

My problem with doing a significance test of the coefficients to determine if they are different is that the significance test looks for a change in slope over and above what we would expect from random variability in the system. But this change in slope could have nothing to do with random variation; it could be due to the addition of two variables into the model. I prefer the extra-sum-of-squares test to see if the fit of the MODEL (not this particular slope) is really significantly different.
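The non-random part of the slope change can be written down explicitly when both models are fit by ordinary least squares to the same rows. The notation here is introduced for illustration only: \tilde{b}_1 is the weight slope from the one-variable model, \hat{b}_1, \hat{b}_2, \hat{b}_3 are the estimates from the three-variable model, and \hat{d}_2, \hat{d}_3 are the slopes from regressing X2 and X3 on X1 (each with an intercept).

```latex
\tilde{b}_1 = \hat{b}_1 + \hat{b}_2\,\hat{d}_2 + \hat{b}_3\,\hat{d}_3
```

This is an algebraic identity, not a statistical approximation. If X2 and X3 are orthogonal to X1 then \hat{d}_2 = \hat{d}_3 = 0 and the two slopes coincide; otherwise they differ by exactly the last two terms, whatever the noise in the data.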

--
Paige Miller
PaigeMiller
Diamond | Level 26

Continuing on the above point ...

 

Are you comparing the slopes from two different models fit to the EXACT SAME data set? That's not the purpose of significance testing. Significance testing would compare a slope computed from data set A to a slope computed from data set B. If this really is a comparison of two slope estimates from two different models fit to the EXACT SAME data set, you are doing something that has no meaning, and hence the results are meaningless.

--
Paige Miller
