SAS Data Science

Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Viya (Machine Learning), SAS Visual Text Analytics, with point-and-click interfaces or programming
NicolasC
Fluorite | Level 6

Hi there

 

I know this topic has already been addressed, but the earlier thread did not fully answer my question, and I cannot find the answer in the Miner Help window.

 

I am using the Model Comparison node to compare different models. The target is interval, and the inputs are a mix of nominal, interval, and binary. I see two options for comparing my models: Average Squared Error (AVE) and Mean Squared Error (MSE). I am confused by the terminology. My understanding is that MSE = SSE/DFE, where SSE is the Error Sum of Squares and DFE is the error degrees of freedom, with DFE = n - p - 1, n the number of observations, and p the number of variables used in the model. Is AVE Miner's definition of SSE? If so, the 'Average' is confusing, as SSE is just a sum of squared differences. I also do not understand why (see pic attached) MSE is calculated for only one model when I choose this option. Also, the models should be ranked in the same order anyway as long as DFE is the same, right?
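To make the terminology concrete, here is a minimal sketch of the quantities named above. It follows the definitions as stated in this post (MSE = SSE/DFE with DFE = n - p - 1) and treats the average squared error as SSE divided by a divisor passed in explicitly. Using n as that divisor is an assumption for illustration only; the function names and the toy data are hypothetical, and nothing here is taken from Enterprise Miner itself.

```python
def sum_squared_errors(actual, predicted):
    """SSE: sum of squared differences between actual and predicted values."""
    return sum((a - f) ** 2 for a, f in zip(actual, predicted))


def mean_squared_error(sse, n, p):
    """MSE as defined in the post: SSE / DFE, with DFE = n - p - 1."""
    dfe = n - p - 1
    if dfe <= 0:
        raise ValueError("DFE must be positive (need n > p + 1)")
    return sse / dfe


def average_squared_error(sse, divisor):
    """SSE divided by a divisor; which divisor Miner uses is NOT assumed here."""
    return sse / divisor


# Toy data, made up for illustration.
actual = [3.0, 5.0, 7.0, 9.0, 11.0]
predicted = [2.0, 5.0, 8.0, 9.0, 12.0]

n = len(actual)  # number of observations
p = 1            # number of variables in the (hypothetical) model

sse = sum_squared_errors(actual, predicted)
mse_value = mean_squared_error(sse, n, p)
ase_value = average_squared_error(sse, divisor=n)  # assumption: divisor = n

print(f"SSE={sse}, MSE={mse_value}, ASE={ase_value}")
```

With the same SSE, the two statistics differ only in the denominator, so for a fixed n and p they rank models identically.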

 

On a related topic, MSE comparison is what I have been aiming for lately, as I want to compare models, some running on the same database as before and some running on the same database but with a reduced number of variables p (and obviously the same number of observations n). Since I want to account for this change in my assessment of the models, I prefer MSE, which does that.
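The point about a changing p can be illustrated with a small sketch. It assumes the average squared error divides SSE by n (an assumption, not confirmed in this thread) and uses made-up SSE values for two hypothetical models fitted to the same n observations.

```python
# Hypothetical models on the same data set (same n), different numbers of variables p.
n = 50
models = {
    "full":    {"sse": 99.0,  "p": 30},
    "reduced": {"sse": 100.0, "p": 10},
}


def ase(m):
    """Average squared error, assuming the divisor is n."""
    return m["sse"] / n


def mse(m):
    """Mean squared error as SSE / DFE, with DFE = n - p - 1."""
    return m["sse"] / (n - m["p"] - 1)


best_by_ase = min(models, key=lambda name: ase(models[name]))
best_by_mse = min(models, key=lambda name: mse(models[name]))

print("best by ASE:", best_by_ase)
print("best by MSE:", best_by_mse)
```

Here the full model has the slightly smaller SSE, so it wins on the average squared error, while the reduced model wins on MSE because its DFE is much larger. When p is the same for all models, the two criteria give the same ranking.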

 

And finally, is there a way in Miner to calculate SST (Total Sum of Squares) and SSM (Model Sum of Squares) so that I can calculate the coefficient of determination myself (assuming there is no option to get R2 directly)?
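Outside of Miner, the calculation itself is short once the actual and predicted target values are available. The sketch below assumes those values have been exported as two plain lists, and it defines SSM as SST - SSE, which matches the model sum of squares only for a least-squares fit with an intercept. The function name and the toy data are hypothetical.

```python
def r_squared_parts(actual, predicted):
    """Return (SST, SSE, SSM, R2), with SSM defined as SST - SSE."""
    mean_y = sum(actual) / len(actual)
    sst = sum((y - mean_y) ** 2 for y in actual)               # total sum of squares
    sse = sum((y - f) ** 2 for y, f in zip(actual, predicted))  # error sum of squares
    if sst == 0:
        raise ValueError("SST is zero: the target is constant")
    ssm = sst - sse  # assumption: valid as model SS for least squares with intercept
    r2 = 1.0 - sse / sst
    return sst, sse, ssm, r2


# Toy data, made up for illustration.
actual = [3.0, 5.0, 7.0, 9.0, 11.0]
predicted = [2.0, 5.0, 8.0, 9.0, 12.0]

sst, sse, ssm, r2 = r_squared_parts(actual, predicted)
print(f"SST={sst}, SSE={sse}, SSM={ssm}, R2={r2:.3f}")
```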

 

Many thanks

 

Nicolas

Untitled.jpg

 

3 REPLIES
NicolasC
Fluorite | Level 6

Following up on my previous post, it seems that AVE = Sum of Squared Errors / Divisor for ASE. So AVE is not SSE but SSE divided by a divisor of some kind. Where does this divisor come from? I have n = 200,000+ and p around 250, so I am intrigued as to how these combine (if that is how it is done) to give a divisor in the range of 7,000. Many thanks

BrettWujek
SAS Employee

Hey Nicolas - Please see the following post on this. I think it explains the divisor fairly well.

 

https://communities.sas.com/t5/SAS-Data-Mining-and-Machine/Mean-Squared-Error-vs-Average-Squared-Err...

 



NicolasC
Fluorite | Level 6

Thanks for your answer. I had gone through that post before opening a new topic. It does not address the different questions I mentioned.

