Mean Squared Error vs Average Squared Error

07-15-2016 12:19 AM

In E-Miner (SAS Enterprise Miner), I see two selection statistics in the Model Comparison node:

- Average Squared Error

- Mean Squared Error

What is the difference?

Searching the Help does not turn up a useful description:

----->8 snip from HELP file ------------------------------

The Selection Statistic choices are as follows:

Default: The default selection uses different statistics based on the type of target variable and whether a profit/loss matrix has been defined.

- If a profit/loss matrix is defined for a categorical target, the average profit or average loss is used.
- If no profit/loss matrix is defined for a categorical target, the misclassification rate is used.
- If the target variable is interval, the average squared error is used.

Akaike's Information Criterion: chooses the model with the smallest Akaike's Information Criterion value.

**Average Squared Error**: chooses the model with the smallest average squared error value.

**Mean Squared Error**: chooses the model with the smallest mean squared error value.

----->8 snip from HELP file ------------------------------

Accepted Solutions

Solution

07-15-2016 12:21 AM

I answered my own question. A different part of the Help contains this description:

In linear models, statisticians routinely use the mean squared error (MSE) as the main measure of fit. The MSE is the sum of squared errors (SSE) divided by the degrees of freedom for error. (DFE is the number of cases less the number of weights in the model.) This process yields an unbiased estimate of the population noise variance under the usual assumptions.
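Written out, the definition in the paragraph above looks roughly like this. The symbols $y_i$ (observed target), $\hat{y}_i$ (model prediction), and $p$ (number of weights) are my own notation and do not appear in the Help text:

```latex
\mathrm{SSE} = \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2,
\qquad
\mathrm{DFE} = N - p,
\qquad
\mathrm{MSE} = \frac{\mathrm{SSE}}{\mathrm{DFE}} = \frac{\mathrm{SSE}}{N - p}
```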

For neural networks and decision trees, there is no known unbiased estimator. Furthermore, the DFE is often negative for neural networks. There exist approximations for the effective degrees of freedom, but these are often prohibitively expensive and are based on assumptions that might not hold. Hence, the MSE is not nearly as useful for neural networks as it is for linear models. One common solution is to divide the SSE by the number of cases N, not the DFE. This quantity, SSE/N, is referred to as the average squared error (ASE).
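To make the difference concrete, here is a minimal Python sketch of the two definitions quoted above. It is not how Enterprise Miner computes these statistics internally; the function name `squared_error_stats`, the toy data, and the 9-weight "network" are all hypothetical and chosen only for illustration.

```python
def squared_error_stats(actual, predicted, n_weights):
    """Compute SSE, DFE, ASE, and MSE from the definitions quoted above.

    n_weights is the number of fitted weights (parameters) in the model.
    MSE is returned as None when DFE is not positive, since dividing by a
    zero or negative DFE does not give a meaningful variance estimate.
    """
    n = len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    dfe = n - n_weights                    # degrees of freedom for error
    ase = sse / n                          # average squared error: SSE / N
    mse = sse / dfe if dfe > 0 else None   # mean squared error: SSE / DFE
    return {"sse": sse, "dfe": dfe, "ase": ase, "mse": mse}


# Toy data: five cases. The predictions are the least-squares line
# 1.4 + 0.8 * x evaluated at x = 0, 1, 2, 3, 4.
actual = [1.0, 3.0, 2.0, 5.0, 4.0]
predicted = [1.4, 2.2, 3.0, 3.8, 4.6]

# Linear model with 2 weights (intercept and slope): DFE = 5 - 2 = 3.
linear = squared_error_stats(actual, predicted, n_weights=2)

# Hypothetical small network with 9 weights scored on the same 5 cases:
# DFE = 5 - 9 = -4, so MSE is undefined while ASE is unchanged.
network = squared_error_stats(actual, predicted, n_weights=9)

print(linear)
print(network)
```

For the same predictions, ASE depends only on N, so it stays comparable across model types, while MSE changes with the number of weights and breaks down once the weights outnumber the cases.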
