I meant that objective function and loss function are often used to describe the same thing 🙂 However, they are different from the error function.
Both functions are the error function (the error) + regularization (e.g., the squared or absolute value of the weights; some people call it L1/L2, some people call it Lasso/Ridge).
The reason behind this is: if we just minimize the error, we can easily get a model with very big weights, which makes our activation function's slope really steep (a little change in x causes a big change in y); our model would overfit, be unstable, and be sensitive to noise.
Minimizing both the error and the weights makes our neural network less sensitive to noise in the data, and more generalizable.
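Here's a minimal sketch of that idea in Python (assuming mean squared error as the error term; the regularized_loss name and the lam strength are just for illustration, not from any particular library):

```python
import numpy as np

def regularized_loss(y_true, y_pred, weights, lam=0.01, penalty="l2"):
    # Objective = error term + regularization term (lam is a hypothetical strength).
    error = np.mean((y_true - y_pred) ** 2)  # the error (here: MSE)
    if penalty == "l2":                      # L2 / Ridge: sum of squared weights
        reg = lam * np.sum(weights ** 2)
    else:                                    # L1 / Lasso: sum of absolute weights
        reg = lam * np.sum(np.abs(weights))
    return error + reg
```

With lam = 0 this is just the plain error; raising lam pushes the optimizer toward smaller weights.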