Torben2
Calcite | Level 5

Hello all,

 

I have two questions about dlTrain. I am using dlTrain to predict time series with an LSTM.

There are two options for which I would like to have more information:
- stagnation
- nthreads

 

Stagnation:
SAS help says the following: "specifies the number of iterations completed without improvement before stopping the optimization early. When the validTable parameter is specified, the validation scores are monitored for stagnation."

 

Does iterations mean epochs? If not, what are iterations?

 

What exactly is meant by validation scores? Loss or error or both together?

 

I have tried a few settings, including stagnation = 1. However, even when the validation error increased, the training (with a validation table specified) did not stop before reaching the specified maxEpochs.

 

 

nthreads:
Can the runtime of the training be reduced by specifying threads? What is a reasonable number of threads and how can I determine it for my system?
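
For reference, here is roughly how I am calling the action. This is only a rough sketch: the table names are placeholders, the parameter names are written from memory, and I am not certain whether stagnation belongs directly in the dlTrain call or under optimizer; that is simply where I have been setting it.

   proc cas;
      deepLearn.dlTrain /
         table={name="train_ts"}          /* training series (placeholder name)   */
         validTable={name="valid_ts"}     /* validation series (placeholder name) */
         model={name="lstm_model"}        /* LSTM model built beforehand          */
         modelWeights={name="lstm_weights", replace=true}
         inputs={"x"}
         target="y"
         optimizer={
            miniBatchSize=32,
            maxEpochs=100,
            stagnation=1,                 /* my attempt at early stopping         */
            algorithm={method="ADAM", learningRate=0.001}
         }
         nThreads=4                       /* the option I am asking about         */
         ;
   run;
   quit;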

 

Thanks a lot!

 

Many greetings
Torben


3 REPLIES
zhongxiuliu
SAS Employee

1. An epoch is a group of iterations; one epoch means that all of the training data has been used once to update the weights (there is a small worked sketch at the end of this reply).

In stochastic gradient descent, each iteration computes the derivatives and updates the weights using only a small sample of the data. This way, when we have huge data, the updates are quick. (For more information, google stochastic gradient descent.)

If we have 500 observations and each iteration samples 50 of them, it takes 10 iterations for all of the data to be used once.

Those 10 iterations are one epoch.

 

2. Validation score: the objective function, or loss, evaluated on the validation table (error + regularization).

 

3. nThreads: this is how many GPU devices you use for the calculation; it is not related to the algorithm. SAS Viya uses parallel computing, so it is like choosing how many workers do the calculation. A larger number => faster, but you cannot set it higher than the number of GPUs your IT makes available to you.

 

4. Stagnation: your understanding is correct. I suspect training keeps going because the objective function value is still going down, even though the error has stopped going down.
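
To make the arithmetic in point 1 concrete, here is a tiny sketch (illustration only; the numbers come from the example above, not from dlTrain):

   data _null_;
      n_obs      = 500;   /* training observations               */
      batch_size = 50;    /* observations sampled per iteration  */
      iters_per_epoch = ceil(n_obs / batch_size);
      put "Iterations per epoch: " iters_per_epoch;   /* prints 10 */
   run;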

Aurora Peddycord-Liu
Torben2
Calcite | Level 5

Thank you very much for your answers. You have helped me a lot.

 

I am not sure whether I understood the second point correctly. Does it mean that you can choose between the objective function OR the loss as the validation score?
And how should I understand the expression in brackets (error + regularization)?

 

Thanks in advance.

Torben

zhongxiuliu
SAS Employee

I meant that objective function and loss function are often used to describe the same thing 🙂 However, they are different from the error function.

 

Both functions are the error function (the error) + regularization (e.g., the squared or absolute value of the weights; some people call it L1/L2 regularization, others call it Lasso/Ridge).

 

The reason behind this is: if we just minimize the error, we can easily end up with a model with very big weights, which makes the activation function's slope really steep (a little change in x causes a big change in y); the model would overfit and be unstable and sensitive to noise.

Minimizing both the error and the weights makes the neural network less sensitive to noise in the data and more generalizable.
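
In pseudo-formula form (a generic sketch, not the exact expression dlTrain reports):

   objective = error + lambda * sum(abs(w))    /* L1 (Lasso) penalty  */
   objective = error + lambda * sum(w**2)      /* L2 (Ridge) penalty  */

where the w are the network weights and lambda controls how strongly large weights are penalized.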

Aurora Peddycord-Liu

