Hi Tor - Thanks for testing this out and providing the feedback.
First, some comments on the autotuning process in general. This initial implementation of autotuning only supports sequential evaluation (training) of each candidate model, which is why the runtimes are so long for large data sets. It is trying to train many models, some of them expensive configurations (e.g., many trees in a forest, large neural nets) - and it is training them one after another right now. Rest assured we have some significant enhancements coming that will support training multiple candidate models in parallel across the compute resources under control of the Viya distributed execution engine. So if you are running your GA with population size 10, it will train all 10 in parallel instead of sequentially (thus up to a 10x speedup!...well, assuming you have enough compute nodes to distribute to and the candidates take roughly the same time to train).
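To put rough numbers on the sequential vs. parallel point, here is a minimal back-of-the-envelope sketch in Python. It is not how the Viya engine schedules work; the helper names, the greedy "next free worker" rule, and the training times are all made up for illustration.

```python
import heapq


def sequential_time(train_times):
    """Total wall-clock time when candidates are trained one after another."""
    return sum(train_times)


def parallel_time(train_times, workers):
    """Wall-clock time when each candidate goes to whichever worker frees up first.

    Hypothetical greedy scheduler, used only to illustrate the idea.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    free_at = [0.0] * workers  # time at which each worker becomes free
    heapq.heapify(free_at)
    for t in train_times:
        start = heapq.heappop(free_at)       # earliest available worker
        heapq.heappush(free_at, start + t)   # busy until start + t
    return max(free_at)


# Ten candidates that each take 6 minutes (made-up numbers).
even = [6] * 10
print(sequential_time(even))    # 60 minutes sequentially
print(parallel_time(even, 10))  # 6.0 minutes with 10 workers: the 10x case
print(parallel_time(even, 4))   # 18.0 minutes with only 4 workers

# Same total work, but one expensive configuration dominates.
uneven = [30, 3, 3, 3, 3, 3, 3, 3, 3, 6]
print(parallel_time(uneven, 10))  # 30.0 minutes: only a 2x speedup
```

The last case is why the 10x figure is a best case: a parallel batch can finish no sooner than its slowest candidate.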
That being said, let's get to your specific questions/suggestions:
1) Lack of progress indication is a big sore point with many of us right now...we do plan to enhance this for sure. There are 2 aspects to this really - (a) just seeing that it is making progress, and (b) getting a sense for how much longer it will run.
For (a), unfortunately SAS Studio only flushes the log output at the end of a proc run in this release...that is being worked on. The underlying action does produce intermediate progress information, including the details of each candidate model, and we are working on exposing that through the proc. Overall, I think you should see more progress/status info in the upcoming releases.
As for (b), this is challenging, as I'm sure you realize. Take, for example, one batch of candidate models that we are training, all with different combinations of hyperparameter values. First, these might be running in parallel - or some might be held up in a queue waiting for resources. But even assuming unlimited available compute resources, each of these models will potentially take a significantly different amount of time to train (e.g., a 500-tree forest vs. a 50-tree forest, or a neural net whose hidden layers have 20 neurons vs. 200 neurons). So each modeling algorithm would have to provide an estimate of its computational time based on the hyperparameter values at hand. Certainly possible, but not on our radar right now. The best we could do, I think, is track progress as the number of candidate models completed vs. the total number to train, which would only be a rough estimate because it treats every candidate as equally expensive.
Either way - I completely get your point. This thing is running...we know it's going to take a while, and right now it doesn't give you any sense of whether it's making progress or how much longer it has to go. We'll continue to improve this user experience.
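To make the "completed vs. total" idea from (b) concrete, here is a small sketch in Python. It is only an illustration of the arithmetic, not anything the proc does today; the function name and the numbers are hypothetical.

```python
def estimate_remaining(elapsed_seconds, completed, total):
    """Naive time-remaining estimate from a count of finished candidates.

    Assumes every remaining candidate costs the average of the finished ones,
    which is exactly the assumption that breaks for mixed hyperparameters.
    Returns None until at least one candidate has finished.
    """
    if total <= 0 or not 0 <= completed <= total:
        raise ValueError("need 0 <= completed <= total and total > 0")
    if completed == 0:
        return None  # nothing to extrapolate from yet
    average_per_model = elapsed_seconds / completed
    return average_per_model * (total - completed)


# 4 of 10 candidates finished after 600 seconds (made-up numbers).
print(estimate_remaining(600, 4, 10))  # 900.0 seconds estimated remaining
print(estimate_remaining(0, 0, 10))    # None: no estimate yet
```

If the four finished candidates happened to be the cheap ones (say, 50-tree forests) and the six left are 500-tree forests, the real remaining time would be far more than this estimate, which is why a count-based figure is a rough guide at best.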
2) SAS Studio does currently go modal (i.e., it locks you out) when you run a program/task. For now, what I suggest when running anything that you expect to take a while is to submit it in batch. Save your program to a file, then in the navigation pane on the left select "Server Files and Folders", find your program, right-click it, and select "Background Submit". A message will pop up in the lower right saying it was submitted, and you can then check its status at any point by selecting "Background Job Status" under the "More application options" menu (the button next to the "?" button in the upper right).
Hope this info helps you.
Keep plugging away and keep the feedback coming.
Brett