Question about Assessing Model Performance

WWD · Posted 08-22-2021 11:47 AM

I have a question about material in the following area of study:

Course = AI and Machine Learning Professional

Module = Machine Learning Specialist

Lesson = Lesson 6 Model Assessment and Deployment

Subsection 1

Demonstration - Comparing Models across pipelines.

Within this demonstration, the forest (ensemble) model is deemed the champion. At the 1:07 mark of the video, the student sees the Error Plot (in particular, the plot of the average squared error). This plot contains three lines. I'm assuming that two of them are for the training and validation data sets. What data is used to generate the third line?

Thank you,

Bill Donaldson

PeterChristie · Posted 08-24-2021 12:57 PM

Hello Bill - Thanks for your question. There are project settings that govern the behavior of a pipeline. One of these settings controls how the data is partitioned. The data is split into training, validation, and test partitions, and these are the 3 lines on the graph. As you suspected, two of the lines are for training and validation; the third is for the test partition. The project settings can be modified by clicking the settings 'sprocket' icon at the top right of the Model Studio screen. Hope this helps!
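To illustrate why the plot has three lines, here is a minimal Python sketch (not Model Studio code) that splits a data set into three partitions and computes the average squared error on each. The 60/30/10 split, the toy data, the simple straight-line model, and all function names are assumptions made up for this example; they are not taken from the demonstration.

```python
import random


def partition(rows, train=0.6, validate=0.3, seed=0):
    """Shuffle rows and split them into train / validation / test lists.

    The 60/30/10 default is an illustrative assumption.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * validate)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_valid],
        shuffled[n_train + n_valid:],  # whatever is left is the test set
    )


def fit_line(rows):
    """Ordinary least squares fit of y = a + b * x on (x, y) pairs."""
    mean_x = sum(x for x, _ in rows) / len(rows)
    mean_y = sum(y for _, y in rows) / len(rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def average_squared_error(rows, predict):
    """Mean of (actual - predicted)^2 over the given rows."""
    return sum((y - predict(x)) ** 2 for x, y in rows) / len(rows)


# Toy data: a noisy straight line (purely hypothetical).
rng = random.Random(1)
data = [(float(x), 2.0 * x + rng.gauss(0, 1)) for x in range(100)]

train_rows, valid_rows, test_rows = partition(data)

# The model is fit on the training partition only ...
model = fit_line(train_rows)

# ... and then scored on all three partitions, one value per line in the plot.
ase = {
    "train": average_squared_error(train_rows, model),
    "validate": average_squared_error(valid_rows, model),
    "test": average_squared_error(test_rows, model),
}
for name, value in ase.items():
    print(f"{name:>8}: {value:.3f}")
```

The model only ever sees the training rows when it is fit, so the validation and test values show how well it carries over to data it was not trained on.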
