yiyhio
Fluorite | Level 6

Hi!

 

I have trained a model in Model Studio using gradient boosting with 5-fold cross validation. However, in the results table that summarizes how much data was used for training and validation, I cannot see how the numbers relate to the 5 folds I chose.

 

Here is where I chose the validation method:

yiyhio_0-1627630591435.png

 

When I look at the results table for the same node, it says that the data is divided into approximately 60% for training and 30% for validation.

yiyhio_1-1627630604589.png

 

What does this mean? Was the 5-fold cross validation not applied for some reason, or does the table mean something else?

Thank you in advance!

2 REPLIES
sbxkoenk
SAS Super FREQ

Hello @yiyhio ,

 

The cross-validation you selected is used for assessing / selecting the model(s), not for constructing the model(s).

 

This is what the documentation says:

===========

 

  • Validation method — Specifies how to partition the data for assessing the models. Note that if your data is partitioned, then that partition is used and Validation method, Validation data proportion, and Cross validation number of folds are all ignored. Here are the possible values:
    • Partition — Specifies using a single partition of a training set. With partition, you specify proportions to use for randomly assigning observations to each role.
    • K-fold cross validation — Specifies using the k-fold cross validation method. In k-fold cross validation, each model evaluation requires k training executions (on k-1 data folds) and k scoring executions (on one holdout fold). This increases the evaluation time by approximately a factor of k.

    For small to medium data tables, cross validation provides, on average, a better representation of error across the whole data table. Partition is the default value.

 

===========
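To make the "k training executions (on k-1 data folds) and k scoring executions (on one holdout fold)" wording concrete, here is a minimal sketch in plain Python. It is not what Model Studio runs internally: the helper names `k_fold_indices` and `cross_validate` are hypothetical, and the "model" is a toy one that just predicts the mean of the training rows, chosen only so the example stays self-contained.

```python
import random
from statistics import mean


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(y, k=5, seed=0):
    """Return the folds and the per-fold mean squared error of a toy model.

    The toy model (an assumption for illustration) predicts the mean of
    the target over the training rows.
    """
    folds = k_fold_indices(len(y), k, seed)
    fold_errors = []
    for holdout in folds:
        holdout_set = set(holdout)
        train_rows = [i for i in range(len(y)) if i not in holdout_set]
        # One training execution on the other k-1 folds.
        prediction = mean(y[i] for i in train_rows)
        # One scoring execution on the single holdout fold.
        mse = mean((y[i] - prediction) ** 2 for i in holdout)
        fold_errors.append(mse)
    return folds, fold_errors


y = [float(v) for v in range(100)]
folds, fold_errors = cross_validate(y, k=5)
print("folds:", len(folds), "sizes:", [len(f) for f in folds])
print("average holdout MSE:", mean(fold_errors))
```

With k = 5, every row is held out exactly once and the model is fit five times, which is why the documentation says evaluation time grows by roughly a factor of k. The averaged holdout error is what gets used to assess or compare models.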

To use k-fold cross-validation for constructing the model, see here:

Cross Validation of a Forest Model
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/casactml/casactml_mltools_example01.htm

The above example uses the crossValidateML action (in PROC CAS).
The crossValidateML Action doc:
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/casactml/casactml_mltools_details02.htm

 

Kind regards,

Koen

 

sbxkoenk
SAS Super FREQ

See also my previous response!!

 

For questions like this, it's better to post on this board:

Analytics > SAS Data Mining and Machine Learning.

 

More (many more!) of the people in your target audience will read your question (topic).

 

Kind regards,

Koen
