Hi!
I have trained a model in Model Studio using a gradient boosting method with 5-fold cross validation. However, on the results page, the table that summarizes the amount of data used for training/validation does not seem to reflect my chosen 5-fold setting.
Here is where I chose the validation method.
When I look at the results table for the same node, it says that the data is divided into approximately 60% and 30% for the training and validation sets.
What does this mean? Does the 5-fold cross validation not apply for some reason, or does this mean something else?
Thank you in advance!
Hello @yiyhio ,
This is cross-validation for assessing / selecting the model(s), not for constructing the model(s).
This is what the documentation says:
===========
For small to medium data tables, cross validation provides, on average, a better representation of error across the whole data table. Partition is the default value.
===========
To use k-fold cross-validation for constructing the model, see here:
Cross Validation of a Forest Model
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/casactml/casactml_mltools_example01.htm
The above example uses the crossValidateML action (in PROC CAS).
The crossValidateML Action doc:
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/casactml/casactml_mltools_details02.htm
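For readers who want to see what k-fold cross-validation does to the rows of a table, here is a minimal, generic sketch in Python. It is not SAS code and it does not show how Model Studio or the crossValidateML action implement it; the function name `k_fold_indices` and its parameters are hypothetical and exist only for illustration.

```python
import random


def k_fold_indices(n_rows, k=5, seed=0):
    """Yield (train_idx, valid_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration only; not part of any SAS API.
    """
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    # Fold i takes every k-th shuffled index, so fold sizes differ by at most one.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        valid = folds[i]
        # All other folds together form the training rows for this round.
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, valid


if __name__ == "__main__":
    for fold, (train, valid) in enumerate(k_fold_indices(100, k=5), start=1):
        print(f"fold {fold}: {len(train)} training rows, {len(valid)} validation rows")
```

With 100 rows and k=5, each round trains on 80 rows and validates on the remaining 20, and every row is held out exactly once across the five rounds.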
Kind regards,
Koen
See also my previous response!
For questions like this, it's better to post on the board:
Analytics > SAS Data Mining and Machine Learning.
Many more of the people in your target audience will read your question there.
Kind regards,
Koen