Is there any option in Proc Discrim (or another KNN procedure) that can do n-fold cross validation?
Just some quick background: I'm trying to use KNN to classify the fish in the SASHelp.Fish data set. Below is the code:
data analysis;
   set sashelp.fish;
   where species in ('Bream', 'Perch');
run;

data train test;
   set analysis;
   rand = ranuni(100);
   if rand <= 0.8 then output train;
   else output test;
run;
The code above splits the FISH data into a training set and a test set. Within the training set, I want to do n-fold cross-validation to find the optimal k for KNN. See link below:
I can't seem to find the Proc Discrim options that enable me to do this easily.
proc discrim data = train test = test
             testout = _score1 method = npar k = 5
             testlist crossvalidate crosslist;
   class species;
   var weight height;
run;
Does anyone know whether this cross-validation feature is available in Proc Discrim (or any other procedure)? If not, what's a better way to find the optimal k for KNN?
The PROC DISCRIM documentation lists three cross-validation options: CROSSVALIDATE, CROSSLIST, and CROSSLISTERR.
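As far as I know, those options do leave-one-out cross-validation rather than n-fold, so one workaround is to build the folds yourself and loop over k in a macro. Below is a rough sketch, not a tested solution. The macro name knn_cv, its parameters, and the data set names train_folds, _cvscore, cv_results, and cv_summary are all made up for illustration. It also drops rows with a missing Weight or Height before assigning folds, which is my assumption about how you would want to handle them.

```sas
/* Hypothetical helper: n-fold cross-validation over a range of k for     */
/* PROC DISCRIM METHOD=NPAR. All names and defaults here are assumptions. */
%macro knn_cv(data=train, nfolds=5, kmin=1, kmax=15, seed=100);
   %local k f;

   /* Randomly assign each complete training row to a fold 1..&nfolds */
   data train_folds;
      set &data;
      if nmiss(weight, height) = 0;
      fold = ceil(&nfolds * ranuni(&seed));
   run;

   /* Clear results left over from a previous run */
   proc datasets lib=work nolist nowarn;
      delete cv_results _cvscore;
   quit;

   %do k = &kmin %to &kmax;
      %do f = 1 %to &nfolds;

         /* Fit on every fold except &f, then score fold &f */
         proc discrim data=train_folds(where=(fold ne &f))
                      test=train_folds(where=(fold eq &f))
                      testout=_cvscore method=npar k=&k noprint;
            class species;
            var weight height;
         run;

         /* _INTO_ holds the predicted class; flag misclassified rows */
         data _cvscore;
            set _cvscore;
            k    = &k;
            fold = &f;
            miss = (species ne _INTO_);
         run;

         proc append base=cv_results data=_cvscore(keep=k fold miss);
         run;

      %end;
   %end;

   /* Each row is scored once per k, so the mean of miss is the CV error rate */
   proc means data=cv_results noprint nway;
      class k;
      var miss;
      output out=cv_summary(drop=_type_ _freq_) mean=error_rate;
   run;

   proc sort data=cv_summary;
      by error_rate k;
   run;
%mend knn_cv;

%knn_cv(data=train, nfolds=5, kmin=1, kmax=15);

proc print data=cv_summary;
run;
```

The first row of cv_summary is the k with the lowest cross-validated error rate. With only two species and a small training set the folds are small, so the error rates will be noisy; rerunning with a different seed is a cheap way to check how stable the chosen k is. Once you have picked k, refit on the full training set and score the held-out test set as in your original PROC DISCRIM call.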