kisumsam
Quartz | Level 8

Is there any option in Proc Discrim (or another KNN procedure) that can do n-fold cross-validation?

Just some quick background: I'm trying to use KNN to classify the fish in the SASHelp.Fish data set. Below is the code:

 

data analysis;
  set sashelp.fish;
  where species in ('Bream', 'Perch');
run;

data train test;
  set analysis;
  rand = ranuni(100);
  if rand <= 0.8 then output train;
  else output test;
run;

The code above splits the FISH data set into a training set and a test set. Within the training set, I want to use n-fold cross-validation to find the optimal k for KNN. See the link below:

 

https://medium.com/@svanillasun/how-to-deal-with-cross-validation-based-on-knn-algorithm-compute-auc...

 

I can't seem to find the Proc Discrim options that enable me to do this easily.

 

proc discrim data = train testdata = test
  testout = _score1 method = npar k = 5 testlist crossvalidate crosslist;
  class species;
  var weight height;
run;

 

Does anyone know whether this n-fold cross-validation feature is available in Proc Discrim (or any other procedure)? If not, what is a better way to find the optimal k for KNN?

 

 

1 REPLY
PaigeMiller
Diamond | Level 26

The PROC DISCRIM documentation lists three cross-validation options: CROSSVALIDATE, CROSSLIST, and CROSSLISTERR.

https://documentation.sas.com/?docsetId=statug&docsetVersion=15.1&docsetTarget=statug_discrim_syntax...

--
Paige Miller
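
As far as I understand the documentation, the cross-validation options in PROC DISCRIM classify each observation using all of the other observations (leave-one-out), so they do not give n-fold cross-validation directly. One possible workaround is a macro loop that holds out one fold at a time through TESTDATA= and compares the actual class with the _INTO_ variable in the TESTOUT= data set. The following is only a rough sketch under assumptions: the fold count, the range of k, the seed, and the names train_folds, knn_cv, cv_results, and cv_summary are all made up for illustration, and rows with missing predictors are dropped up front.

```sas
/* Assumption: TRAIN is the training set built in the question.      */
/* NFOLDS, the k range, and every data set/macro name are made up.   */
%let nfolds = 5;

/* Randomly assign each complete training observation to a fold */
data train_folds;
  set train;
  where weight is not missing and height is not missing;
  fold = ceil(&nfolds * ranuni(200));
run;

%macro knn_cv(kmin=1, kmax=15);
  /* Start from an empty results table */
  proc datasets lib=work nolist nowarn;
    delete cv_results;
  quit;

  %do k = &kmin %to &kmax;
    %do f = 1 %to &nfolds;
      /* Fit on every fold except &f, then score the held-out fold */
      proc discrim data=train_folds(where=(fold ne &f))
                   testdata=train_folds(where=(fold = &f))
                   testout=_scored method=npar k=&k noprint;
        class species;
        var weight height;
      run;

      /* Count held-out observations whose predicted class is wrong */
      data _fold_err;
        set _scored end=last;
        n + 1;
        miss + (species ne _into_);
        if last then do;
          k = &k;
          fold = &f;
          output;
        end;
        keep k fold n miss;
      run;

      proc append base=cv_results data=_fold_err;
      run;
    %end;
  %end;

  /* Pooled cross-validated error rate per k, lowest error first */
  proc sql;
    create table cv_summary as
    select k, sum(miss) / sum(n) as cv_error
    from cv_results
    group by k
    order by cv_error, k;
  quit;
%mend knn_cv;

%knn_cv(kmin=1, kmax=15);

proc print data=cv_summary;
run;
```

The first row of cv_summary would then be the candidate k, which can be plugged into the original PROC DISCRIM call against the untouched test set. With a training set this small the fold assignment is noisy, so rerunning with a few different seeds before settling on a k seems prudent.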
