SAS Data Science

kisumsam
Quartz | Level 8

Is there any option in Proc Discrim (or another KNN procedure) that can do n-fold cross validation?

 

Some quick background: I'm trying to use KNN to classify the fish in the SASHelp.Fish data set by species. Below is the code:

 

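/* Keep only the two species to be classified */
data analysis;
set sashelp.fish;
where species in ('Bream', 'Perch');
run;

/* Random split: roughly 80% training, 20% testing (seed = 100) */
data train test;
set analysis;
rand = ranuni(100);
if rand <= 0.8 then output train;
else output test;
run;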

The code above splits the FISH data set into training and testing data sets. Within the training set, I want to do n-fold cross-validation to find the optimal k for KNN. See the link below:
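To make the n-fold idea concrete, here is a minimal sketch of how each training row could be tagged with a fold number. The data set name train_folds, the macro variable nfolds, the choice of 5 folds, and the seed 200 are all assumptions for illustration; none of them come from the code above.

```sas
/* Hypothetical sketch: assign every training row to one of &nfolds folds. */
/* nfolds = 5 and the seed 200 are illustrative choices only.              */
%let nfolds = 5;

data train_folds;
  set train;
  /* ranuni returns a value strictly between 0 and 1, */
  /* so ceil maps it onto the integers 1..&nfolds     */
  fold = ceil(ranuni(200) * &nfolds);
run;

/* Check how many fish of each species landed in each fold */
proc freq data = train_folds;
  tables fold * species / norow nocol nopercent;
run;
```

With purely random assignment the folds will not be exactly the same size, and with a data set this small a fold could end up with very few fish of one species, so the frequency table is worth a look before going further.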

 

https://medium.com/@svanillasun/how-to-deal-with-cross-validation-based-on-knn-algorithm-compute-auc...

 

I can't seem to find the Proc Discrim options that enable me to do this easily.

 

proc discrim data = train test = test
  testout = _score1 method = npar k = 5 testlist crossvalidate crosslist;
  class species;
  var weight height;
run; 

 

Does anyone know whether this n-fold cross-validation feature is available in Proc Discrim (or any other procedure)? If not, what's a better way to find the optimal k for KNN?

 

 

1 REPLY
PaigeMiller
Diamond | Level 26

The PROC DISCRIM documentation lists three different cross-validation options.

https://documentation.sas.com/?docsetId=statug&docsetVersion=15.1&docsetTarget=statug_discrim_syntax...
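One way these options could be used to compare values of k is sketched below. Note that the CROSSVALIDATE option performs leave-one-out cross-validation (each observation is classified by a rule built from all the others), not n-fold. The macro name find_k, the range 1 to 15, and the data set names cv_err and cv_all are hypothetical. ErrorCrossVal is the ODS table name I would expect for the cross-validation error estimates; running the procedure once with ods trace on will confirm it on your installation.

```sas
/* Sketch: leave-one-out cross-validation error for several values of k. */
/* find_k, cv_err, cv_all, and the range 1..15 are hypothetical names    */
/* and settings chosen for illustration.                                 */
%macro find_k(maxk = 15);
  %local k;

  /* Remove results from an earlier run so rows do not accumulate */
  proc datasets library = work nolist nowarn;
    delete cv_all;
  quit;

  %do k = 1 %to &maxk;
    /* Suppress printed output but still capture the error table */
    ods exclude all;
    ods output ErrorCrossVal = cv_err;
    proc discrim data = train method = npar k = &k crossvalidate;
      class species;
      var weight height;
    run;
    ods exclude none;

    /* Record which k produced these error estimates */
    data cv_err;
      set cv_err;
      k = &k;
    run;

    proc append base = cv_all data = cv_err;
    run;
  %end;

  /* One block of rows per k; compare the error rates across k */
  proc print data = cv_all;
  run;
%mend find_k;

%find_k(maxk = 15)
```

The k with the lowest cross-validated error rate is the natural candidate. The TEST data set is deliberately left out of this loop so that it stays untouched for a final check of the chosen k.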

--
Paige Miller
