Hello All,
Looking for a quick way to randomly partition a dataset into three different subsets for model training, testing and validation. I would like to be able to vary the sizes of each set (50% test, 30% training, etc etc) and make sure that the sets are randomly generated.
Thanks!
John
You could also look at PROC SURVEYSELECT.
data shoes;
  set sashelp.shoes;
  r=ranuni(-1);
run;
proc rank data=shoes out=have groups=100;
  var r;
  ranks rank;
run;
data test training valid;
  set have;
  select;
    when(rank lt 50) output test;
    when(rank lt 80) output training;
    otherwise output valid;
  end;
run;
Xia Keshan
Also try PROC SURVEYSELECT.
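data test training valid;
set have;
/* Draw one random number per row and reuse it. Calling ranuni() in */
/* each condition draws a new number every time, which skews the split. */
u=ranuni(2345);
if u<=0.5 then output test;
else if u<=0.8 then output training;
else output valid;
drop u;
run;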
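Since both replies point to PROC SURVEYSELECT, here is a minimal sketch of what that route might look like for a 50% test / 30% training / 20% validation split. It assumes two passes of simple random sampling; the data set names pass1, rest and pass2 and the seed value are placeholders chosen for illustration, not something from the posts above.

```sas
/* Pass 1: flag a 50% simple random sample as the test set.        */
/* OUTALL keeps every row and adds a 0/1 variable named Selected.  */
proc surveyselect data=sashelp.shoes out=pass1
                  method=srs samprate=0.5 seed=2345 outall;
run;

data test rest;
  set pass1;
  if Selected then output test;
  else output rest;
  drop Selected;
run;

/* Pass 2: 30% of the full data is 60% of the remaining half. */
proc surveyselect data=rest out=pass2
                  method=srs samprate=0.6 seed=2345 outall;
run;

data training valid;
  set pass2;
  if Selected then output training;
  else output valid;
  drop Selected;
run;
```

To change the proportions, adjust the two SAMPRATE= values; the second rate is always the training share divided by whatever share is left after the first pass. Unlike the data-step approaches, where each row is assigned independently and the set sizes vary slightly from run to run, this fixes the sample sizes from the rates.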