Hello All,
Looking for a quick way to randomly partition a dataset into three different subsets for model training, testing and validation. I would like to be able to vary the sizes of each set (50% test, 30% training, etc etc) and make sure that the sets are randomly generated.
Thanks!
John
Or you should check proc surveyselect ;
data shoes;
  set sashelp.shoes;
  r=ranuni(-1);
run;
proc rank data=shoes out=have groups=100;
  var r;
  ranks rank;
run;
data test training valid;
  set have;
  select;
    when(rank lt 50) output test;
    when(rank lt 80) output training;
    otherwise output valid;
  end;
run;
Xia Keshan
You could also try proc surveyselect.
data test training valid;
set have;
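/* draw one random number per row; calling ranuni() again in each condition would give a new value every time and skew the split */
u=ranuni(2345);
if u<=0.5 then output test;
else if u<=0.8 then output training;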
else output valid;
run;
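Since proc surveyselect is suggested twice in this thread without an example, here is a minimal sketch of that route. It assumes the same 50% test / 30% training / 20% validation split as the code above. The data set names step1, rest and step2 and the seed values are made up for illustration. With OUTALL, proc surveyselect keeps every input row and adds a 0/1 variable named Selected, so a three-way split takes two passes.

```sas
/* Pass 1: flag a simple random 50% of the rows as the test set. */
proc surveyselect data=sashelp.shoes out=step1 method=srs
                  samprate=0.5 seed=2345 outall;
run;

/* Separate the flagged rows from the rest; Selected is dropped so the
   second pass can create it again. */
data test rest;
  set step1;
  drop Selected;
  if Selected=1 then output test;
  else output rest;
run;

/* Pass 2: flag 60% of the remaining half as training
   (0.6 * 50% = 30% of the original data). */
proc surveyselect data=rest out=step2 method=srs
                  samprate=0.6 seed=6789 outall;
run;

/* What is left (about 20% of the original data) becomes validation. */
data training valid;
  set step2;
  drop Selected;
  if Selected=1 then output training;
  else output valid;
run;
```

Unlike the ranuni() comparison, which only hits the target percentages on average, this draws samples of a fixed size, so the subset sizes match the requested rates up to rounding. Change the two samprate= values to vary the split.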