Hello All,
I'm looking for a quick way to randomly partition a dataset into three subsets for model training, testing, and validation. I would like to be able to vary the size of each set (50% test, 30% training, etc.) and make sure that the sets are randomly generated.
Thanks!
John
Or you could check PROC SURVEYSELECT.
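/* attach a uniform random number to each observation */
data shoes;
  set sashelp.shoes;
  r = ranuni(-1);
run;
/* bin the random numbers into 100 equal-sized groups (rank = 0 to 99) */
proc rank data=shoes out=have groups=100;
  var r;
  ranks rank;
run;
/* 50% test, 30% training, 20% validation */
data test training valid;
  set have;
  select;
    when (rank lt 50) output test;
    when (rank lt 80) output training;
    otherwise output valid;
  end;
run;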
Xia Keshan
Also try PROC SURVEYSELECT.
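data test training valid;
  set have;
  /* draw once per observation; each extra ranuni() call returns a new number */
  u = ranuni(2345);
  if u <= 0.5 then output test;           /* about 50% */
  else if u <= 0.8 then output training;  /* about 30% */
  else output valid;                      /* about 20% */
  drop u;
run;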
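Both replies mention PROC SURVEYSELECT without showing it. Here is a minimal sketch of one way it could be used for a 50/30/20 split, assuming two passes with SAMPRATE= and OUTALL. The data set names stage1, rest, and stage2 and the seed values are made up for illustration, and sashelp.shoes stands in for your own data.

```sas
/* Pass 1: flag a 50% simple random sample as the test set.       */
/* OUTALL keeps every observation and adds a 0/1 Selected flag.   */
proc surveyselect data=sashelp.shoes out=stage1 method=srs
                  samprate=0.5 seed=12345 outall;
run;

data test rest;
  set stage1;
  if Selected then output test;
  else output rest;
  drop Selected;
run;

/* Pass 2: 60% of the remaining half is 30% of the original data. */
proc surveyselect data=rest out=stage2 method=srs
                  samprate=0.6 seed=54321 outall;
run;

data training valid;
  set stage2;
  if Selected then output training;
  else output valid;
  drop Selected;
run;
```

With this approach the set sizes are fixed by the sampling rates (up to rounding), whereas the random-threshold data step above only hits the target percentages approximately. To change the split, adjust the two SAMPRATE= values, remembering that the second rate applies to what is left after the first pass.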