11-24-2014 07:33 AM
I'm looking for a quick way to randomly partition a dataset into three subsets for model training, testing, and validation. I would like to be able to vary the size of each set (50% test, 30% training, and so on) and make sure that the rows are assigned to the sets at random.
11-24-2014 08:12 AM
Or you could check PROC SURVEYSELECT.
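/* random number per row; seed -1 uses the clock */
data shoes;
  set sashelp.shoes;
  r = ranuni(-1);
run;

/* bin r into 100 groups, ranks 0-99 */
proc rank data=shoes out=have groups=100;
  var r;
  ranks rank;
run;

/* 0-49 test (50%), 50-79 training (30%), rest valid (20%) */
data test training valid;
  set have;
  select;
    when (rank lt 50) output test;
    when (rank lt 80) output training;
    otherwise output valid;
  end;
run;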
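For the PROC SURVEYSELECT route, here is a minimal sketch of the same 50/30/20 split done in two passes. The intermediate dataset names (step1, rest, step2) and the seed values are assumptions chosen for illustration; this is one possible way to do it, not the only one.

```sas
/* Pass 1: flag 50% of the rows for the test set.
   OUTALL keeps every row and adds a 0/1 variable named Selected. */
proc surveyselect data=sashelp.shoes out=step1
                  method=srs samprate=0.5 seed=12345 outall;
run;

data test rest;
  set step1;
  if Selected then output test;
  else output rest;
  drop Selected;
run;

/* Pass 2: 30% of the full data is 60% of the remaining half,
   so sample 60% of REST for training; the leftover is validation. */
proc surveyselect data=rest out=step2
                  method=srs samprate=0.6 seed=54321 outall;
run;

data training valid;
  set step2;
  if Selected then output training;
  else output valid;
  drop Selected;
run;
```

To change the proportions, adjust the two SAMPRATE= values, keeping in mind that the second rate applies to what is left after the first pass. A fixed positive SEED= makes the split reproducible between runs.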