mahler_ji
Obsidian | Level 7

Hello All,

I'm looking for a quick way to randomly partition a dataset into three subsets for model training, testing, and validation. I would like to be able to vary the size of each set (e.g., 50% test, 30% training, and so on) and make sure that the sets are randomly generated.

Thanks!

John

3 REPLIES
Ksharp
Super User

You could also check PROC SURVEYSELECT. Otherwise, something like this works:

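/* Attach a uniform random number to every observation.
   A non-positive seed takes the seed from the clock, so each run differs. */
data shoes;
 set sashelp.shoes;
 r=ranuni(-1);
run;
/* Bin the random numbers into 100 equal-sized groups: rank runs from 0 to 99 */
proc rank data=shoes out=have groups=100;
 var r;
 ranks rank;
run;
/* 50% test (rank 0-49), 30% training (rank 50-79), 20% validation (rank 80-99) */
data test training valid;
 set have;
 select;
  when(rank lt 50) output test;
  when(rank lt 80) output training;
  otherwise output valid;
 end;
run;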



Xia Keshan

slchen
Lapis Lazuli | Level 10

You could also try PROC SURVEYSELECT.
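For reference, a minimal sketch of the PROC SURVEYSELECT route might look like the following. It assumes the input data set is called have (as in the other replies) and a 50/30/20 split. The intermediate data set names step1, rest, and step2 and the seed values are placeholders chosen for illustration, not anything from the posts above.

```sas
/* Step 1: flag about 50% of all rows for the test set.
   OUTALL keeps every row and adds a 0/1 variable named Selected. */
proc surveyselect data=have out=step1 method=srs
                  samprate=0.5 seed=2345 outall;
run;

data test rest;
 set step1;
 if Selected then output test;
 else output rest;
 drop Selected;
run;

/* Step 2: of the remaining rows, flag 60% for training.
   0.6 of the remaining half is roughly 30% of the original data. */
proc surveyselect data=rest out=step2 method=srs
                  samprate=0.6 seed=6789 outall;
run;

data training valid;
 set step2;
 if Selected then output training;
 else output valid;
 drop Selected;
run;
```

Unlike the row-by-row random draw, simple random sampling with a fixed rate fixes the size of each subset up front, so the split comes out close to the requested percentages rather than varying from run to run. To change the proportions, adjust the two SAMPRATE= values, keeping in mind that the second rate applies to what is left after the first selection.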

stat_sas
Ammonite | Level 13

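data test training valid;
 set have;
 /* Draw one random number per observation and reuse it: calling ranuni()
    again in each condition would return a new value every time and
    distort the 50/30/20 proportions. */
 r = ranuni(2345);
 if r <= 0.5 then output test;
 else if r <= 0.8 then output training;
 else output valid;
 drop r;
run;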
