mahler_ji
Obsidian | Level 7

Hello All,

I'm looking for a quick way to randomly partition a dataset into three subsets for model training, testing, and validation. I would like to be able to vary the size of each set (50% test, 30% training, and so on) and make sure that the sets are randomly generated.

Thanks!

John

3 REPLIES
Ksharp
Super User

You could check PROC SURVEYSELECT. Alternatively, here is one way with PROC RANK:

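/* Attach a uniform random number to every observation. */
/* A seed <= 0 is taken from the system clock, so the split is not reproducible; */
/* use a positive seed if you need the same split every time. */
data shoes;
 set sashelp.shoes;
 r=ranuni(-1);
run;
/* Bin the random numbers into 100 equal-sized groups, ranked 0 to 99. */
proc rank data=shoes out=have groups=100;
 var r;
 ranks rank;
run;
/* Ranks 0-49 go to test (50%), 50-79 to training (30%), 80-99 to valid (20%). */
/* Change the cutoffs to vary the size of each set. */
data test training valid;
 set have;
 select;
  when(rank lt 50) output test;
  when(rank lt 80) output training;
  otherwise output valid;
 end;
run;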



Xia Keshan

slchen
Lapis Lazuli | Level 10

You could also try PROC SURVEYSELECT.
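
For reference, here is a minimal sketch of what a PROC SURVEYSELECT approach might look like. It is not taken from any reply in this thread. It assumes the 50% / 30% / 20% split used elsewhere in the discussion and draws two simple random samples in sequence. The dataset names step1, rest and step2 are hypothetical, and the seed is arbitrary.

```sas
/* Step 1: flag a simple random sample of 50% of the rows as the test set. */
/* OUTALL keeps every row and adds a 0/1 variable named Selected.          */
proc surveyselect data=sashelp.shoes out=step1 method=srs
                  samprate=0.5 seed=2345 outall noprint;
run;

data test rest;
 set step1;
 if selected then output test;
 else output rest;
 drop selected;
run;

/* Step 2: training should be 30% of the total, which is 60% of the */
/* remaining half (0.30 / 0.50 = 0.60). The leftover 40% of the     */
/* remainder, that is 20% of the total, becomes the validation set. */
proc surveyselect data=rest out=step2 method=srs
                  samprate=0.6 seed=2345 outall noprint;
run;

data training valid;
 set step2;
 if selected then output training;
 else output valid;
 drop selected;
run;
```

Unlike a per-row random draw, this gives set sizes that match the requested rates up to rounding. To vary the sizes, change the two SAMPRATE= values, remembering that the second rate applies to the rows left over after the first step.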

stat_sas
Ammonite | Level 13

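data test training valid;
 set have;
 /* Draw one random number per observation and reuse it in every test. */
 /* Calling ranuni() again in each condition would return a new value  */
 /* each time, and the split would no longer be 50/30/20.              */
 r=ranuni(2345);
 if r<=0.5 then output test;
 else if r<=0.8 then output training;
 else output valid;
 drop r;
run;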
