Help using Base SAS procedures

Partitioning Data

Frequent Contributor
Posts: 101

Hello All,

I'm looking for a quick way to randomly partition a dataset into three subsets for model training, testing, and validation. I would like to be able to vary the size of each set (50% test, 30% training, etc.) and make sure that the sets are randomly generated.

Thanks!

John

Super User
Posts: 10,023

Re: Partitioning Data

Posted in reply to mahler_ji

Here is one way; you could also check PROC SURVEYSELECT.

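/* Attach a uniform random number to every row.
   A seed <= 0 uses the system clock, so each run gives a different split. */
data shoes;
 set sashelp.shoes;
 r=ranuni(-1);
run;
/* Bin the random numbers into 100 equal-sized groups (rank = 0 to 99). */
proc rank data=shoes out=have groups=100;
 var r;
 ranks rank;
run;
/* Route rows by rank: 0-49 = 50% test, 50-79 = 30% training, 80-99 = 20% valid. */
data test training valid;
 set have;
 select;
  when(rank lt 50) output test;
  when(rank lt 80) output training;
  otherwise output valid;
 end;
run;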



Xia Keshan

Super Contributor
Posts: 275

Re: Partitioning Data

Posted in reply to mahler_ji

Also try PROC SURVEYSELECT.
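A minimal sketch of what that could look like, assuming the input data set is named have and a 50/30/20 split is wanted. The intermediate names (step1, rest, step2) and the seed values are arbitrary choices for illustration. It samples in two passes, since OUTALL only flags selected versus not selected:

```sas
/* Pass 1: flag a 50% simple random sample as the test set.
   OUTALL keeps every row and adds a 0/1 variable named Selected. */
proc surveyselect data=have out=step1 method=srs samprate=0.5
                  seed=2345 outall;
run;

data test rest;
 set step1;
 if Selected then output test;
 else output rest;
 drop Selected;
run;

/* Pass 2: 30% of the full data is 60% of the remaining half. */
proc surveyselect data=rest out=step2 method=srs samprate=0.6
                  seed=6789 outall;
run;

data training valid;
 set step2;
 if Selected then output training;
 else output valid;
 drop Selected;
run;
```

Unlike a per-row random draw, this gives subsets of fixed size for a given input, and changing the two SAMPRATE= values changes the proportions.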

Trusted Advisor
Posts: 1,228

Re: Partitioning Data

Posted in reply to mahler_ji

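data test training valid;
 set have;
 /* Draw once per row: every ranuni() call returns a new number,
    so repeating the call in each condition would skew the split. */
 r=ranuni(2345);
 if r<=0.5 then output test;
 else if r<=0.8 then output training;
 else output valid;
 drop r;
run;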
