Help using Base SAS procedures

Splitting data

New Contributor
Posts: 3
How can I split the data into three random sets: 60% in a building set, 10% in a test set, and 30% in a validation set?


Accepted Solutions
Solution
04-28-2018 01:11 PM
Esteemed Advisor
Posts: 5,519

Re: Splitting data

You can use the GROUPS= option in PROC SURVEYSELECT. You must provide the exact set sizes, or calculate them like this:

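proc sql;
/* Compute exact group sizes: 60% and 10% rounded, with the remainder (about 30%) in the third group */
select 
    round(0.6*count(*)) as n1,
    round(0.1*count(*)) as n2,
    count(*) - calculated n1 - calculated n2 as n3
into :n1, :n2, :n3
from sashelp.class;
quit;

/* Randomly assign every row to one of three groups of those sizes */
proc surveyselect data=sashelp.class groups=(&n1 &n2 &n3) out=samples;
run;

PG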
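
A possible follow-up step, sketched here rather than taken from the reply above: with GROUPS=, PROC SURVEYSELECT writes every input row to the output data set and adds a GroupID variable (1, 2, or 3 here) recording the group each row was assigned to. Assuming the samples data set produced above, and using build, test, and validate as illustrative data set names, a DATA step can separate the rows by that variable:

```sas
/* Assumes the samples data set created by PROC SURVEYSELECT with GROUPS= */
data build test validate;
    set samples;
    /* GroupID follows the order of the sizes in GROUPS=: 1 = 60%, 2 = 10%, 3 = 30% */
    if GroupID = 1 then output build;
    else if GroupID = 2 then output test;
    else output validate;
run;
```

Adding the SEED= option to the PROC SURVEYSELECT statement should make the random assignment reproducible across runs.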


All Replies
New Contributor
Posts: 3

Re: Splitting data

Thank you, that helped.
Super User
Posts: 10,761

Re: Splitting data

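data train validate test;
    set sashelp.heart;
    /* Seed the random number generator once so the split is reproducible */
    if _n_ = 1 then call streaminit(123456789);
    /* x is 1, 2, or 3 with probabilities 0.6, 0.3, and 0.1 */
    x = rand('table', 0.6, 0.3, 0.1);
    if x = 1 then output train;
    else if x = 2 then output validate;
    else output test;
    drop x;
run;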
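
Because RAND('TABLE', ...) draws the set for each row independently, the resulting sets are only approximately 60%, 30%, and 10% of the data, unlike the exact sizes produced by GROUPS=. One way to check the realized proportions, as a sketch, is to keep the draw in a single data set and tabulate it (heart_split and split are illustrative names, not part of the reply above):

```sas
data heart_split;
    set sashelp.heart;
    if _n_ = 1 then call streaminit(123456789);
    /* split: 1 = train, 2 = validate, 3 = test */
    split = rand('table', 0.6, 0.3, 0.1);
run;

/* Show the count and percentage of rows in each set */
proc freq data=heart_split;
    tables split;
run;
```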



New Contributor
Posts: 3

Re: Splitting data

Thank you