ankur_1989
Calcite | Level 5

Hi, I have built a model and now want to validate it. I am using PROC SURVEYSELECT to draw 70% of the data as the modeling sample, and I want to use the remaining 30% for validation. What code will give me that remaining 30% as a separate data set when sampling with PROC SURVEYSELECT?

 

proc surveyselect data= raw_data method=srs rep=2 samprate=0.7 seed=1234 out=one20 ;
id _all_ ;
run ;

 

Please specify how to use this. 

1 ACCEPTED SOLUTION

ballardw
Super User

@PeterClemmensen wrote:

do something like this

 

proc surveyselect data = sashelp.cars noprint
                  samprate = .7
                  out = cars_sample
                  seed = 12345 outall;
run;

data sampling;
   set cars_sample;
   where Selected=1;
run;

data validation;
   set cars_sample;
   where Selected=0;
run;

 

A shorter version that writes both data sets in a single data step:

data sampling validation;
   set cars_sample;
   if Selected=1 then output sampling;
   else output validation;
run;
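
One way to sanity-check the split before using it is to tabulate the selection flag. This is only a sketch, and it assumes cars_sample was created with the OUTALL option as shown above, so that it carries the 0/1 Selected variable:

```sas
/* Count how many rows were flagged for the sample (Selected=1)
   and how many were left for validation (Selected=0). */
proc freq data = cars_sample;
   tables Selected / nocum;
run;
```

The percentage reported for Selected=1 should be close to the requested 70%.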


5 REPLIES
PeterClemmensen
Tourmaline | Level 20

do something like this

 

proc surveyselect data = sashelp.cars noprint
                  samprate = .7
                  out = cars_sample
                  seed = 12345 outall;
run;

data sampling;
   set cars_sample;
   where Selected=1;
run;

data validation;
   set cars_sample;
   where Selected=0;
run;
PGStats
Opal | Level 21

Or

 

data sampling validation;
set cars_sample;
if selected then output sampling;
else output validation;
drop selected;
run;
PG
Ksharp
Super User

If you have an ID variable with unique values (e.g. NAME), try this:

 

proc surveyselect data=sashelp.class out=training samprate=0.7 seed=12345678;
run;

proc sql;
create table validate as
 select *
  from sashelp.class
   where name not in (select name from training);
quit;
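
Note that the NOT IN approach depends on the ID really being unique. A quick check, sketched here for sashelp.class and its NAME variable (substitute your own table and key; the column aliases n_rows and n_names are just illustrative), is to compare the row count with the distinct count:

```sas
/* If n_rows and n_names are equal, NAME identifies each row uniquely. */
proc sql;
   select count(*)             as n_rows,
          count(distinct name) as n_names
   from sashelp.class;
quit;
```

If the two counts differ, any unsampled row that shares its ID with a sampled row would be left out of the validation table, so the OUTALL approach is likely the safer choice in that case.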
ballardw
Super User

@ankur_1989 wrote:

 

proc surveyselect data= raw_data method=srs rep=2 samprate=0.7 seed=1234 out=one20 ;
id _all_ ;
run ;

 

Please specify how to use this. 


I would ask why you have rep=2 in your code. It requests two independent replicate samples, so any record selected in both replicates appears twice in the output data set, which will not do what you expect in terms of training/validation sets.
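
For reference, here is a minimal sketch of the original call without rep=2, combined with the OUTALL approach from earlier in the thread. The data set raw_data and the seed come from the question; the output names split, training and validation are placeholders chosen for illustration.

```sas
/* Draw one 70% simple random sample and keep every input row,
   flagged by the 0/1 variable Selected (added by OUTALL).
   No ID statement is used: all input variables are kept by default. */
proc surveyselect data = raw_data method = srs
                  samprate = 0.7 seed = 1234
                  out = split outall noprint;
run;

/* Route the flagged rows to training and the rest to validation. */
data training validation;
   set split;
   if Selected then output training;
   else output validation;
   drop Selected;
run;
```

Because each input row is written to exactly one of the two data sets, the training and validation sets cannot overlap.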