ankur_1989
Calcite | Level 5

Hi, I have created a model and now want to validate it. I am using PROC SURVEYSELECT to take 70% of the data as the sample and want to keep the remaining 30% for validation. What is the code for getting the 30% validation data out of PROC SURVEYSELECT?

 

proc surveyselect data= raw_data method=srs rep=2 samprate=0.7 seed=1234 out=one20 ;
id _all_ ;
run ;

 

Please specify how to use this. 

Accepted Solution
ballardw
Super User

@PeterClemmensen wrote:

Do something like this:

proc surveyselect data = sashelp.cars noprint
                  samprate = .7
                  out = cars_sample
                  seed = 12345 outall;
run;

data sampling;
   set cars_sample;
   where Selected=1;
run;

data validation;
   set cars_sample;
   where Selected=0;
run;

 

Or, shorter, with a single DATA step that writes both data sets:

data sampling validation;
   set cars_sample;
   if Selected=1 then output sampling;
   else output validation;
run;
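A quick way to confirm the split, offered only as a sketch: assuming cars_sample was created with OUTALL as above, tabulating the Selected flag should show roughly 70% of the rows with Selected=1 and the rest with Selected=0.

```sas
/* Count how many rows were selected (1) versus not selected (0). */
/* Assumes cars_sample comes from the PROC SURVEYSELECT step above. */
proc freq data=cars_sample;
   tables Selected / nocum;
run;
```

The same counts should match the number of rows in the sampling and validation data sets.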


5 REPLIES
PeterClemmensen
Tourmaline | Level 20

Do something like this:

proc surveyselect data = sashelp.cars noprint
                  samprate = .7
                  out = cars_sample
                  seed = 12345 outall;
run;

data sampling;
   set cars_sample;
   where Selected=1;
run;

data validation;
   set cars_sample;
   where Selected=0;
run;
PGStats
Opal | Level 21

Or, in a single DATA step that also drops the Selected flag:

data sampling validation;
set cars_sample;
if selected then output sampling;
else output validation;
drop selected;
run;
PG
Ksharp
Super User

If you have an ID variable with unique values (e.g. NAME), try this:

proc surveyselect data=sashelp.class out=training samprate=0.7 seed=12345678;
run;

proc sql;
create table validate as
 select *
  from sashelp.class
   where name not in (select name from training);
quit;
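The NOT IN subquery only gives a clean split if the ID really is unique. A minimal check, sketched for sashelp.class and NAME as in the code above (the column aliases n_rows and n_ids are illustrative, not part of the original post): if the two counts differ, some NAME values repeat and this approach should not be used as is.

```sas
/* Compare the total row count with the number of distinct IDs. */
/* n_rows and n_ids are hypothetical alias names chosen for this sketch. */
proc sql;
   select count(*)             as n_rows,
          count(distinct name) as n_ids
   from sashelp.class;
quit;
```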
ballardw
Super User

@ankur_1989 wrote:

 

proc surveyselect data= raw_data method=srs rep=2 samprate=0.7 seed=1234 out=one20 ;
id _all_ ;
run ;

 

Please specify how to use this. 


I would ask why you have rep=2 in your code. It requests two replicate samples, so it has a strong likelihood of duplicating some records in the selected sample, which will not do what you expect in terms of training/validation sets.
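To see the effect, here is a small sketch using sashelp.class as a stand-in for raw_data (the data set name two_reps and the alias times_selected are made up for illustration). With REP=2 the output holds two independent 70% samples stacked together, identified by the Replicate variable, so any name listed by the query was drawn in both replicates.

```sas
/* Draw two replicate 70% samples, as in the original call. */
proc surveyselect data=sashelp.class method=srs rep=2
                  samprate=0.7 seed=1234 out=two_reps noprint;
run;

/* List the records that occur more than once across the replicates. */
proc sql;
   select name, count(*) as times_selected
   from two_reps
   group by name
   having count(*) > 1;
quit;
```

Dropping REP=2 and adding OUTALL, as in the other replies, gives a single sample with a Selected flag instead.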
