Hi, I have created a model and now want to validate it. I am using 70% of the data for sampling and the remaining 30% for validation, and I want to do the split with PROC SURVEYSELECT. What is the code for extracting the 30% validation portion with PROC SURVEYSELECT?
proc surveyselect data= raw_data method=srs rep=2 samprate=0.7 seed=1234 out=one20 ;
id _all_ ;
run ;
Please specify how to use this.
do something like this
proc surveyselect data = sashelp.cars noprint
samprate = .7
out = cars_sample
seed = 12345 outall;
run;
data sampling;
set cars_sample;
where Selected=1;
run;
data validation;
set cars_sample;
where Selected=0;
run;
Or
data sampling validation;
set cars_sample;
if selected then output sampling;
else output validation;
drop selected;
run;
@PeterClemmensen wrote:
do something like this
proc surveyselect data = sashelp.cars noprint
samprate = .7
out = cars_sample
seed = 12345 outall;
run;
data sampling;
set cars_sample;
where Selected=1;
run;
data validation;
set cars_sample;
where Selected=0;
run;
shorter
data sampling validation;
set cars_sample;
if Selected=1 then output sampling;
else output validation;
run;
If you have an ID variable with unique values (e.g. NAME), try this:
proc surveyselect data=sashelp.class out=training samprate=0.7 seed=12345678;
run;
proc sql;
create table validate as
select *
from sashelp.class
where name not in (select name from training);
quit;
@ankur_1989 wrote:
proc surveyselect data= raw_data method=srs rep=2 samprate=0.7 seed=1234 out=one20 ;
id _all_ ;
run ;
Please specify how to use this.
I would ask why you have rep=2 in your code. It requests two replicate samples, so some records are very likely to appear more than once in the output data set, which will not do what you expect in terms of training/validation sets.
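For what it's worth, here is a minimal sketch of the original call with rep= dropped, assuming the goal is a single 70/30 split. The data set names raw_data and one20 come from the question, OUTALL and NOPRINT are borrowed from the earlier reply, and the PROC FREQ check is only a suggested sanity check, not part of anyone's posted solution.

```sas
/* Sketch only: the call from the question without rep=, so one sample is drawn. */
/* OUTALL keeps every input row and adds a Selected flag (1 = sampled, 0 = not). */
proc surveyselect data=raw_data method=srs samprate=0.7 seed=1234
                  out=one20 outall noprint;
run;

/* Sanity check: Selected=1 should cover roughly 70% of the rows. */
proc freq data=one20;
  tables Selected;
run;
```

The rows with Selected=0 are then the 30% validation portion and can be split off with either DATA step shown above. The id _all_ statement is left out because all input variables are kept in the output by default.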