Best way to use PROC SURVEYSELECT

Super Contributor
Posts: 398

I'm having a hard time grasping how to use SURVEYSELECT.

I have a data table from which I need to take the top 20 rows, then one random row out of each following 10-row chunk, until the entire dataset has been gone through.

So I keep the first 20, then randomly pick one row from rows 21 to 30, then another from rows 31 to 40, and so on to the end of the dataset.

Is this possible using SURVEYSELECT?

Any help would be greatly appreciated.

Thank You
Respected Advisor
Posts: 3,777

Re: Best way to use PROC SURVEYSELECT

I think you need to stratify the observations into the first 20, then by 10s until EOF. You will also need a SAMPSIZE= data set containing _NSIZE_ and the strata variable. I think this is what you want.

[pre]
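/* Build two data sets in one pass:
   SHOES = input rows plus a STRATA variable,
   SIZE  = one row per stratum with its sample size (_NSIZE_). */
data shoes size(keep=strata _nsize_);
   /* Stratum 1: the first 20 rows, all of which are selected. */
   strata = 1;
   _nsize_ = 20;
   output size;
   do i = 1 to _nsize_ until(eof);
      link set;
      output shoes;
   end;
   /* Remaining rows: a new stratum every 10 rows, one row sampled from each. */
   _nsize_ = 1;
   do i = 1 by 1 until(eof);
      link set;
      if mod(i,10) eq 1 then do;
         strata + 1;
         output size;
      end;
      output shoes;
   end;
   stop;
   /* Single SET statement shared by both loops via LINK. */
   set:
      set sashelp.shoes end=eof;
   return;
run;
proc print data=size;
run;

/* SAMPSIZE= reads the per-stratum sample sizes from SIZE. */
proc surveyselect data=shoes sampsize=size;
   strata strata;
run;
proc print;
run;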
[/pre]
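
As a rough sketch of the same idea, the strata could also be derived from the row number _N_ instead of the LINK loops. The data set names STRATIFIED, COUNTS, SIZE2 and PICKED and the seed value 12345 are placeholders chosen for this sketch; SEED= is only there so the random draw can be repeated. Taking the first stratum's sample size from its actual row count is an assumption that keeps the step from failing when the table has fewer than 20 rows.

[pre]
/* Assign each row to a stratum from its row number:
   rows 1-20 -> 1, rows 21-30 -> 2, rows 31-40 -> 3, ... */
data stratified;
   set sashelp.shoes;
   if _n_ le 20 then strata = 1;
   else strata = 2 + floor((_n_ - 21) / 10);
run;

/* Count the rows in each stratum (output variable COUNT). */
proc freq data=stratified noprint;
   tables strata / out=counts;
run;

/* Sample sizes: the whole first stratum, one row from every later one. */
data size2(keep=strata _nsize_);
   set counts;
   if strata eq 1 then _nsize_ = count;
   else _nsize_ = 1;
run;

/* SEED= makes the selection repeatable; OUT= names the sample. */
proc surveyselect data=stratified sampsize=size2 seed=12345 out=picked;
   strata strata;
run;
[/pre]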
Super Contributor
Posts: 398

Re: Best way to use PROC SURVEYSELECT

_null_,
Thank you so much

I just got out of a meeting where they changed the requirements. Now I have to take the first 20 rows, then the first row of each 10-row chunk.

So the first 20, then rows 21, 31, 41, and so on to the end. Do I still need SURVEYSELECT for this, or can I do it in a simpler way?

Thank you again for your quick response.
Respected Advisor
Posts: 3,777

Re: Best way to use PROC SURVEYSELECT

You don't need surveyselect if you don't need a random sample.

[pre]
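dm 'clear log; clear output;'; /* optional: clears the log and output windows */
data sample;
   /* First 20 rows (or fewer if the data set is shorter), read by observation number. */
   do i = 1 to min(20, nobs);
      link set;
      output;
   end;
   /* Then rows 21, 31, 41, ... to the end. */
   do i = 21 by 10 while(i le nobs);
      link set;
      output;
   end;
   stop; /* required with POINT=, which never reaches end of file */
   set:
      set sashelp.shoes point=i nobs=nobs;
   return;
run;
proc print;
run;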
[/pre]
Super Contributor
Posts: 398

Re: Best way to use PROC SURVEYSELECT

_null_,
Thank you for your reply.

I have another issue that just popped up. How can I do this based on a user column?

I have a user column and I have to take the top 20 rows per user and the first row of every 10 for each user.

I do apologize for my newbie questions.

Thanks for all the help
Respected Advisor
Posts: 3,777

Re: Best way to use PROC SURVEYSELECT

This has to read all the obs in the data, but if you don't have a huge data set it may be good enough. If you want to read just the selected obs, let me know, but that is a little harder.

[pre]
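data sample2;
   /* DOW loop: i counts rows within each BY group (REGION stands in for the user column). */
   do i = 1 by 1 until(last.region);
      set sashelp.shoes;
      by region;
      /* Keep rows 1-20, then rows 21, 31, 41, ... of each group. */
      if 1 le i le 20 or mod(i,10) eq 1 then output;
   end;
run;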
[/pre]
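
A minimal sketch of the same step for your own table, assuming it is called HAVE and the column is called USER (both names are placeholders, as are HAVE_SORTED and SAMPLE_BY_USER). The BY statement needs the rows grouped by user, so the sketch sorts first. The EQUALS option asks PROC SORT to keep the original row order within each user, so "top 20" still means the first 20 rows as they appeared in the input.

[pre]
/* Group the rows by user while preserving their original relative order. */
proc sort data=have out=have_sorted equals;
   by user;
run;

data sample_by_user;
   /* i restarts at 1 for every user because the loop ends at last.user. */
   do i = 1 by 1 until(last.user);
      set have_sorted;
      by user;
      if i le 20 or mod(i,10) eq 1 then output;
   end;
   drop i; /* the counter is not needed in the result */
run;
[/pre]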
Super Contributor
Posts: 398

Re: Best way to use PROC SURVEYSELECT

_null_,
Thank you so much for your help. I only had a little time to test it, but it seems to be dead on. My full set is only around 1100 rows, so it's small enough for this to work well.


Thanks again