Cherry
Obsidian | Level 7

I have a business scenario where I need to select only 10% of the total observations for each state.

The final output should contain 10% of the population from each state. Please assist me with the code. I tried using Proc SurveySelect, but it did not work.

6 REPLIES
Haikuo
Onyx | Level 15

You would need to share more details about your problem, such as your code and the log. As far as I can tell, Proc SurveySelect works as expected. Please see the following example:

proc sort data=sashelp.class out=have;
  by sex;
run;

Proc SurveySelect data=have out=want samprate=0.1;
  strata sex;
run;

proc print data=want;
run;

I am not sure whether any special method or algorithm you are applying would make a difference, so more details are needed.

Regards,

Haikuo

Cherry
Obsidian | Level 7

Hi,

I have a business scenario where I need to select only 10% of the total observations for each state. For example:

State      Total observations  No. of obs to be output (10% of total)
Northeast  43543               4355
Northwest  12323               1233
Southeast  876578              87658
Southwest  387482              38749

The final output should contain 10% of the population from each state. Please assist me with the code. I tried using Proc SurveySelect, but it did not work.

art297
Opal | Level 21

Did you try the code proposed by Hai.Kuo? It seems like it should work:

proc sort data=yourdatafilelibandname out=have;
  by state;
run;

Proc SurveySelect data=have out=want samprate=0.1;
  strata state;
run;
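To check whether the expected number of observations was selected, the sampled counts can be compared with the totals per state. The sketch below assumes the sorted data set is named have and the stratification variable is state, as in the code above. The data set name want_check, the column name sampled, and the seed value are made up for illustration; the seed only makes the draw repeatable. As far as I know, the procedure rounds the per-stratum sample size up by default, which would be consistent with the counts in the question's table.

```sas
/* Simple random sample of 10% within each state; the seed is arbitrary */
proc surveyselect data=have out=want_check method=srs samprate=0.1 seed=12345;
  strata state;
run;

/* Total and sampled counts side by side, one row per state */
proc sql;
  select a.state, a.total, b.sampled
    from (select state, count(*) as total from have group by state) as a
    left join
         (select state, count(*) as sampled from want_check group by state) as b
      on a.state = b.state;
quit;
```

If the sampled column is not close to 10% of total for some state, the log from the PROC SURVEYSELECT step is the first place to look.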

ballardw
Super User

Cherry wrote:

The final output should contain 10% of the population from each state. Please assist me with the code. I tried using Proc SurveySelect, but it did not work.

HOW did it not work? No output at all? Expected number of observations not selected? What did your code look like?

Tom
Super User

If you only want approximately 10%, just use a random number generator to give each observation a 10% chance of being selected.

data want ;
   set have ;
   if ranuni(0) <= .10 then output;
run;

Proc SurveySelect should work for this problem.  Could you post the code you tried?
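The DATA step above selects roughly 10% overall, so the count per state will vary from run to run. If an exact count per state is wanted without Proc SurveySelect, one possible sketch is sequential selection: each record is kept with probability (records still needed) / (records still remaining) within its state. This assumes an input data set have with a variable state; the names sorted, counts, n_total, n_need, n_left, k_left and want_exact are hypothetical, and the seed is arbitrary.

```sas
/* Per-state totals and the number to select (10%, rounded up) */
proc sql;
  create table counts as
  select state,
         count(*)              as n_total,
         ceil(count(*) * 0.10) as n_need
    from have
   group by state
   order by state;
quit;

proc sort data=have out=sorted;
  by state;
run;

data want_exact (drop=n_total n_need n_left k_left);
  merge sorted counts;
  by state;
  retain n_left k_left;
  if first.state then do;
    n_left = n_total;   /* records not yet examined in this state */
    k_left = n_need;    /* records still to be selected in this state */
  end;
  /* Keep this record with probability k_left / n_left */
  if ranuni(12345) < k_left / n_left then do;
    k_left = k_left - 1;
    output;
  end;
  n_left = n_left - 1;
run;
```

Because the probability reaches 1 when the records remaining equal the records still needed, and 0 once enough have been selected, each state ends up with exactly n_need records.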

FriedEgg
SAS Employee

The following does not take a random sample; instead, it returns every nth record (here, every 10th) from each BY group.

proc sql;
  /* View of the input, sorted by the BY variable */
  create view class as
    select *
      from sashelp.cars
     order by origin;
quit;

data want;
  set class;
  by origin;
  array accum&sysindex[1] _temporary_; *temp so no need to drop later;
  /* Start each group at 10 so that its first record is output */
  if first.origin then accum&sysindex[1]=10;
  accum&sysindex[1]+1;
  if accum&sysindex[1]>10 then
    do;
      accum&sysindex[1]=accum&sysindex[1]-10;
      output;
    end;
run;

/* Compare total and selected counts per group */
proc sql;
  select a.origin,total,output
    from (select origin,count(1) as total from sashelp.cars group by origin) a
    left join (select origin,count(1) as output from want group by origin) b on a.origin=b.origin;
quit;

Origin  total  output
Asia    158    16
Europe  123    13
USA     147    15
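The same systematic selection can also be sketched with a plain position counter and MOD, which some readers may find easier to follow than the temporary array. This is only an illustrative variant, not the code posted above; the names cars_sorted, want_nth and _pos are made up. It keeps records 1, 11, 21, ... of each BY group, so it should give the same per-group counts as the table above.

```sas
proc sort data=sashelp.cars out=cars_sorted;
  by origin;
run;

data want_nth;
  set cars_sorted;
  by origin;
  /* _pos is the record's position within its BY group */
  if first.origin then _pos = 0;
  _pos + 1;
  /* Keep the 1st, 11th, 21st, ... record of each group */
  if mod(_pos - 1, 10) = 0 then output;
  drop _pos;
run;
```

Like the original, this is not a random sample: which records are kept depends entirely on the order of the rows within each group.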

