Chris_LK_87
Quartz | Level 8

Hello, 

 

I have a dataset with millions of individuals. Each individual can appear more than once. I would like to randomly select a sample of n=1000 individuals. The only criterion is that whenever an individual is selected, all of that individual's records must be selected as well, as shown in data set want.

 

data have;
length id $10 Type $10;
input id$ Type$;
datalines;
1 A
1 B
1 D
2 A
2 F
3 L
4 E
4 T
5 H
6 J
;
run;
 

data want;
length id $10 Type $10;
input id$ Type$;
datalines;
1 A
1 B
1 D
3 L
4 E
4 T
6 J
;
run;

 

ACCEPTED SOLUTION
FreelanceReinh
Jade | Level 19

Hello @Chris_LK_87,

 

You can use the CLUSTER statement of PROC SURVEYSELECT:

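proc surveyselect data=have
                  method=srs    /* simple random sampling without replacement */
                  n=4           /* number of IDs to draw; use n=1000 for your real data */
                  seed=6180339  /* fixed seed makes the sample reproducible */
                  out=want;
   cluster id;                  /* sample whole IDs, keeping all of their rows */
run;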
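
If you want to confirm that the sample behaves as intended, a quick check along these lines might help. This is only a sketch: it assumes the data sets are named have and want as above, and the column aliases (n_ids_sampled, n_have, n_want) are made up for illustration.

```sas
proc sql;
   /* Number of distinct IDs in the sample; expected to match n= above */
   select count(distinct id) as n_ids_sampled
   from want;

   /* IDs whose row count in WANT differs from HAVE; no rows are expected */
   select h.id, h.n_have, w.n_want
   from (select id, count(*) as n_have from have group by id) as h
        inner join
        (select id, count(*) as n_want from want group by id) as w
        on h.id = w.id
   where h.n_have ne w.n_want;
quit;
```

The first query should return 4 for the example data (1000 for the real data), and the second query should return no rows if every sampled ID came through with all of its records.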


2 REPLIES
PaigeMiller
Diamond | Level 26

If I am understanding you properly (and I'm not sure that I am, your description seems a little incomplete), you want to sample the distinct ID values with replacement to get 1000 ID values. See: https://blogs.sas.com/content/iml/2014/01/29/sample-with-replacement-in-sas.html

 

Then you can select all the observations from these 1000 ID values.
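
A rough sketch of that two-step idea, under some assumptions: the data set names ids, sampled_ids and want2 and the seed value are hypothetical, and METHOD=SRS (without replacement) is used here because the example output in the question lists each ID only once. METHOD=URS would sample with replacement, as in the linked post, but then an ID can be drawn more than once and the final join would need to account for that.

```sas
/* Step 1: one row per distinct ID */
proc sql;
   create table ids as
   select distinct id
   from have;
quit;

/* Step 2: draw a simple random sample of IDs (use n=1000 for the real data) */
proc surveyselect data=ids method=srs n=4 seed=12345
                  out=sampled_ids noprint;
run;

/* Step 3: keep every observation that belongs to a sampled ID */
proc sql;
   create table want2 as
   select h.*
   from have as h
        inner join sampled_ids as s
        on h.id = s.id
   order by h.id;
quit;
```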

 

 

--
Paige Miller