BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
elbarto
Obsidian | Level 7

I have data like the following (but millions of rows, this is just an excerpt):

 

 

DATA have;
  input individual & $char10. firm year num;
  length individual $10;
DATALINES;
100004-77	1001	2004	3
100004-77	1004	2004	3
100004-77	1006	2004	3
100004-77	1008	2004	3
100004-77	1013	2004	3
100004-77	1015	2004	3
119561-14	1003	2006	5
119561-14	1004	2006	5
119561-14	1005	2006	5
119561-14	1008	2006	5
119561-14	1009	2006	5
119561-14	1012	2006	5
119561-14	1014	2006	5
119561-14	1015	2006	5

;
RUN;

I want to randomly select a sample for each individual year group based on the number in num, e.g., for individual=100004-77 and year=2004, I want to pick a random sample of size 3. I tried the following, but it does not work:

 

proc surveyselect data=have out=want 
	seed=1234 method=srs n=num; 
	 strata individual year;
run;

I need to automate this because I have considerably many more rows, so I can't simply create two different datasets based on each individual-year combination then manually enter the sample size. Thank you.

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
/*Make a dataset for this purpose.*/
DATA have;
  input individual & $char10. firm year num;
  length individual $10;
DATALINES;
100004-77	1001	2004	3
100004-77	1004	2004	3
100004-77	1006	2004	3
100004-77	1008	2004	3
100004-77	1013	2004	3
100004-77	1015	2004	3
119561-14	1003	2006	5
119561-14	1004	2006	5
119561-14	1005	2006	5
119561-14	1008	2006	5
119561-14	1009	2006	5
119561-14	1012	2006	5
119561-14	1014	2006	5
119561-14	1015	2006	5
;
RUN;
proc sql;
create table SampleSize as
select distinct individual,year ,num  as SampleSize
 from have;
quit;

proc surveyselect data=have out=want 
	seed=1234 method=srs n=SampleSize; 
	 strata individual year;
run;

View solution in original post

1 REPLY 1
Ksharp
Super User
/*Make a dataset for this purpose.*/
DATA have;
  input individual & $char10. firm year num;
  length individual $10;
DATALINES;
100004-77	1001	2004	3
100004-77	1004	2004	3
100004-77	1006	2004	3
100004-77	1008	2004	3
100004-77	1013	2004	3
100004-77	1015	2004	3
119561-14	1003	2006	5
119561-14	1004	2006	5
119561-14	1005	2006	5
119561-14	1008	2006	5
119561-14	1009	2006	5
119561-14	1012	2006	5
119561-14	1014	2006	5
119561-14	1015	2006	5
;
RUN;
proc sql;
create table SampleSize as
select distinct individual,year ,num  as SampleSize
 from have;
quit;

proc surveyselect data=have out=want 
	seed=1234 method=srs n=SampleSize; 
	 strata individual year;
run;

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 439 views
  • 0 likes
  • 2 in conversation