I need to subset 100 households from a large SAS dataset to prepare input for running test cases.
A household can be identified in the data file because all of its members share the same SERIALNO. All members of each selected household must be included in the subset. The dataset has 50 variables and tens of thousands of SERIALNO values, but the subset only needs to be based on two variables, SERIALNO and Household_member. Each SERIALNO represents one household, and Household_member numbers the members within it. I just need a subset of 100 households (SERIALNO) that keeps every Household_member row along with the rest of the variables in the dataset.
In the example below, SERIALNO 20161 has Household_member values 1, 2, 3; SERIALNO 20162 has a single Household_member; SERIALNO 20164 has Household_member values 1, 2, 3; and so on. Some households have up to 15 members.
How do I create a subset of 100 households by SERIALNO that includes all of their household members, as shown below? Please help with the SAS code to subset this dataset.
SERIALNO Household_member
20161 1
20161 2
20161 3
20162 1
20164 1
20164 2
20164 3
Hello @UPRETIGOPI,
I think a random sample using households as sampling units matches your description, except that only the variable SERIALNO, not Household_member, would play a special role in the sampling process.
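/* Simple random sample of 100 households; all rows of a selected SERIALNO are kept */
proc surveyselect data=have
    method=srs n=100 seed=2718 out=want;
  cluster serialno;  /* households (SERIALNO) are the sampling units */
run;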
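If it helps, here is a minimal sketch for checking the result, assuming the input and output datasets are named have and want as in the step above. The column aliases n_households, n_have and n_want are just illustrative names. The first query should report 100 households, and the second should return no rows if every selected household kept all of its members.

```sas
/* Sketch only: assumes HAVE is the full dataset and WANT is the sampled subset */
proc sql;
  /* Number of distinct households in the subset; expected to be 100 */
  select count(distinct serialno) as n_households
    from want;

  /* Households whose row count differs between HAVE and WANT; expect no rows */
  select h.serialno, h.n_have, w.n_want
    from (select serialno, count(*) as n_have from have group by serialno) as h
         inner join
         (select serialno, count(*) as n_want from want group by serialno) as w
         on h.serialno = w.serialno
    where h.n_have ne w.n_want;
quit;
```

Because the seed is fixed, rerunning the sampling step on the same input should select the same households, which is convenient for repeatable test cases.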