I need to subset 100 households from a large SAS dataset to prepare input for running test cases.
A household can be identified in the data file because all members of the household share the same SERIALNO. All members of each selected household must be included in the subset. The dataset has 50 variables and many thousands of SERIALNO values, but the subset only needs to be based on two variables, SERIALNO and Household_member. Each SERIALNO represents one household, and Household_member numbers the members within that household. I just need a subset of 100 households (SERIALNO) that keeps every Household_member row for those households, along with the rest of the variables in the dataset.
In the example below, SERIALNO 20161 has Household_member 1, 2, 3; SERIALNO 20162 has Household_member 1 only; SERIALNO 20164 has Household_member 1, 2, 3; and so on. Some households have up to 15 members.
How do I subset 100 households by SERIALNO so that all of their household members are included, as described below? Please help with the SAS program code to subset this dataset.
SERIALNO Household_member
20161 1
20161 2
20161 3
20162 1
20164 1
20164 2
20164 3
Hello @UPRETIGOPI,
I think a random sample using households as sampling units matches your description, except that only variable SERIALNO, but not Household_member, would play a special role in the sampling process.
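/* Simple random sample of 100 sampling units; the seed makes it reproducible */
proc surveyselect data=have
  method=srs n=100 seed=2718 out=want;
  /* Treat each SERIALNO (household) as one sampling unit */
  cluster serialno;
run;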
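To confirm that the sample meets the two requirements in the question (100 households, every member included), a check along the following lines could be run afterwards. This is only a sketch: it assumes the dataset names HAVE and WANT from the step above, and the column aliases n_households, n_have and n_want are made up for illustration.

```sas
/* Sanity checks on the sampled data.
   Assumes HAVE is the full dataset and WANT is the PROC SURVEYSELECT output. */
proc sql;
  /* Number of distinct households in the subset; this should be 100 */
  select count(distinct serialno) as n_households
    from want;

  /* Households whose row count differs between HAVE and WANT.
     No rows should be listed if every household member was carried over. */
  select h.serialno, h.n_have, w.n_want
    from (select serialno, count(*) as n_have
            from have
            group by serialno) as h
         inner join
         (select serialno, count(*) as n_want
            from want
            group by serialno) as w
         on h.serialno = w.serialno
    where h.n_have ne w.n_want;
quit;
```

If the second query lists any SERIALNO, that household is incomplete in WANT and the sampling step should be reviewed.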