In working on weighted descriptive statistics, I started with PROC SURVEYFREQ because I want to analyze one variable within a subsample of the population: more specifically, among those who are physically active, how many are White, Black, Hispanic, and Asian.

Putting in a WHERE statement gives this note:

    NOTE: The input data set is subset by WHERE, OBS, or FIRSTOBS. This provides a completely separate analysis of the subset. It does not provide a statistically valid subpopulation or domain analysis, where the total number of units in the subpopulation is not known with certainty. If you want a domain analysis, you should include the domain variables in the TABLES request.

I changed the code to PROC SURVEYMEANS and used the DOMAIN statement instead of the WHERE statement. The variables I generated means for were ACTIVE_CAT, INSUFFICIENT_CAT, and INACTIVE_CAT, each coded 1 if the respondent falls in that category of the main outcome variable and 0 if not. For example, the code for White non-Hispanic is provided below:

    PROC SURVEYMEANS DATA = FINAL;
      DOMAIN WHITE_MEPS;
      VAR ACTIVE_CAT INSUFFICIENT_CAT INACTIVE_CAT;
      WEIGHT "WEIGHT";
    RUN;

ACTIVE_CAT: 1 is active; 0 is either inactive or insufficient
INSUFFICIENT_CAT: 1 is insufficient; 0 is either active or inactive
INACTIVE_CAT: 1 is inactive; 0 is either active or insufficient

I ran the same code with DOMAIN set to Black non-Hispanic, Asian non-Hispanic, and Hispanic. Output is below:

    WHITE_NH  Variable          N      Mean      Std Error of Mean  95% CL for Mean
    0         ACTIVE_CAT        31055  0.426972  0.003908           0.41931346  0.43463105
              INSUFFICIENT_CAT  31055  0.191087  0.003140           0.18493214  0.19724246
              INACTIVE_CAT      31055  0.381940  0.003847           0.37440091  0.38947999
    1         ACTIVE_CAT        38705  0.435237  0.003391           0.42859149  0.44188345
              INSUFFICIENT_CAT  38705  0.1923
              INACTIVE_CAT      38705  0.3724

Is it correct that the bolded text (WHITE_NH = 1) would provide the proper distribution among the subsample?
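For comparison, the log note's own suggestion (putting the domain variable in the TABLES request) can be sketched in PROC SURVEYFREQ as below. This is only a sketch under assumptions: WT is a hypothetical name standing in for the actual survey weight variable, and WHITE_NH is taken from the output above as the domain indicator. Because each outcome indicator is coded 0/1, its weighted mean within a domain is the estimated proportion in that category, so the row percentages for ACTIVE_CAT = 1 would be expected to line up with the SURVEYMEANS domain means.

```sas
/* Sketch only: WT is a hypothetical name for the survey weight variable. */
proc surveyfreq data=FINAL;
    /* Crossing the domain variable with the outcome keeps every respondent
       in the analysis. The ROW option requests row percentages, that is,
       the distribution of ACTIVE_CAT within each level of WHITE_NH. */
    tables WHITE_NH*ACTIVE_CAT / row;
    weight WT;
run;
```

The same TABLES request can be repeated with INSUFFICIENT_CAT and INACTIVE_CAT, or with the other race/ethnicity indicators in place of WHITE_NH.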