ngordon_kpdor
Calcite | Level 5

I want to use SAS PROC SURVEYREG to produce prevalence estimates for 4 race groups, age-standardized to a specific Census year. I've used the model from the NHANES tutorial to do this with data from a single survey cycle, or without controlling for survey cycle. In this case, however, I'm pooling data from 3 survey cycles, and I can't find an example that shows how to control for survey cycle. NOTE: I don't want age-standardized estimates for each race group in each separate survey year, just a single set of age-standardized estimates for each race group based on the pooled data but controlling for differences in survey year. Thanks. Nancy

ballardw
Super User

Controlling for which differences in the survey year? I may be confused by your phrasing.

I think what you want could be done a number of ways. One, and the easiest, would be to pick the population from the middle of the Census data cycle, which makes the most sense if you are dealing with 3 consecutive years (not stated in your problem). Another would be to combine the Census data for the three years, using either an average rate or a sum, depending on how you are using the reference population.

There are some nuances that depend on your data and the options used. For instance, have the weights in your combined survey data been adjusted for the combination? If not, population estimates based on sums of weights could come out at approximately three times your actual population total.
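
As a rough illustration of the weight adjustment mentioned above: if each cycle's weights were built to sum to the full population on their own, one common convention is to divide each weight by the number of pooled cycles. The sketch below assumes a pooled data set named pooled (a hypothetical name) and uses the survwt and survyr variable names that appear later in this thread. Whether this rescaling applies at all depends on how the weights were constructed.

```sas
/* Hypothetical data set names: pooled (input), pooled_adj (output). */
data pooled_adj;
  set pooled;
  /* Three cycles are pooled, so each cycle contributes one third. */
  survwt_adj = survwt / 3;
run;

/* Compare the weight sums within each cycle before and after adjustment. */
proc means data=pooled_adj sum;
  class survyr;
  var survwt survwt_adj;
run;
```

If the weights were already post-stratified on the pooled file to a single population total, no division is needed and this step should be skipped.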

ngordon_kpdor
Calcite | Level 5
I guess my question was unclear.
I conducted health surveys of our health plan members in 2011, 2014, and 2017. I pooled the data from these 3 survey cycles in order to have enough Blacks, Latinos, Filipinos, and Chinese who were aged 35-64 at the time of the survey. I created a post-stratification weighting factor for the pooled survey data that weights each racial group to the age distribution of that group in the health plan in 2016. To compare Whites, Blacks, Latinos, Filipinos, and Chinese, I used PROC SURVEYREG (following the NHANES tutorial) to age-standardize the estimated prevalence and to test for differences between Whites and the other race/ethnic groups.
I used the SAS code below to age-standardize. The decimal numbers after age3564gp are the proportions of the 35-64 population falling in the 35-44, 45-54, and 55-64 age groups in the 2016 American Community Survey. I used survyr as my stratum variable. survwt is my survey weighting factor.
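
For reference, the direct age standardization that these coefficients implement can be written as follows. The notation is introduced here for illustration only: $\hat{p}_{rj}$ stands for the model-based prevalence for race group $r$ in age group $j$, and $w_j$ for the standard-population share of age group $j$.

```latex
\hat{p}_r^{\,\mathrm{std}} = \sum_{j=1}^{3} w_j \,\hat{p}_{rj},
\qquad (w_1, w_2, w_3) = (0.3248,\ 0.3430,\ 0.3322),
\qquad \sum_{j=1}^{3} w_j = 1
```

Because the three weights sum to 1, each standardized estimate is a weighted average of that race group's age-specific prevalences.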
I think I'm probably OK just pooling the data and not controlling for survey year, because most of my health behavior variables show little variation across the survey years. For some outcomes, however, the difference across the 3 survey cycles may be large enough that it would be good to be able to say I controlled for survey year in the analysis. That's what I'm not sure how to do.
proc surveyreg;
domain sex;
strata survyr;
class race age3564gp;
weight survwt;
format sex sex.;
model fvegge3dy = race age3564gp race*age3564gp / noint solution clparm;
estimate "WhiteNH 35-64" race 1 0 0 0 0 0 age3564gp .3248 .3430 .3322
race*age3564gp .3248 .3430 .3322 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 /cl;
estimate "Black 35-64" race 0 1 0 0 0 0 age3564gp .3248 .3430 .3322
race*age3564gp 0 0 0 .3248 .3430 .3322 0 0 0 0 0 0 0 0 0 0 0 0 /cl;
estimate "Latino 35-64" race 0 0 1 0 0 0 age3564gp .3248 .3430 .3322
race*age3564gp 0 0 0 0 0 0 .3248 .3430 .3322 0 0 0 0 0 0 0 0 0 /cl;
estimate "Filipino 35-64" race 0 0 0 1 0 0 age3564gp .3248 .3430 .3322
race*age3564gp 0 0 0 0 0 0 0 0 0 .3248 .3430 .3322 0 0 0 0 0 0 /cl;
estimate "Chinese 35-64" race 0 0 0 0 1 0 age3564gp .3248 .3430 .3322
race*age3564gp 0 0 0 0 0 0 0 0 0 0 0 0 .3248 .3430 .3322 0 0 0 /cl;
estimate "SoAsn 35-64" race 0 0 0 0 0 1 age3564gp .3248 .3430 .3322
race*age3564gp 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 .3248 .3430 .3322 /cl;

estimate "Wh v Black 35-64" race 1 -1 0 0 0 0
race*age3564gp .3248 .3430 .3322 -.3248 -.3430 -.3322 0 0 0 0 0 0 0 0 0 0 0 0/cl;
estimate "Wh v Latino 35-64" race 1 0 -1 0 0 0
race*age3564gp .3248 .3430 .3322 0 0 0 -.3248 -.3430 -.3322 0 0 0 0 0 0 0 0 0/cl;
estimate "Wh v Filipino 35-64" race 1 0 0 -1 0 0
race*age3564gp .3248 .3430 .3322 0 0 0 0 0 0 -.3248 -.3430 -.3322 0 0 0 0 0 0/cl;
estimate "Wh v Chinese 35-64" race 1 0 0 0 -1 0
race*age3564gp .3248 .3430 .3322 0 0 0 0 0 0 0 0 0 -.3248 -.3430 -.3322 0 0 0/cl;
estimate "Wh v SoAsn 35-64" race 1 0 0 0 0 -1
race*age3564gp .3248 .3430 .3322 0 0 0 0 0 0 0 0 0 0 0 0 -.3248 -.3430 -.3322/cl;
run;
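
One possible way to say the estimates are adjusted for survey year is to add survyr as a categorical main effect and give its three levels equal coefficients in each ESTIMATE statement, so that the age-standardized prevalence is averaged over cycles. This is a sketch under assumptions, not a tested solution: it assumes survyr has exactly three levels, that weighting the cycles equally is what is wanted, and that every race by age by year cell contains data. The data set name pooled is hypothetical, and only two of the ESTIMATE statements are shown.

```sas
proc surveyreg data=pooled;   /* pooled is a hypothetical data set name */
  domain sex;
  strata survyr;
  class race age3564gp survyr;
  weight survwt;
  format sex sex.;
  /* survyr is added as a main effect to adjust for survey cycle */
  model fvegge3dy = race age3564gp race*age3564gp survyr / noint solution clparm;
  /* survyr coefficients sum to 1, averaging over the three cycles */
  estimate "WhiteNH 35-64, yr-adj"
    race 1 0 0 0 0 0
    age3564gp .3248 .3430 .3322
    race*age3564gp .3248 .3430 .3322 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
    survyr .3333 .3333 .3334 / cl;
  /* In a difference between groups, the age3564gp and survyr terms cancel */
  estimate "Wh v Black 35-64, yr-adj"
    race 1 -1 0 0 0 0
    race*age3564gp .3248 .3430 .3322 -.3248 -.3430 -.3322 0 0 0 0 0 0 0 0 0 0 0 0 / cl;
run;
```

The remaining race groups would follow the same pattern. If the cycles should instead be weighted by their share of the pooled sample or of a reference population, the survyr coefficients would change accordingly while still summing to 1.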
