I am currently taking a very basic look at variable distributions in (publicly accessible) NHANES data, the National Health and Nutrition Examination Survey. Since my data contains survey weights, I'm using PROC SURVEYMEANS to estimate means for continuous variables and the corresponding standard errors. Unfortunately, I am getting nonsense values for the standard error (and standard deviation) estimates. In an online tutorial page, NHANES provides code and an explanation for using PROC SURVEYMEANS to estimate means and standard errors in their weighted data (see step 2 of http://www.cdc.gov/nchs/tutorials/NHANES/NHANESAnalyses/DescriptiveStatistics/Task3b.htm ). As far as I can tell, I have correctly constructed my weight variable and have appropriately specified the STRATA and CLUSTER statements in PROC SURVEYMEANS.
I can't figure out why, but I get really high values for the standard deviation and really low values for the standard error. The means, however, look correct. I have included my PROC SURVEYMEANS code below. Since this is all publicly accessible data, I have also included my output below.
Thanks for any help.
CODE:
proc surveymeans data=NHANES_DataMerge nobs mean stderr std;
   strata sdmvstra;
   cluster sdmvpsu;
   var LBDLDL RIDAGEYR BMXBMI;
   weight weight;
   ods output Statistics=printdata;
run;

proc print data=work.printdata;
run;
OUTPUT:
Obs  VarName   VarLabel                               N     Mean        StdErr    StdDev
1    LBDLDL    LDL-cholesterol (mg/dL)                5245  127.594611  0.711109  1912657887
2    RIDAGEYR  Age at Screening Adjudicated - Recode  5245  52.729637   0.189132  779619697
3    BMXBMI    Body Mass Index (kg/m**2)              5245  28.604494   0.128091  417023923
When posting links on this forum, make sure there is no punctuation at the end of the link, because it gets treated as part of the URL and leads to "link not found". Place a space before the ")." in the link you posted.
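PROC SURVEYMEANS reports the standard deviation of the SUM (the estimated weighted total) when you request STD, not the standard deviation of the variable itself. That is why the StdDev values are so large while the means and standard errors look reasonable.
You might want to request LCLM and UCLM to get confidence limits for the mean and look at dispersion that way.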
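As a rough sketch of that suggestion, the step below reuses the data set and variable names from the original post, drops STD, and requests the confidence limits instead. The second step is only an assumption about what may be wanted: if the goal is a weighted standard deviation of the variable itself, PROC MEANS with a WEIGHT statement and VARDEF=WDF is one common way to approximate it. It ignores the strata and clusters, so treat it as descriptive only.

```sas
/* Mean, standard error of the mean, and 95% confidence limits, */
/* using the same design statements as the original post.       */
proc surveymeans data=NHANES_DataMerge nobs mean stderr lclm uclm;
   strata sdmvstra;
   cluster sdmvpsu;
   var LBDLDL RIDAGEYR BMXBMI;
   weight weight;
run;

/* Assumption: a weighted standard deviation of each variable is wanted. */
/* VARDEF=WDF divides by the sum of the weights minus one.               */
/* This step does not account for the survey design.                     */
proc means data=NHANES_DataMerge n mean std vardef=wdf;
   var LBDLDL RIDAGEYR BMXBMI;
   weight weight;
run;
```

The confidence limits from the first step describe how precisely the mean is estimated, not the spread of individual values, so the two outputs answer different questions.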