I am analyzing data from a subgroup of a complex, multistage survey: NHANES. The analytic documentation on the NHANES site says that SAS 9.1 and 9.2 do not calculate variance correctly because they do not calculate degrees of freedom correctly. It goes on to explain that these versions of SAS do not account for strata and PSUs with missing data. My question is whether SAS 9.3 has fixed this miscalculation. I have included the NHANES analytic notes below in case they are helpful. Thanks!!

Key Concepts About Degrees of Freedom for Performing Statistical Tests and Calculating Confidence Limits

Degrees of Freedom and NHANES Subgroups

Estimates are often calculated for various subgroups of interest within the total NHANES population. When the number of first stage sampling units (PSUs) is small, the z-statistic should be replaced by a value from a t-distribution when computing confidence limits for these estimates (see SUDAAN 1995; ref from NHANES III analytic guidelines). To calculate the correct value for the t-statistic from a t-distribution and a selected level of significance, you must calculate the proper degrees of freedom for the estimate.

In addition, it is important to examine the number of degrees of freedom on which a standard error estimate is based. Continuing research on issues related to the stability of variance estimates in subdomains of NHANES has been published and shows that standard error estimates based on small numbers of paired PSUs (i.e., degrees of freedom) are prone to instability. The reliability of the estimated standard error, as measured by its relative standard error (i.e., (standard error of the standard error of the estimate/standard error of the estimate)*100), is inversely proportional to its degrees of freedom. As the number of degrees of freedom increases, the relative standard error decreases and the reliability of the estimate increases. The NHANES guidelines recommended a relative standard error of at most 30%.
This corresponds to at least 12 degrees of freedom. Degrees of freedom are properly calculated by subtracting the number of clusters in the first level of sampling (strata) from the number of clusters in the second level of sampling (PSUs) for each subgroup you are analyzing, as shown in the equation below.

Equation for Degrees of Freedom

deg of freedom = # of PSUs - # of strata

Differences in Degrees of Freedom for Subgroups in SUDAAN and SAS Survey Procedures

For both SUDAAN and SAS Survey procedures, the degrees of freedom are calculated in the same way when looking at the entire sample population or at subgroups where all strata and PSUs are represented. However, when you analyze data on a subgroup of sample persons who may not be represented in all strata and PSUs (e.g., Mexican Americans), the degrees of freedom provided in the output may differ. For example, SUDAAN will correctly count the number of PSUs and strata with at least one valid observation for each cell of the table being requested. In contrast, SAS 9.1 Survey procedures, such as proc surveymeans, compute the degrees of freedom as the number of clusters (PSUs) in the non-empty strata minus the number of non-empty strata. This means that if your data have empty strata (no persons in the population for either PSU), the number of degrees of freedom will increase. This is incorrect and SAS is currently working on correcting this problem.

For more information on methods of correctly calculating degrees of freedom using SAS 9.1 Survey procedures, please see the following two SAS 9.1 Survey procedures macros.
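
Separately from those macros, the counting rule in the notes (PSUs with at least one valid observation minus strata with at least one valid observation) can be illustrated with a rough Python sketch. This is not SAS or SUDAAN code and says nothing about how either package is implemented. The function name, the record layout and the field names `stratum`, `psu` and `group` are all hypothetical, and the data are made up.

```python
def subgroup_degrees_of_freedom(records, in_subgroup):
    """Return (# of PSUs) - (# of strata), counting only PSUs and strata
    that contain at least one record accepted by `in_subgroup`."""
    strata = set()
    psus = set()
    for rec in records:
        if in_subgroup(rec):
            strata.add(rec["stratum"])
            # Assumption: PSU ids repeat across strata, so a PSU is
            # identified by the (stratum, psu) pair.
            psus.add((rec["stratum"], rec["psu"]))
    return len(psus) - len(strata)


# Made-up design: 3 strata with 2 PSUs each, one record per PSU.
records = [
    {"stratum": 1, "psu": 1, "group": "A"},
    {"stratum": 1, "psu": 2, "group": "A"},
    {"stratum": 2, "psu": 1, "group": "A"},
    {"stratum": 2, "psu": 2, "group": "B"},
    {"stratum": 3, "psu": 1, "group": "B"},
    {"stratum": 3, "psu": 2, "group": "B"},
]

# Full sample: 6 PSUs - 3 strata.
full_df = subgroup_degrees_of_freedom(records, lambda r: True)

# Subgroup "A" appears in 3 PSUs spread over 2 strata.
group_a_df = subgroup_degrees_of_freedom(records, lambda r: r["group"] == "A")

print("full sample df:", full_df)
print("subgroup A df:", group_a_df)
```

In this toy design the full sample has 3 degrees of freedom, while subgroup A has only 1, because it is observed in 3 PSUs across 2 strata. The predicate passed as `in_subgroup` is where a "valid observation" rule (for example, a non-missing analysis variable) would go.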