06-11-2015 06:21 PM
Hello - any help you can give will be much appreciated!
We have a dataset where we may not have enough power to detect a statistical difference in up-to-date status between racial/ethnic groups (sample sizes for some groups are <400).
The raw dataset has each child's birthdate and the dates the child received each immunization. I know how to create a variable (1 or 0) indicating whether a child was up-to-date for a particular vaccine or vaccine series by a certain age (e.g., 3 months).
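For readers following along, the 1/0 flag described above could be built in a data step roughly like this. It is only a sketch: `raw`, `birthdate`, and `vax1_date` are assumed names for the input dataset and its SAS date variables, not names from the actual program.

```sas
data vax;
    set raw;                      /* hypothetical input dataset */
    /* date on which the child turns 3 months old */
    cutoff_3mo = intnx('month', birthdate, 3, 'same');
    /* 1 if the dose was received on or before the cutoff, otherwise 0 */
    utd_vax1_3mo = (not missing(vax1_date) and vax1_date <= cutoff_3mo);
    format cutoff_3mo date9.;
run;
```

A child with no recorded dose gets 0 here. Whether that is the right treatment of missing dates depends on how the registry records them.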
I inherited a SAS program that uses proc surveymeans with the ratio statement to create a confidence interval for each of the required vaccines at certain age checkpoints. Problem is, since we have overlapping confidence intervals, there may not be enough power to detect a statistical difference.
How do I determine whether, for example, the percentage of children up-to-date in a particular racial/ethnic group (e.g., Blacks/African Americans) is statistically different from the percentage of children up-to-date among 'All children' in the sample (i.e., Asians, Blacks, Whites, etc.)?
I'm familiar with chi-squared but isn't that only for categorical variables?
Any help you can give is much appreciated! Thanks!
06-12-2015 01:55 PM
I have one proportion for UTD 'Blacks/AFAMs' and one for 'All racial/ethnic groups'. Would the ultimate dataset I conduct the ttest on just have two values? One for 'Black/Afam' and one for 'All racial/ethnic groups'?
The examples I see here (SAS Annotated Output: Proc ttest) have students with different test scores.
Whereas, if I were to look at each student, I would just have a value of 1 (indicating up to date for a vaccine) or 0.
06-12-2015 03:40 PM
Show us the SURVEYMEANS code you are starting with. You don't want to jump directly to proc ttest as the assumptions behind variance are likely not met when using complex survey data (though the procedure is remarkably robust even so).
What you want to do is add some options to the RATIO statement in Surveymeans to request t-statistics.
Or possibly add some additional analysis variables.
06-12-2015 05:19 PM
Ok. It is below.
The dataset it uses looks like the following after the code converts the dates the kids received the vaccines to a 1 or 0 status indicating whether they are up to date by a certain date:
Race codes: 1=Asian, 2=Black, 3=White
student  race  utd_vax1_3mo  utd_vax2_3mo  ...  utd_series1_3mo
1        1     1             1                  1
2        1     0             0                  0
3        2     1             1                  1
4        2     1             0                  1
This dataset is then appended to a copy of itself in which the race variable is recoded to 0 for 'all races'.
proc surveymeans data=vax ratio clm nobs;
domain race;
var utd_vax1_3mo utd_vax2_3mo utd_series1_3mo;
ratio utd_vax1_3mo utd_vax2_3mo utd_series1_3mo / records;
ods output domainratio = output;
run;
So, if I wanted to compare the proportion up-to-date for Black vs All, maybe I can use Chi-sq since the up-to-date status is a dichotomous variable?
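Up-to-date status coded 1/0 is itself a categorical variable, so a chi-square test does apply. One way to run it with design-based variances is the Rao-Scott chi-square from PROC SURVEYFREQ. The sketch below rests on some assumptions: the stacked race=0 rows are dropped first so no child is counted twice, and `compare` and `black` are names made up for illustration. Because the 'All' group contains the Black children, the two proportions are not independent, so this compares Black children against all other children instead.

```sas
data compare;
    set vax;
    where race ne 0;        /* drop the stacked 'all races' copy */
    black = (race = 2);     /* 1 = Black, 0 = all other children */
run;

proc surveyfreq data=compare;
    /* add STRATA, CLUSTER and WEIGHT statements here if the design has them */
    tables black*utd_vax1_3mo / row chisq;   /* row percents + Rao-Scott chi-square */
run;
```

With no design statements this should behave much like an ordinary Pearson chi-square from PROC FREQ. The survey version matters once strata, clusters, or weights are involved.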
06-17-2015 06:10 PM
I looked into the SAS documentation at the proc surveylogistic example for MEPS data and am wondering how they got the point estimates of Black vs. White (reference group), and American Indian vs White:
When I run it using the model statement above and the class statement (class utd_vax1_3mo), I only get a point estimate for race.
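A likely reason for the single estimate: with only the outcome on the CLASS statement, race is treated as a continuous predictor and gets one slope. Below is a sketch of a setup that may give one odds ratio per group against White. It assumes race code 3 (White) as the reference level and drops the stacked race=0 rows. No design statements are shown, since none were posted.

```sas
proc surveylogistic data=vax;
    where race ne 0;                          /* exclude the 'all races' copy */
    /* add STRATA, CLUSTER and WEIGHT statements here if the design has them */
    class race (ref='3') / param=ref;         /* race is the classification variable */
    model utd_vax1_3mo (event='1') = race;    /* model the probability of being up to date */
run;
```

The Odds Ratio Estimates table should then list race 1 vs 3 and race 2 vs 3 separately.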
06-12-2015 04:01 PM
If you have overlapping confidence intervals, you don't have a statistically significant difference between the point estimates.
Problem is, since we have overlapping confidence intervals, there may not be enough power to detect a statistical difference.
06-12-2015 05:15 PM
If the CIs do not overlap then the difference is significant, but non-overlap is not a requirement for significance. For example:
data scores;
input Gender $ Score @@;
datalines;
f 75 f 76 f 80 f 77 f 80 f 77 f 73
m 82 m 76 m 84 m 85 m 78 m 87 m 82
;
proc ttest data=scores cochran ci=equal umpu;
class Gender; var Score; run;
The CIs do overlap, but at an alpha of 0.05 the difference would be considered significant. If there is a "small" overlap there might still be a significant difference. How small varies with sample size and the underlying distributions.
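To see the overlap directly, the per-group 95% limits can be printed with PROC MEANS on the same scores. This is a sketch that re-enters the data so it runs on its own. By hand calculation the limits come out to roughly 74.5 to 79.2 for f and 78.4 to 85.6 for m, so they overlap slightly, while the pooled t test on the difference gives a p-value of about 0.01.

```sas
data scores;
    input Gender $ Score @@;
    datalines;
f 75 f 76 f 80 f 77 f 80 f 77 f 73
m 82 m 76 m 84 m 85 m 78 m 87 m 82
;

/* 95% confidence limits for the mean score within each gender */
proc means data=scores n mean std clm alpha=0.05;
    class Gender;
    var Score;
run;
```

The test of the difference uses the standard error of the difference, which is smaller than the sum of the two separate margins of error. That is why a small overlap does not rule out significance.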