## Determining Significance

Regular Contributor
Posts: 206

We have a dataset where we may not have enough power to detect a statistical difference in up-to-date status between racial/ethnic groups (sample sizes for some groups are <400).

The raw dataset has birthdates and the dates each child received each immunization.  I know how to create a variable (1 or 0) indicating whether a child was up-to-date for a particular vaccine or vaccine series by a certain age (e.g., 3 months).

I inherited a SAS program that uses PROC SURVEYMEANS with the RATIO statement to create a confidence interval for each of the required vaccines at certain age checkpoints.  Problem is, since we have overlapping confidence intervals, there may not be enough power to detect a statistical difference.

How do I determine whether, for example, the percentage of children up-to-date in a particular racial/ethnic group (e.g., Blacks/African Americans) is statistically different from the percentage of children up-to-date in the group of 'All children' in the sample (i.e., Asians, Blacks, Whites, etc.)?

I'm familiar with chi-squared but isn't that only for categorical variables?

Super User
Posts: 13,508

## Re: Determining Significance

You may want to look at T statistics from the Ratio to test whether the ratios are significantly different than 1.

Regular Contributor
Posts: 206

## Re: Determining Significance

Would that be a dependent t-test, since the 'All racial/ethnic groups' proportion depends on the 'Blacks/African Americans' proportion?  Thanks!

Regular Contributor
Posts: 206

## Re: Determining Significance

I have one proportion for UTD 'Blacks/AFAMs' and one for 'All racial/ethnic groups'. Would the final dataset I run the t-test on have just two values?  One for 'Black/AFAM' and one for 'All racial/ethnic groups'?

The examples I see here (SAS Annotated Output: Proc ttest)  have students with different test scores.
Whereas, if I were to look at each student, I would just have a value of 1 (indicating up to date for a vaccine) or 0.

Super User
Posts: 13,508

## Re: Determining Significance

Show us the SURVEYMEANS code you are starting with. You don't want to jump directly to proc ttest as the assumptions behind variance are likely not met when using complex survey data (though the procedure is remarkably robust even so).

What you want to do is add some options to the RATIO statement in Surveymeans to request t-statistics.

Regular Contributor
Posts: 206

## Re: Determining Significance

Ok. It is below.

The dataset it uses looks like the following, after the code converts the dates the kids received the vaccines to a 1 or 0 status indicating whether they are up to date by a certain age:

Race codes:  1=Asian, 2=Black, 3=White

student    race   utd_vax1_3mo  utd_vax2_3mo ...utd_series1_3mo

1          1        1            1               1

2          1        0            0               0

3          2        1            1               1

4          2        1            0               1

etc.....

This dataset is then appended to a copy of itself in which the race variable is recoded to 0 for 'all races'.

proc surveymeans data=vax ratio clm nobs;

var utd_vax1_3mo  utd_vax2_3mo  utd_series1_3mo;

domain raceeth;

ratio utd_vax1_3mo utd_vax2_3mo utd_series1_3mo / records;

strata stratum;

weight sampleweight;

ods output domainratio = output;

run;

So, if I wanted to compare the proportion up-to-date for Black vs All, maybe I can use Chi-sq since the up-to-date status is a dichotomous variable?
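On the chi-square question: a 0/1 up-to-date flag is itself a categorical variable, so a chi-square test on the 2x2 table of group by status does apply, and it is equivalent to a two-proportion z-test. The sketch below is only an arithmetic illustration in Python. The counts are hypothetical, the comparison group is taken to be "all other children" rather than "All children" (so that the two groups do not share observations), and the calculation is unweighted, so it ignores the strata and sample weights. For the survey data itself, a design-based test such as the Rao-Scott chi-square from PROC SURVEYFREQ would be the closer analogue.

```python
import math

# Hypothetical unweighted counts, for illustration only (not from the thread's data).
utd_black, n_black = 210, 300      # up-to-date / total, Black children
utd_other, n_other = 900, 1200     # up-to-date / total, all other children


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


chi2 = chi2_2x2(utd_black, n_black - utd_black,
                utd_other, n_other - utd_other)

# Equivalent two-proportion z-test using the pooled proportion.
p_black = utd_black / n_black
p_other = utd_other / n_other
p_pooled = (utd_black + utd_other) / (n_black + n_other)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_black + 1 / n_other))
z = (p_black - p_other) / se

# Two-sided p-value; for 1 degree of freedom, chi2 equals z squared.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"chi2 = {chi2:.4f}, z^2 = {z * z:.4f}, p = {p_value:.4f}")
```

The chi-square statistic and the squared z statistic agree, which is why either framing answers the same question for a dichotomous outcome.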

Super User
Posts: 13,508

## Re: Determining Significance

I would likely investigate surveylogistic with model statements like

Model utd_vax1_3mo = Race;

Regular Contributor
Posts: 206

## Re: Determining Significance

Thanks!

I looked into the SAS documentation at the proc surveylogistic example for MEPS data and am wondering how they got the point estimates of Black vs. White (reference group), and American Indian vs White:

http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_surveylogist...

When I run it using the model statement above and the class statement (class utd_vax1_3mo), I only get a point estimate for race.

Super User
Posts: 23,683

## Re: Determining Significance

I'm assuming you're talking about the Odds Ratio table? That would require the variable(s) to be included in the CLASS statement.

Regular Contributor
Posts: 206

## Re: Determining Significance

Thank you!

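proc surveylogistic data=threemonths_utd;

strata stratum1;

/* CLASS belongs inside the procedure and must name the model's predictor */
class race_eth / order=internal ref=first;

/* SAS names cannot start with a digit, so 3_vax1 is written here as the utd_vax1_3mo flag from the earlier post */
model utd_vax1_3mo (descending) = race_eth;

weight sw;

run;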

Super User
Posts: 23,683

## Re: Determining Significance

If you have overlapping confidence intervals, you don't have a statistically significant difference between the point estimates.

```jcis7 wrote:

Problem is, since we have overlapping confidence intervals, there may not be enough power to detect a statistical difference.

```
Super User
Posts: 13,508

## Re: Determining Significance

If the CIs do not overlap, then the difference is significant, but non-overlap is not a requirement for significance.

Consider:

data scores;

input Gender $ Score @@;

datalines;

f 75  f 76  f 80  f 77  f 80  f 77  f 73

m 82  m 76  m 84  m 85  m 78  m 87  m 82

;

run;

proc ttest data=scores cochran ci=equal umpu;

class Gender;

var Score;

run;

The CIs for the two group means do overlap, but at an alpha of 0.05 the difference would be considered significant. If there is a "small" overlap, there might still be a significant difference. How small varies with sample size and the underlying distributions.
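The arithmetic behind that example can be checked by hand outside SAS. The following is a minimal Python sketch, not the PROC TTEST implementation: it computes the per-group 95% intervals and the pooled-variance t statistic for the same scores, with the Student's t critical values hard-coded from standard tables (an assumption made to keep the sketch free of third-party libraries).

```python
import math
import statistics

# Scores copied from the DATA step above.
f = [75, 76, 80, 77, 80, 77, 73]
m = [82, 76, 84, 85, 78, 87, 82]

# Two-sided 5% critical values of Student's t, hard-coded from standard tables.
T_CRIT_DF6 = 2.4469    # per-group CI, n - 1 = 6 degrees of freedom
T_CRIT_DF12 = 2.1788   # pooled test, n_f + n_m - 2 = 12 degrees of freedom


def mean_ci(x, t_crit):
    """95% confidence interval for the mean of one group."""
    half_width = t_crit * statistics.stdev(x) / math.sqrt(len(x))
    centre = statistics.mean(x)
    return (centre - half_width, centre + half_width)


ci_f = mean_ci(f, T_CRIT_DF6)
ci_m = mean_ci(m, T_CRIT_DF6)
intervals_overlap = ci_f[1] > ci_m[0] and ci_m[1] > ci_f[0]

# Pooled-variance (equal variances) two-sample t statistic.
n_f, n_m = len(f), len(m)
pooled_var = ((n_f - 1) * statistics.variance(f)
              + (n_m - 1) * statistics.variance(m)) / (n_f + n_m - 2)
se_diff = math.sqrt(pooled_var * (1 / n_f + 1 / n_m))
t_stat = (statistics.mean(f) - statistics.mean(m)) / se_diff
significant = abs(t_stat) > T_CRIT_DF12

print(f"CI f: ({ci_f[0]:.2f}, {ci_f[1]:.2f})")
print(f"CI m: ({ci_m[0]:.2f}, {ci_m[1]:.2f})")
print(f"overlap: {intervals_overlap}, t = {t_stat:.2f}, significant: {significant}")
```

The individual intervals overlap because each one reflects only its own group's standard error, while the test uses the standard error of the difference, which is smaller than the sum of the two half-widths.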
