Determining Significance


06-11-2015 06:21 PM

Hello - any help you can give will be much appreciated!

We have a dataset where we may not have enough power to detect a statistically significant difference in up-to-date status between racial/ethnic groups (sample sizes for some groups are <400).

The raw dataset has birthdates and the dates each child received each immunization. I know how to create a variable (1 or 0) indicating whether a child was up-to-date for a particular vaccine or vaccine series by a certain age (e.g., 3 months).

I inherited a SAS program that uses proc surveymeans, the ratio statement and creates a confidence interval for each of the required vaccines at certain age checkpoints. Problem is, since we have overlapping confidence intervals, there may not be enough power to detect a statistical difference.

How do I determine whether, for example, the percentage of children up-to-date in a particular racial/ethnic group (e.g., Blacks/African Americans) is statistically different from the percentage of children up-to-date among 'All children' in the sample (i.e., Asians, Blacks, Whites, etc.)?

I'm familiar with chi-squared but isn't that only for categorical variables?

Any help you can give is much appreciated! Thanks!


Posted in reply to jcis7

06-11-2015 06:42 PM

You may want to look at T statistics from the Ratio to test whether the ratios are significantly different than 1.


Posted in reply to ballardw

06-12-2015 11:10 AM

Would that be a dependent t-test, since the proportion for 'All racial/ethnic groups' depends on the proportion for 'Blacks/African Americans'? Thanks!


Posted in reply to ballardw

06-12-2015 01:55 PM

I have one proportion for UTD 'Blacks/AFAMs' and one for 'All racial/ethnic groups'. Would the final dataset I run the t-test on have just two values: one for 'Black/AFAM' and one for 'All racial/ethnic groups'?

The examples I see here (SAS Annotated Output: Proc ttest) have students with different test scores.

Whereas, if I were to look at each student, I would just have a value of 1 (indicating up to date for a vaccine) or 0.


Posted in reply to jcis7

06-12-2015 03:40 PM

Show us the SURVEYMEANS code you are starting with. You don't want to jump directly to proc ttest as the assumptions behind variance are likely not met when using complex survey data (though the procedure is remarkably robust even so).

What you want to do is add some options to the RATIO statement in Surveymeans to request t-statistics.

Or possibly add some additional analysis variables.
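As one possible direction (not the RATIO-statement options mentioned above), a 0/1 up-to-date indicator can be analyzed as a mean, because its weighted mean is the weighted proportion up-to-date. The sketch below assumes a SAS/STAT release that supports the DIFFMEANS option on the DOMAIN statement (I believe 14.1 and later, so it may be unavailable on older installations). The dataset and variable names are placeholders matching the code posted later in this thread, and the data are assumed to hold one record per child.

```sas
/* Sketch only: names (vax, stratum, sampleweight, raceeth,
   utd_vax1_3mo) are assumed, not confirmed by the poster. */
proc surveymeans data=vax mean clm nobs;
  strata stratum;
  weight sampleweight;
  /* Mean of a 0/1 indicator = weighted proportion up-to-date */
  var utd_vax1_3mo;
  /* DIFFMEANS requests pairwise differences between domain means,
     each with a t test; assumed available in this release */
  domain raceeth / diffmeans;
run;
```

This compares pairs of racial/ethnic groups with each other; it does not compare one group against the overall total.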


Posted in reply to ballardw

06-12-2015 05:19 PM

Ok. It is below.

The dataset it uses looks like the following, after the code converts the dates the kids received the vaccines to a 1 or 0 status indicating whether they are up to date by a certain date:

Race codes: 1=Asian, 2=Black, 3=White

student  race  utd_vax1_3mo  utd_vax2_3mo  ...  utd_series1_3mo
1        1     1             1                  1
2        1     0             0                  0
3        2     1             1                  1
4        2     1             0                  1
etc.

This dataset is then appended to a copy of itself in which the race variable is recoded to 0 for 'all races'.

proc surveymeans data=vax ratio clm nobs;

var utd_vax1_3mo utd_vax2_3mo utd_series1_3mo;

domain raceeth;

ratio utd_vax1_3mo utd_vax2_3mo utd_series1_3mo / records;

strata stratum;

weight sampleweight;

ods output domainratio = output;

run;

So, if I wanted to compare the proportion up-to-date for Black vs All, maybe I can use Chi-sq since the up-to-date status is a dichotomous variable?
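Up-to-date status coded 1/0 is itself a categorical variable, so a design-based chi-square test is one reasonable option. The sketch below uses PROC SURVEYFREQ, whose CHISQ option requests a Rao-Scott chi-square test. It reuses the names from the SURVEYMEANS code above and makes several assumptions: that `raceeth` is the race variable, that `raceeth = 0` marks the stacked 'all races' copy, and that code 2 is Black. Because the 'All' group contains the Black group, the two are not independent, so the sketch compares Black children against all other children instead; `vax_cmp` and `black` are hypothetical names introduced for illustration.

```sas
/* Sketch only: variable codings are assumptions (see lead-in). */
data vax_cmp;
  set vax;
  where raceeth ne 0;        /* drop the stacked "all races" copy */
  black = (raceeth = 2);     /* 1 = Black, 0 = every other group  */
run;

proc surveyfreq data=vax_cmp;
  strata stratum;
  weight sampleweight;
  /* ROW prints row percentages; CHISQ requests the Rao-Scott
     chi-square test of association for each table */
  tables raceeth*utd_vax1_3mo black*utd_vax1_3mo / row chisq;
run;
```

Records with a missing `raceeth` would fall into the `black = 0` group here and may need separate handling.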


Posted in reply to jcis7

06-12-2015 05:32 PM

I would likely investigate surveylogistic with model statements like

Model utd_vax1_3mo = Race;


Posted in reply to ballardw

06-17-2015 06:10 PM

Thanks!

I looked into the SAS documentation at the proc surveylogistic example for MEPS data and am wondering how they got the point estimates for Black vs. White (the reference group) and American Indian vs. White.

When I run it using the model statement above and the class statement (class utd_vax1_3mo), I only get a point estimate for race.


Posted in reply to jcis7

06-17-2015 06:32 PM

Can you post your code?

I'm assuming you're talking about the Odds Ratio table? That would require the variable(s) to be included in the CLASS statement.


Posted in reply to Reeza

06-25-2015 05:48 PM

Thank you!

class eth / order=internal ref=first;

proc survey logistic data=threemonths_utd;

strata stratum1;

model 3_vax1 (descending)=race_eth;

weight=sw;

run;
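As posted, this code is unlikely to run: the CLASS statement sits outside the procedure, the procedure name is one word (SURVEYLOGISTIC), WEIGHT takes no equals sign, a SAS variable name cannot begin with a digit under the default naming rules, and the CLASS variable (`eth`) does not match the model predictor (`race_eth`). A minimal corrected sketch follows. It assumes the outcome is named `utd_vax1_3mo` as in the earlier post, that `race_eth` is the intended predictor, and that White is coded 3 and should be the reference group; none of these is confirmed in the thread.

```sas
/* Sketch only: utd_vax1_3mo, race_eth, and the reference level '3'
   are assumptions based on earlier posts in this thread. */
proc surveylogistic data=threemonths_utd;
  strata stratum1;
  weight sw;
  /* Put the predictor (not the outcome) in CLASS so that each race
     level gets its own estimate against the reference level */
  class race_eth (ref='3') / param=ref;
  /* Model the probability of being up-to-date (outcome = 1) */
  model utd_vax1_3mo (event='1') = race_eth;
run;
```

With the predictor in CLASS, the Odds Ratio Estimates table should list each race level against the reference level rather than a single estimate for race.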


Posted in reply to jcis7

06-12-2015 04:01 PM

If you have overlapping confidence intervals you don't have statistically significant difference between the point estimates.

jcis7 wrote:

Problem is, since we have overlapping confidence intervals, there may not be enough power to detect a statistical difference.


Posted in reply to Reeza

06-12-2015 05:15 PM

Pretty much: if the CIs do not overlap then the difference is significant, but non-overlap is not a requirement for significance.

Consider:

data scores;

input Gender $ Score @@;

datalines;

f 75 f 76 f 80 f 77 f 80 f 77 f 73

m 82 m 76 m 84 m 85 m 78 m 87 m 82

;

run;

proc ttest data=scores cochran ci=equal umpu;

class Gender;

var Score;

run;

The CIs do overlap, but at an alpha of 0.05 the difference would be considered significant. If there is a "small" overlap there might still be a significant difference. How small varies with sample size and the underlying distributions.
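A rough way to see why overlap and significance can coexist, assuming two independent estimates $\hat{p}_1$ and $\hat{p}_2$ with standard errors $s_1$ and $s_2$ and a common normal critical value $z$ (a simplification; the t test above uses pooled variances and t critical values):

```latex
\begin{align*}
\text{intervals overlap} &\iff |\hat{p}_1 - \hat{p}_2| < z\,(s_1 + s_2) \\
\text{difference is significant} &\iff |\hat{p}_1 - \hat{p}_2| > z\,\sqrt{s_1^2 + s_2^2} \\
\sqrt{s_1^2 + s_2^2} &\le s_1 + s_2
\end{align*}
```

Any difference that falls between $z\sqrt{s_1^2 + s_2^2}$ and $z\,(s_1 + s_2)$ is therefore significant even though the two intervals overlap.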