Chi square test for difference in proportions betw...
12-14-2009 03:52 PM

Hi,

I want to test whether there is a significant difference in proportions between two groups at each level of a 20-level variable. I've been trying to use PROC SURVEYFREQ with the code below, but it isn't producing the chi-square test results (only frequencies and standard errors for each level of the variable).

proc surveyfreq data=new;

tables group*variable / chisq;

run;

Does anyone have any suggestions for how to do this and ensure that the hypotheses I'm testing are for differences at each level of the 20-level variable between the two groups (i.e., so that I get a separate p-value for each of the 20 levels)?

Thanks!!

Nicole


12-15-2009 06:46 PM

> Hi,

>

> I want to test whether there is a significant

> difference in proportions between two groups at each

> level of a 20-level variable. I've been trying to

> use proc surveyfreq with the code below, but it isn't

> providing the results for the chisquare test (only

> frequencies and standard errors for each level of the

> variable).

>

> proc surveyfreq data=new;

> tables group*variable / chisq;

> run;

>

> Does anyone have any suggestions for how to do this

> and ensure that the hypotheses I'm testing are for

> differences at each level of the 20-level variable

> between the two groups (ie, so I get a separate

> p-value for each of the 20 levels)?

> Thanks!!

> Nicole

My first question is why you are using SURVEYFREQ when you have no CLUSTER or STRATA statements. You could just use FREQ.

My second question is whether you could show us some of the log. Are there any errors or warnings? You ought to get some sort of chi-square test with this code.

Third, AFAIK you don't get a separate p-value for each of the 20 levels. I know you don't get this with FREQ. If you want something like this, you need a different PROC, perhaps LOGISTIC or SURVEYLOGISTIC.

Finally, what's your sample size? 20 levels is a lot.
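
For the first point, a minimal sketch of the plain FREQ version, assuming the same data set and variable names as in the original post (NEW, GROUP, VARIABLE). It gives one overall chi-square test for the whole table, not one p-value per level:

```sas
/* Sketch only: NEW, GROUP and VARIABLE are the names used in the original post. */
proc freq data=new;
  /* CHISQ requests the Pearson chi-square for the 2 x 20 table: */
  /* a single overall test with (2-1)*(20-1) = 19 degrees of freedom. */
  tables group*variable / chisq;
run;
```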

HTH

Peter


12-16-2009 06:05 PM

Proportions different from what?

If you want to see if the proportions are equal to one another at each level, then you are effectively testing that the proportion in group 1 is equal to 0.50 at each level. You can do that with PROC FREQ (using the BINOMIAL option on the TABLES statement) with the 20-level variable as a BY variable.

If one of your levels is "control" and the other 19 are various "treatments," you could test whether each treatment is different from the control level using the CONTRAST statement in GENMOD.

The tests are asymptotic, so you'll need a fairly large sample size.
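
A rough sketch of the BINOMIAL approach, assuming the data set and variable names from the original post (NEW, GROUP, VARIABLE); NEW_SORTED is a made-up name for the sorted copy that BY processing needs:

```sas
/* Sketch only: NEW, GROUP, VARIABLE come from the original post; */
/* NEW_SORTED is a hypothetical name for the sorted copy.         */
proc sort data=new out=new_sorted;
  by variable;                 /* BY-group processing needs sorted data */
run;

proc freq data=new_sorted;
  by variable;                 /* one analysis per level of the 20-level variable */
  /* Tests H0: proportion in the first level of GROUP = 0.5 within each BY group. */
  tables group / binomial(p=0.5);
run;
```

P= sets the null proportion, so a value other than 0.5 can be used if equal shares are not the right null.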

Doc Muhlbaier

Duke


12-21-2009 03:39 PM

Thanks so much for your response. Here are a few points that hopefully clarify what I'm trying to get at:

I have two groups that differ in size as well as along my main covariate of interest (n=386 in Group 1 and n=737 in Group 2). I would like to see whether the proportion of subjects (out of the total number of subjects in each Group) differs significantly between groups at each Level of a 20-level variable. None of the Levels of the 20-level variable is a control or treatment group. So essentially I want to test whether:

Proportion at Level 1 (out of the total n=386 in Group 1) = Proportion at Level 1 (out of the total n=737 in Group 2)

...etc...

Proportion at Level 20 (out of the total n=386 in Group 1) = Proportion at Level 20 (out of the total n=737 in Group 2)
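
Written out, these 20 hypotheses and the usual large-sample statistic for each look roughly like this. The notation is assumed here for illustration: n_{gk} is the number of subjects in Group g at Level k.

```latex
% Notation assumed for illustration: n_{gk} = number of subjects in Group g at Level k.
H_0^{(k)}:\; p_{1k} = p_{2k}, \qquad k = 1, \dots, 20,
\qquad \hat p_{1k} = \frac{n_{1k}}{386}, \quad \hat p_{2k} = \frac{n_{2k}}{737}

z_k = \frac{\hat p_{1k} - \hat p_{2k}}
           {\sqrt{\hat p_k \,(1 - \hat p_k)\left(\frac{1}{386} + \frac{1}{737}\right)}},
\qquad \hat p_k = \frac{n_{1k} + n_{2k}}{386 + 737}
```

Here z_k^2 is the 1-df Pearson chi-square for the 2x2 table of Group by "at Level k" versus "not at Level k".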

Would PROC LOGISTIC do the trick (i.e., the code below), by using the Wald chi-square p-value in the output?

proc logistic data=new;

by level;

class level;

model group=level;

run;

Thanks so much for any suggestions you can provide.

NG


12-21-2009 06:18 PM

Hi

You can't have both BY LEVEL and LEVEL in your model statement. That would lead to an error, because you would be stratifying on the variable you are trying to model.

You could do the same thing without the BY statement, or you could do a PROC FREQ with a BY statement.
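
One way the PROC FREQ route could be set up so that each BY group still contains a 2x2 table is sketched below. It assumes LEVEL is numeric and coded 1 to 20 (not stated in the thread); LONG, LVL and AT_LEVEL are made-up names:

```sas
/* Sketch only. Assumes LEVEL is numeric, coded 1-20 (an assumption).   */
/* LONG, LVL and AT_LEVEL are hypothetical names used for illustration. */
data long;
  set new;
  do lvl = 1 to 20;
    at_level = (level = lvl);  /* 1 if the subject is at this level, else 0 */
    output;                    /* one row per subject per level             */
  end;
run;

proc sort data=long;
  by lvl;
run;

proc freq data=long;
  by lvl;                             /* one 2x2 table per level            */
  tables group*at_level / chisq;      /* 1-df chi-square: group vs at-level */
run;
```

That gives 20 separate p-values, one per level. They come from the same subjects, so they are not independent, and some adjustment for multiple testing may be worth considering.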

HTH

Peter


01-15-2010 04:04 PM

In PROC LOGISTIC you would actually make your 20-level variable the response, your binary group variable the predictor, and fit a generalized logit model:

proc logistic;

class group / param=ref;

model level = group / link=glogit;

run;

The tests of the GROUP parameter estimates compare the groups at each level, each relative to the reference level of the response (by default, the last level).
