deleted_user
Not applicable
Hi,

I want to test whether there is a significant difference in proportions between two groups at each level of a 20-level variable. I've been trying to use PROC SURVEYFREQ with the code below, but it isn't producing the chi-square test results (only frequencies and standard errors for each level of the variable).

proc surveyfreq data=new;
tables group*variable / chisq;
run;

Does anyone have any suggestions for how to do this and ensure that the hypotheses I'm testing are for differences between the two groups at each level of the 20-level variable (i.e., so I get a separate p-value for each of the 20 levels)?
Thanks!!
Nicole
5 REPLIES
plf515
Lapis Lazuli | Level 10

My first question is why you are using SURVEYFREQ when you have no CLUSTER or STRATA statements. You could just use FREQ.

My second question is whether you could show us some of the log. Are there any errors or warnings? You ought to get some sort of chi-square test with this code.

Third, AFAIK you don't get a separate p-value for each of the 20 levels. I know you don't get this with FREQ. If you want something like this, you need a different PROC. Perhaps LOGISTIC or SURVEYLOGISTIC or something.

Finally, what's your sample size? 20 levels is a lot.


HTH

Peter
Doc_Duke
Rhodochrosite | Level 12
Proportions different from what?

If you want to see if the proportions are equal to one another at each level, then you are effectively testing that the proportion in group 1 is equal to 0.50 at each level. You can do that with PROC FREQ (using BINOMIAL option on TABLES statement) and using the 20-level variable as a BY variable.
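
A minimal sketch of that suggestion, assuming the dataset `new` and the variable names `group` and `variable` from the original post (with `variable` as the 20-level variable). The sorted dataset name `new_sorted` is hypothetical. BY-group processing needs the data sorted by the BY variable first.

```sas
/* Sort by the 20-level variable so it can serve as a BY variable */
proc sort data=new out=new_sorted;
  by variable;
run;

/* Within each level, test H0: the proportion in the first group equals 0.5 */
proc freq data=new_sorted;
  by variable;
  tables group / binomial(p=0.5);
run;
```

The null value of 0.5 only corresponds to "no difference between groups" when the two groups are the same size. If they are not, the value passed to `P=` would presumably need to be the first group's overall share of the sample instead.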

If one of your levels is "control" and the other 19 are various "treatments," you could test whether each treatment is different from the control level using the CONTRAST statement in GENMOD.

The tests are asymptotic, so you'll need a fairly large sample size.

Doc Muhlbaier
Duke
deleted_user
Not applicable
Thanks so much for your response. Here are a few points that hopefully clarify what I'm trying to get at:

I have two groups that differ in size as well as along my main covariate of interest (n=386 in Group 1 and n=737 in Group 2). I would like to see whether the proportion of subjects (out of the total number of subjects in each Group) is significantly different between groups at each Level of a 20-level variable. None of the Levels of the 20-level variable are control or treatment groups. So essentially I want to test whether:

Proportion at Level 1 (out of the total n=386 in Group 1) = Proportion at Level 1 (out of the total n=737 in Group 2)
...etc...
Proportion at Level 20 (out of the total n=386 in Group 1) = Proportion at Level 20 (out of the total n=737 in Group 2)
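
Each of the 20 hypotheses above can be framed as a 2x2 table: group by "at this level vs. any other level". The following is only a sketch of that framing, not a method proposed in this thread. It assumes the dataset `new`, a group variable `group`, and a numeric variable `level` coded 1 to 20. The macro name `per_level_chisq`, the work dataset `_tmp`, and the indicator `at_level` are hypothetical names.

```sas
%macro per_level_chisq(nlevels=20);
  %do i = 1 %to &nlevels;
    /* Flag subjects at level &i (1) versus all other levels (0) */
    data _tmp;
      set new;
      at_level = (level = &i);
    run;

    title "Level &i versus all other levels";
    /* 2x2 chi-square test: does the share at level &i differ by group? */
    proc freq data=_tmp;
      tables group*at_level / chisq;
    run;
  %end;
  title; /* clear the title */
%mend per_level_chisq;

%per_level_chisq(nlevels=20)
```

This yields one p-value per level. Because it runs 20 separate tests, the usual multiple-comparison concern applies when reading them.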

Would PROC LOGISTIC do the trick (i.e., the code below), by using the Wald chi-square p-value in the output?

proc logistic data=new;
by level;
class level;
model group=level;
run;

Thanks so much for any suggestions you can provide.
NG
plf515
Lapis Lazuli | Level 10
Hi

You can't have both BY LEVEL and LEVEL in your model statement. That would lead to an error, because you would be stratifying on the variable you are trying to model.

You could do the same thing without the BY statement, or you could use PROC FREQ with a BY statement.
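
A sketch of the first option, assuming the dataset `new` and the variables `group` and `level` from the post above, with the BY statement simply dropped:

```sas
/* Model group membership as a function of the 20-level variable */
proc logistic data=new;
  class level;          /* default parameterization is effect coding */
  model group = level;
run;
```

With the default effect coding, each LEVEL parameter is compared against the average over all levels rather than against a single reference level, so the per-parameter Wald tests are not quite "Group 1 vs. Group 2 at this level". The Type 3 test for LEVEL is a single overall test across all 20 levels.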

HTH

Peter
StatDave
SAS Super FREQ
In PROC LOGISTIC you would actually make your 20-level variable the response, your binary group variable the predictor, and fit a generalized logit model:

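proc logistic data=new;
/* binary group variable as a reference-coded predictor */
class group / param=ref;
/* 20-level variable as the response, generalized logit link */
model level = group / link=glogit;
run;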

The tests of the GROUP parameter estimates are tests comparing the groups at each level.
