Calcite | Level 5

## Chi-Square WARNING

Hi! I am doing a Chi-Square test to determine if the deaths are equal between two populations by age group. However, I get the warning below, which I'm not surprised about because some of the age groups had 0 deaths or <5. How do I proceed?

```
Title "Chi-Square of Deaths by Age";
PROC FREQ data=AgeDeath;
TABLE age*wave / CHISQ EXPECTED DEVIATION NOROW NOCOL NOPERCENT;
format age agegrp.;
RUN;
```

Super User

## Re: Chi-Square error

First thing, that is not an error. What that warning is doing is telling the person that requested the analysis that it may not be valid for the intended purpose.

How to proceed depends a great deal on why you were running a chi-square with those groups to begin with. For example, what are "wave2" and "wave3", and why are you testing age groups among them?

One approach might involve collapsing age groups so that all of the <1 to 9 ages are in one group, which is easily done with a different format definition if the agegrp. format is defined the way I think it might be. Then you would have only one cell with a count below 5, though SAS will still issue the warning.
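A minimal sketch of the collapsing idea, assuming `age` is stored as a numeric age in years. The format name `agecoarse.` and every cut point below are hypothetical, because the actual `agegrp.` definition is not shown in this thread:

```
/* Hypothetical coarser grouping: names and cut points are assumptions */
proc format;
  value agecoarse
    low -< 10  = '<1 to 9'      /* all of the sparse young groups together */
    10  -< 20  = '10 to 19'
    20  - high = '20 and over';
run;

/* Same analysis as before; only the format applied to age changes */
proc freq data=AgeDeath;
  table age*wave / chisq expected deviation norow nocol nopercent;
  format age agecoarse.;
run;
```

Because PROC FREQ groups on formatted values, switching the format regroups the table without changing the underlying data set.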

Some other definition(s) of the groups might be more appropriate. How/why were the age groups created that way to begin with?

Super User

## Re: Chi-Square error

What is your ERROR information? There is no error in the LOG.

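```
/* One record per table cell, with its frequency */
data have;
input row col count;
cards;
1 1 0
1 2 1
2 1 0
2 2 2
3 1 1
3 2 3
4 1 3
4 2 7
5 1 78
5 2 113
;

proc freq data=have;
table row*col/CHISQ EXPECTED DEVIATION NOROW NOCOL NOPERCENT;
/* ZERO keeps the cells whose count is 0 in the table */
weight count/zero;
run;
```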

## Re: Chi-Square WARNING

In addition to considering the consolidation of some categories, you might want to explore the EXACT option (see the documentation for PROC FREQ for more about this). That may take a long time to compute, as your last age category contains over 10 times as many subjects as the first four categories. Additionally, that kind of imbalance is a great recipe for misleading values of the chi-square statistic, as the EXPECTED part for each cell is determined almost entirely by the ratio in that single category.
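As a rough sketch of what that could look like, the EXACT statement below requests an exact p-value for the Pearson chi-square, with the MC option asking for a Monte Carlo estimate in case the full exact computation runs too long. The data set and variable names come from the original post; the `N=` and `SEED=` values are arbitrary illustrative choices, not recommendations:

```
proc freq data=AgeDeath;
  table age*wave / chisq expected norow nocol nopercent;
  /* Monte Carlo estimate of the exact Pearson chi-square p-value; */
  /* drop "/ mc n= seed=" to request the full exact computation    */
  exact pchi / mc n=10000 seed=12345;
  format age agegrp.;
run;
```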

SteveDenham

Super User

## Re: Chi-Square WARNING

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/procstat/procstat_freq_syntax08.htm#procstat....

```
Title "Chi-Square of Deaths by Age";
PROC FREQ data=AgeDeath;
TABLE age*wave / FISHER CHISQ EXPECTED DEVIATION NOROW NOCOL NOPERCENT;
format age agegrp.;
RUN;
```

@CatPaws wrote:

Hi! I am doing a Chi-Square test to determine if the deaths are equal between two populations by age group. However, I get the warning below, which I'm not surprised about because some of the age groups had 0 deaths or <5. How do I proceed?

```
Title "Chi-Square of Deaths by Age";
PROC FREQ data=AgeDeath;
TABLE age*wave / CHISQ EXPECTED DEVIATION NOROW NOCOL NOPERCENT;
format age agegrp.;
RUN;
```

Rhodochrosite | Level 12

## Re: Chi-Square WARNING

It may be that a categorical test is not appropriate to these data. PROC NPAR1WAY seems more appropriate for testing the difference between waves on age.  Though it will work with grouped ages, it would be even better if you have discrete ages.
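A minimal sketch of that alternative, assuming `AgeDeath` has one record per death with `age` as a numeric variable and `wave` identifying the population (neither is confirmed in this thread):

```
/* Rank-based comparison of the age distribution between waves */
proc npar1way data=AgeDeath wilcoxon;
  class wave;   /* grouping variable */
  var age;      /* raw age values are analyzed, not the agegrp. groups */
run;
```

With two waves the WILCOXON option reports the Wilcoxon rank-sum test; with more than two it reports the Kruskal-Wallis test.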
