CatPaws
Calcite | Level 5

Hi! I am doing a Chi-Square test to determine if the deaths are equal between two populations by age group. However, I get the warning below, which I'm not surprised about because some of the age groups had 0 deaths or <5. How do I proceed?

[Screenshot: CatPaws_1-1646366801325.png]

[Screenshot: CatPaws_0-1646366476870.png]

Title "Chi-Square of Deaths by Age";
PROC FREQ data=AgeDeath;
TABLE age*wave / CHISQ EXPECTED DEVIATION NOROW NOCOL NOPERCENT;
format age agegrp.;
RUN;

ballardw
Super User

First thing, that is not an error. The warning tells whoever requested the analysis that the chi-square test may not be valid for the intended purpose.

How to proceed depends a great deal on why you were running a chi-square with those groups to begin with. For example, what are "wave2" and "wave3", and why are you testing age groups among them?

 

 

One approach is to collapse the age groups so that everything from <1 to 9 falls in one group. That is easily done with a different format definition, if the agegrp. format is defined the way I think it is. You would then have only one cell with a count below 5, though SAS will still issue the warning.
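
As a minimal sketch of the collapsing idea above: the format name agecoll. and every cut point in it are hypothetical, since the thread never shows how agegrp. is defined. It also assumes age is a numeric variable in years.

```sas
/* Hypothetical collapsed format. The cut points are assumptions for
   illustration, not the definition of the original agegrp. format. */
proc format;
   value agecoll
      low -< 10  = '0-9'     /* merges the sparse youngest groups */
      10  -< 20  = '10-19'
      20  - high = '20+';
run;

title "Chi-Square of Deaths by Collapsed Age Group";
proc freq data=AgeDeath;
   table age*wave / chisq expected deviation norow nocol nopercent;
   format age agecoll.;      /* only the grouping changes, not the data */
run;
```

Because only the FORMAT statement changes, the original data set stays untouched and other groupings can be tried by editing the VALUE statement.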

 

Some other definition(s) of the groups might be more appropriate. How/why were the age groups created that way to begin with?

Ksharp
Super User

What is your ERROR message? There is no error in the log.

 

data have;
input row col count;
cards;
1 1 0
1 2 1
2 1 0
2 2 2
3 1 1
3 2 3
4 1 3
4 2 7
5 1 78
5 2 113
;

proc freq data=have;
table row*col/CHISQ EXPECTED DEVIATION NOROW NOCOL NOPERCENT;
weight count/zero;  /* ZERO keeps the cells with a count of 0 in the table */
run;
SteveDenham
Jade | Level 19

In addition to consolidating some categories, you might want to explore the EXACT statement (see the documentation for PROC FREQ for more about this). It may take a long time to compute, as your last age category contains over 10 times as many subjects as the first four categories combined. Additionally, that kind of imbalance is a great recipe for misleading values of the chi-square statistic, because the EXPECTED value for each cell is determined almost entirely by the ratio in that single category.
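
A sketch of what an exact request could look like, reusing the data set and format names from the original post. To sidestep the long run time, this version asks for a Monte Carlo estimate of the exact p-value rather than the full enumeration. The N= and SEED= values are arbitrary choices for illustration.

```sas
title "Exact Chi-Square of Deaths by Age";
proc freq data=AgeDeath;
   table age*wave / chisq expected norow nocol nopercent;
   /* MC estimates the exact p-values by simulation;
      N= (number of samples) and SEED= are arbitrary here */
   exact chisq / mc n=10000 seed=12345;
   format age agegrp.;
run;
```

Dropping the MC, N=, and SEED= options requests the full exact computation, and MAXTIME= can be added after the slash to cap how long SAS spends on it.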

 

SteveDenham

Reeza
Super User

Use Fisher's exact test instead.

https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/procstat/procstat_freq_syntax08.htm#procstat....

 

Title "Chi-Square of Deaths by Age";
PROC FREQ data=AgeDeath;
TABLE age*wave / FISHER CHISQ EXPECTED DEVIATION NOROW NOCOL NOPERCENT;
format age agegrp.;
RUN;


Doc_Duke
Rhodochrosite | Level 12

It may be that a categorical test is not appropriate for these data. PROC NPAR1WAY seems more appropriate for testing the difference in age between waves. Though it will work with grouped ages, it would be even better if you have the individual ages.
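
A minimal sketch of that suggestion, assuming AgeDeath has one row per death, that wave identifies the two groups, and that age is numeric. None of this is confirmed in the thread.

```sas
title "Wilcoxon Rank-Sum Test of Age by Wave";
proc npar1way data=AgeDeath wilcoxon;
   class wave;   /* the two populations being compared */
   var age;      /* analyzed as a numeric value, not as a category */
run;
```

If the data were stored as counts per cell instead (as in the row/col/count example earlier in the thread), a FREQ statement naming the count variable would be needed.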
