JanetSultana
Fluorite | Level 6

Hi,

 

I’m trying to compare sex distributions among different data collection centres using the chi-squared (chisq) option in SAS (9.2), but each centre has a different number of observations, which the resulting cross-tabs can’t account for. I’m still a beginner at statistics in SAS, and the only procedure I know of that allows for unequal sample sizes is proc anova (lines statement), which is not the right test for this comparison. Would anyone be kind enough to share their insight on this? Thank you!

 

The following code creates an example of my data.

 

DATA example;
   INFILE datalines;
   INPUT centre_A centre_B centre_C centre_D centre_E;
   DATALINES;
1 1 0 1 1
0 0 0 1 0
0 0 0 1 0
0 1 1 0 0
1 1 1 1 1
1 1 0 1 .
1 0 0 . .
1 0 . . .
1 1 . . .
0 0 . . .
0 1 . . .
;
run;


Accepted Solution
PGStats
Opal | Level 21

Assuming that 0-1 values are about sex and not knowing what individuals in an observation (line) have in common, I would suppose you could run the following Chi-square test to see if the sex ratio varies among centers:

 

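/* Read one column per centre; obs keeps the input row number */
DATA example;
INFILE datalines;
INPUT centre_1 - centre_5;
obs = _n_;
DATALINES;
1 1 0 1 1
0 0 0 1 0
0 0 0 1 0
0 1 1 0 0
1 1 1 1 1
1 1 0 1 .
1 0 0 . .
1 0 . . .
1 1 . . .
0 0 . . .
0 1 . . .
;

/* Reshape wide to long: one row per (obs, centre), with the centre */
/* name in _name_ and the 0/1 value in col1 */
proc transpose data=example out=exList;
var centre_1 - centre_5;
by obs;
run;

/* Cross-tabulate centre by value and request the chi-square tests */
proc freq data=exList;
table _name_*col1 / chisq;
run;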
PG
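
The reshaping step above can also be sketched with a DATA step array, which makes it explicit that missing values are dropped and that each centre contributes only the observations it actually has. The chi-square test does not require equal group sizes, because the expected counts are computed from the row and column totals. The names exLong, centre and sex are illustrative assumptions, not from the original post.

```sas
/* Same example data as in the reply above */
data example;
   input centre_1 - centre_5;
   datalines;
1 1 0 1 1
0 0 0 1 0
0 0 0 1 0
0 1 1 0 0
1 1 1 1 1
1 1 0 1 .
1 0 0 . .
1 0 . . .
1 1 . . .
0 0 . . .
0 1 . . .
;

/* Wide to long: one output row per non-missing value.     */
/* exLong, centre and sex are hypothetical names.          */
data exLong (keep=centre sex);
   set example;
   array c{5} centre_1 - centre_5;
   do i = 1 to 5;
      if not missing(c{i}) then do;
         centre = i;      /* centre index, 1 to 5 */
         sex    = c{i};   /* 0/1 code, assumed to be sex */
         output;
      end;
   end;
run;

/* Centre-by-sex table with the chi-square tests */
proc freq data=exLong;
   tables centre*sex / chisq;
run;
```

This should produce the same table counts as the transpose version, since PROC FREQ leaves missing values out of the table by default.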

JanetSultana
Fluorite | Level 6

Thank you very much for your reply, it's exactly what I needed! I do have some final questions about how to interpret the output:

  1. I imagine that I have to use the results from the Likelihood Ratio Chi-Square rather than the Chi-Square since I’m interested in the ratios, correct? (Although they were really similar.)
  2. The default null hypothesis is that there is no significant difference between the ratios, right?
  3. SAS warns me that 63% of my data is missing. This would make sense because of the unequal sample sizes. I wonder whether this is a problem as far as the reliability of the results is concerned, and how cautious I should be in interpreting the results?

Thanks again for your quick reply, I learned something new which I can apply elsewhere !! 🙂  
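
On the three questions above: both statistics printed by the chisq option test the same null hypothesis, namely that the sex distribution is the same in every centre. "Likelihood ratio" describes how that statistic is constructed, not what is being compared. PROC FREQ leaves missing values out of the table by default and reports their count as "Frequency Missing". Separately, when many cells have expected counts below 5, PROC FREQ warns that chi-square may not be a valid test. The following is a minimal sketch, assuming the same example data as the accepted solution, that prints the expected counts and adds Fisher's exact test as a cross-check. The choice of the expected, fisher, nocol and nopercent options is an assumption made for illustration, not something suggested in the thread.

```sas
/* Same example data and reshaping as in the accepted solution */
data example;
   input centre_1 - centre_5;
   obs = _n_;
   datalines;
1 1 0 1 1
0 0 0 1 0
0 0 0 1 0
0 1 1 0 0
1 1 1 1 1
1 1 0 1 .
1 0 0 . .
1 0 . . .
1 1 . . .
0 0 . . .
0 1 . . .
;

proc transpose data=example out=exList;
   by obs;
   var centre_1 - centre_5;
run;

proc freq data=exList;
   /* chisq    : Pearson and likelihood-ratio chi-square tests        */
   /* expected : print the expected count in each cell                */
   /* fisher   : Fisher's exact test, also for tables larger than 2x2 */
   /* nocol, nopercent : trim the cell display to counts and row %    */
   tables _name_*col1 / chisq expected fisher nocol nopercent;
run;
```

If the expected counts are small in many cells, the exact test is usually the safer result to report. If they are comfortably large, the Pearson and likelihood-ratio statistics will normally agree closely, as noted in question 1.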
