Hi,
I’m trying to compare the sex distribution among different data collection centres using the chi-square (chisq) option in SAS 9.2. However, each centre has a different number of observations, which the resulting cross-tabs don’t seem to account for. I’m still a beginner in statistics in SAS, and the only procedure I know of that accounts for unequal sample sizes is proc anova (lines statement), which is not the right test for this comparison. Would anyone be kind enough to share their insight on this? Thank you!
The following code creates an example of my data.
DATA example;
INFILE datalines;
INPUT centre_A centre_B centre_C centre_D centre_E;
DATALINES;
1 1 0 1 1
0 0 0 1 0
0 0 0 1 0
0 1 1 0 0
1 1 1 1 1
1 1 0 1 .
1 0 0 . .
1 0 . . .
1 1 . . .
0 0 . . .
0 1 . . .
;
run;
Accepted Solutions
Assuming that the 0/1 values code sex, and not knowing what the individuals in an observation (line) have in common, I suppose you could run the following chi-square test to see whether the sex ratio varies among centers:
DATA example;
INFILE datalines;
INPUT centre_1 - centre_5;
obs = _n_;
DATALINES;
1 1 0 1 1
0 0 0 1 0
0 0 0 1 0
0 1 1 0 0
1 1 1 1 1
1 1 0 1 .
1 0 0 . .
1 0 . . .
1 1 . . .
0 0 . . .
0 1 . . .
;
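/* Reshape to one row per (obs, centre): _NAME_ holds the centre, COL1 the 0/1 value */
proc transpose data=example out=exList;
var centre_1 - centre_5;
by obs;
run;
/* Cross-tabulate centre by sex and request the chi-square tests */
proc freq data=exList;
table _name_*col1 / chisq;
run;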
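If the data can be entered in long form from the start, the transpose step and the padding with missing values are not needed. The following is only a sketch of that alternative, using the same example values; the dataset name exLong and the variable names centre and sex are made up for illustration.

```sas
/* Hypothetical long layout: one (centre, sex) pair per individual, */
/* so centres with fewer individuals simply contribute fewer rows.  */
data exLong;
    input centre $ sex @@;   /* @@ keeps reading pairs from the same line */
    datalines;
A 1 A 0 A 0 A 0 A 1 A 1 A 1 A 1 A 1 A 0 A 0
B 1 B 0 B 0 B 1 B 1 B 1 B 0 B 0 B 1 B 0 B 1
C 0 C 0 C 0 C 1 C 1 C 0 C 0
D 1 D 1 D 1 D 0 D 1 D 1
E 1 E 0 E 0 E 0 E 1
;
run;

/* Same centre-by-sex cross-tabulation and chi-square tests as above */
proc freq data=exLong;
    tables centre*sex / chisq;
run;
```

The cell counts should match the transposed version, since both layouts describe the same 40 non-missing individuals.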
Thank you very much for your reply; it’s exactly what I needed! I do have some final questions about how to interpret the output:
- I imagine that I have to use the results from the Likelihood Ratio Chi-Square rather than the Chi-Square since I’m interested in the ratios, correct? (Although they were really similar.)
- The default null hypothesis is that there is no significant difference between the ratios, right?
- SAS warns me that 63% of my data is missing. This would make sense because of the unequal sample sizes, but I wonder whether it is a problem as far as the reliability of the results is concerned, and how cautious I should be in interpreting them?
Thanks again for your quick reply. I learned something new which I can apply elsewhere! 🙂
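On the missing-value warning: PROC FREQ leaves missing values out of the table and its statistics by default, so the missing cells here are only the padding introduced by the wide layout, not lost observations. The sketch below, which assumes the exList dataset from the accepted solution, makes that exclusion explicit and also prints expected counts. The exact test is an optional addition for small cell counts and was not part of the original answer.

```sas
proc freq data=exList;
    where col1 is not missing;      /* drop the padding cells explicitly */
    tables _name_*col1 / chisq expected nocol nopercent;
    exact fisher;                   /* exact test, useful when expected counts are small */
run;
```

The Pearson and likelihood ratio chi-square statistics in this output test the same null hypothesis, namely that the sex distribution does not differ among centres, which is why they tend to give similar results.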