Hi,
I'm sure this is an easy question, but I would greatly appreciate the help.
I want to compare the proportion of women through two different phases. Let's say I have the following dataset:
data test;
input wom second_phase wom_2 wtps;
datalines;
1 1 1 3
1 0 . 2
1 1 0 2
0 1 0 2
0 0 . 5
1 1 1 3
1 1 0 4
1 1 . 4
1 0 . 5
0 1 0 1
0 0 . 3
1 1 . 2
1 1 1 2
1 1 0 2
;
run;
Using the weights, I can calculate that the weighted proportion of women before the second phase is 72.5%, whereas the weighted proportion of women among those who qualified for the second phase (second_phase = 1) is 42.1%.
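As a quick check, both figures can be reproduced with PROC MEANS, because the weighted mean of a 0/1 variable is the weighted proportion of 1s. This is only a sketch for verifying the numbers, not a significance test. Note that wom_2 is averaged over its non-missing rows only.

```sas
/* Weighted mean of a 0/1 indicator = weighted proportion of 1s.       */
/* wom uses all 14 rows; wom_2 uses only rows where it is non-missing. */
proc means data=test mean sumwgt;
  var wom wom_2;
  weight wtps;
run;
```

With the data above this should give 0.725 for wom (29 out of a weight total of 40) and about 0.421 for wom_2 (8 out of a weight total of 19).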
What procedure could I use to be able to conclude that the differences between the weighted proportions of women across the two steps are statistically significant?
Thank you in advance!
Example using PROC FREQ
Thank you for your quick response.
The problem is that observations with a missing value in wom_2 are dropped from the PROC FREQ table entirely, so they are excluded for wom as well as for wom_2.
Here is the code I use for the proc freq:
proc freq data = test;
tables wom * wom_2 / chisq;
weight wtps;
run;
And here is the output:
Table of wom by wom_2

wom    Statistic    wom_2 = 0   wom_2 = 1    Total
0      Frequency            3           0        3
       Percent          15.79           0    15.79
       Row Pct            100           0
       Col Pct          27.27           0
1      Frequency            8           8       16
       Percent          42.11       42.11    84.21
       Row Pct             50          50
       Col Pct          72.73         100
Total  Frequency           11           8       19
       Percent          57.89       42.11      100

Frequency Missing = 21
Thank you
Yes, data with missing values in one of the two category variables cannot be used in this analysis.
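For what it's worth, PROC FREQ does have a MISSING option on the TABLES statement that treats missing values as a level of their own instead of dropping them. A minimal sketch follows. Whether it is appropriate here is an assumption to check, because the chi-square test would then include the missing category, which is a different question from comparing the two proportions.

```sas
/* MISSING keeps rows with wom_2 = . in the table as a separate level, */
/* so the weighted total becomes 40 instead of 19.                      */
proc freq data=test;
  tables wom*wom_2 / missing chisq;
  weight wtps;
run;
```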
@Shawn08 wrote:
Are there any other alternatives I could use?
What do you mean? If you don't have the value for that observation, how are you going to account for it?