I am analyzing pre and post surveys. The survey responses are continuous (1 - 5). I want to know whether the average differs between the pre and post results. The sample size is small, so I need a nonparametric test for dependent samples. I want to use the Wilcoxon signed rank test, but the sample sizes are unequal: there are 12 pre-tests and 10 post-tests, and every example I have seen of the Wilcoxon signed rank test has equal sample sizes. Is this possible to do?
Since the responses are anonymous, you have to treat this as unmatched data. I suggest you do the following:
1. Structure your data in long form so that you have a grouping variable (values "Pre" or "Post").
2. Use PROC NPAR1WAY with the WILCOXON option to perform the test that compares the locations for the two sets of scores. (You can also use the ANOVA option, for comparison. You'll probably get a similar conclusion.)
Here is an example so that you can see the structure of the data and the PROC NPAR1WAY statements:
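/* Simulate data: N_Pre=12 and N_Post=10. */
data PrePost;
   call streaminit(123);
   Group = "Pre ";                  /* trailing blank sets the length to 4 so "Post" fits */
   do i = 1 to 12;
      Score = rand("Table", 0.2, 0.4, 0.3, 0.1);        /* simulated Pre scores take values 1-4 */
      output;
   end;
   Group = "Post";
   do i = 1 to 10;
      Score = rand("Table", 0.1, 0.3, 0.3, 0.2, 0.1);   /* simulated Post scores take values 1-5 */
      output;
   end;
run;
proc print data=PrePost; run;
/* Wilcoxon rank-sum test, with a one-way ANOVA for comparison */
proc npar1way data=PrePost WILCOXON ANOVA;
   class Group; /* levels are "Pre" and "Post" */
   var Score;
run;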
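With only 12 and 10 responses and many tied scores, the default large-sample p-value may be a rough approximation. As a minimal sketch, assuming the PrePost data set created by the example above, you could add an EXACT statement to request an exact p-value for the Wilcoxon test:

```sas
/* Sketch: exact p-value for the Wilcoxon rank-sum test.           */
/* Assumes the PrePost data set (Group, Score) from the example.   */
proc npar1way data=PrePost wilcoxon;
   class Group;
   var Score;
   exact wilcoxon;   /* exact test; practical because the samples are small */
run;
```

The exact and asymptotic p-values are reported side by side, so you can see how much the approximation matters for your data.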
Well, a paired analysis uses only the complete cases, so the two participants without a post-test would be dropped. If you run that analysis, you need to report how many participants were dropped and whether that has serious implications for the findings.
https://support.sas.com/documentation/onlinedoc/stat/142/npar1way.pdf
Please explain how you have unequal sample size for the pre- and post- tests. Did two students not take the post tests? If so, then you have only 10 observations, not 12. Can you explain what happened to the missing two tests?
Thanks for your response! Yes, 2 students did not take the post-test. However, because the responses are anonymous, I do not know who skipped the post-test, so I don't know how to decide which observations to drop.
So does this mean you do not have matched data? If you cannot match a subject's first data point to their second, then a paired test is not possible.
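For contrast, here is a rough sketch of what a paired analysis would need. It assumes a hypothetical ID variable that links each subject's two responses, which the anonymous survey does not have; the data set name Paired and the scores are made up for illustration only.

```sas
/* Hypothetical matched data: one row per subject, identified by ID. */
/* The ID variable and all scores are invented for illustration.     */
data Paired;
   input ID Pre Post;
   Diff = Post - Pre;   /* missing when either score is missing */
datalines;
1 2 3
2 3 3
3 1 2
4 4 5
5 2 .
;

/* PROC UNIVARIATE reports the signed rank test of H0: location of Diff = 0. */
/* Subjects with a missing Diff (here ID 5) are excluded from the test.      */
ods select TestsForLocation;
proc univariate data=Paired;
   var Diff;
run;
```

Without something like ID, the Diff column cannot be computed, which is why the signed rank test does not apply here.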
Thanks for responding!
I wouldn't call the data paired, but can the samples be considered independent when they are related? If so, does the Wilcoxon rank-sum test allow independent samples of different sizes?