I am not sure where your project stands now or whether you still need this information, but here is my advice on the problem you ran into.
@San123 wrote:
HI Xia,
let me clearify further.
i am creating 2 samples. And I am trying to compare the difference in mean age between these 2
samples. i found out that distribution of age is not nominla( as per the histogram from UNIVARIATE).
So ttest doesnt work. I think Npar1way Wilcoxon will work.
I take it you meant that the distribution of age was not normal, which calls for a complex-survey version of the Wilcoxon rank-sum test. To date, SAS has no built-in procedure for nonparametric tests on complex survey data (there is nothing like a "PROC SURVEYNPAR1WAY"). However, statisticians have extended the Wilcoxon rank-sum test to complex survey data, and some of that work was already available when you raised this question. It is simply obscure: few people know it exists.
Your intuition that the t-test "doesn't work" shows a good understanding of the basic assumption behind that test. However, as the paper "Two-sample rank tests under complex sampling" (Biometrika, Oxford Academic) shows, the complex-survey version of the Wilcoxon rank-sum test can be reduced to a test of domain means (i.e., of the difference in means between the two groups you want to compare) that uses a t-statistic for hypothesis testing. In other words, the complex-survey version of the Wilcoxon rank-sum test reduces to a t-test! Counter-intuitive as that is, the authors prove it in the paper, so you can use it with confidence.
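One detail worth checking against the paper before running anything: as far as I understand it, the means being compared there are means of estimated population rank scores, not of the raw variable. Below is a minimal sketch of one way to build such a score, a weighted mid-rank scaled by the total weight. This is my own assumption about a reasonable construction, not the paper's exact formula. The names aaa (dataset), x (analysis variable), c (sampling weight) and the output names ranked and rankscore are all made up, and the dataset is assumed to contain only the two groups being compared.

/* Sketch only: weighted mid-rank score of x, scaled to the 0-1 range.   */
/* For each distinct value of x: (weight below it + half the weight tied */
/* with it) divided by the total weight of the non-missing observations. */
proc sql;
    create table ranked as
    select t.*, r.rankscore
    from aaa as t
    left join
        (select u.x,
                sum(case when v.x < u.x then v.c
                         when v.x = u.x then 0.5 * v.c
                         else 0 end) / sum(v.c) as rankscore
         from (select distinct x from aaa where x is not missing) as u,
              (select x, c from aaa where x is not missing) as v
         group by u.x) as r
    on t.x = r.x;
quit;

Under that reading, rankscore from the ranked dataset would be the analysis variable in the domain comparison, in place of the raw variable. The cross join makes this quadratic in the number of observations, so for a large file a sort followed by a cumulative sum in a DATA step would be the more practical route.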
Despite the lack of "PROC SURVEYNPAR1WAY" in SAS, you can easily conduct the very specific test via the SURVEYMEANS procedure, like this:
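proc surveymeans data=aaa;
    var x;                /* analysis variable */
    strata a;             /* stratification variable */
    cluster b;            /* primary sampling unit (cluster) variable */
    weight c;             /* sampling weight */
    domain group / diff;  /* compare the domain means between the levels of group */
run;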
The dataset and variable names in the code are made up; replace them with your own. Check the degrees of freedom that SAS uses for this t-test: according to the paper I cited, the correct value is the number of primary sampling units minus the number of strata. The degrees of freedom SAS uses may differ from this if you use a replication method to estimate the variance, so correct them if necessary.
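If you need to make that correction by hand, here is a minimal sketch of the arithmetic. It assumes the same made-up names as above (aaa, strata variable a, cluster variable b) and that a primary sampling unit is identified by its stratum-cluster combination. The macro variable design_df and the t value are hypothetical placeholders; take the actual t statistic from your SURVEYMEANS output.

/* Design degrees of freedom = number of PSUs minus number of strata. */
proc sql noprint;
    select count(*) - count(distinct a)
        into :design_df trimmed
        from (select distinct a, b from aaa);
quit;

/* Recompute the two-sided p-value from the t statistic using that df. */
data _null_;
    t_value = 1.85;       /* hypothetical value; replace with your own t */
    df = &design_df;
    p_value = 2 * (1 - probt(abs(t_value), df));
    put df= p_value=;     /* writes both values to the log */
run;

Comparing the df written to the log with the one SAS reports tells you whether any correction is needed at all.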
By the way, another group of researchers has generalized the Wilcoxon rank-sum test to complex survey data in a different way. Their results are in "Extending the Mann-Whitney-Wilcoxon rank sum test to survey data for comparing mean ranks" (Lin, 2021, Statistics in Medicine, Wiley Online Library). I have not read this paper thoroughly, so I cannot explain it as clearly as I did the previous one.
There is also an even older paper that extends the Wilcoxon rank-sum test to complex survey data for the special case in which the variable being compared is ordinal rather than continuous, so this generalization does not fit your problem. In case somebody else needs it, here is the reference: "Extension of the Wilcoxon Rank Sum Test for Complex Sample Survey Data" (Journal of the Royal Statistical Society Series C: Applied Statistics, Oxford Academic). Implementing this method essentially means fitting a cumulative odds logistic regression model.