Solved: Re: Wilcoxon rank sum test in SAS, how the expected sum and standard d...

xiaoshuxu · Posted 10-13-2023 12:37 PM

Hi:

I would like to find documentation in SAS on how the 'expected sum' and 'StdDev of sum' in the SAS output are calculated. I want the formulas, but I have not been able to find them online so far.

Background information: I am doing Multiple Imputation for my study. After getting analysis results (Wilcoxon rank sum test on each imputed data set), how should I combine to get a 'whole' p-value?

My idea so far: take the mean of the 'sum of scores' across the m imputed data sets, then calculate a 'z' on my own (I need the formula here), then get the p-value from the normal distribution, or maybe from the t distribution.

Is there any other formal/standard way to combine the results? It seems to me that PROC MIANALYZE in SAS is not applicable to my case.

I am eagerly waiting for your expertise on this.

Thanks.

Xiaoshu

Season · Posted 10-15-2023 09:24 AM

Hello, Dave. Although I have been working on the very specific issue of pooling Wilcoxon test results for the past month, I still found it fruitful to join the conversation here. It suddenly dawned on me that the methods I mentioned may be too complicated, and that a small modification of your method may be a good choice. Still, I have some questions about your code.

(1) Combine the sum-of-rank or the z-statistic? In your code, the variable you pooled via PROC MIANALYZE was the sum-of-rank, which may violate the rationale of Rubin's rule for pooling estimands, since Rubin's rule is based on asymptotic normality of the pooled estimand. In the Wilcoxon sum-of-rank test, it is the z-statistic rather than the sum of ranks that follows an asymptotic normal distribution. Therefore, we should pool the z-statistics instead.

(2) Potential necessity of specifying the EDF= option in PROC MIANALYZE. I wonder whether you forgot to specify the EDF= option to override the infinite degrees of freedom that PROC MIANALYZE uses by default.

So, in conclusion, I think the most convenient way of pooling the results of Wilcoxon sum-of-rank tests is as follows: (1) obtain the z-statistic from each imputed sample; (2) pool them via PROC MIANALYZE; (3) obtain the results.

The rationale is as follows. The z-statistic measures the departure from the null hypothesis in each sample, and a z-statistic of 0 stands for not rejecting the null hypothesis. Pooling the Wilcoxon test results therefore translates into a one-sample t-test problem: we have M sample values of a certain statistic (in this case the z-statistic) following an asymptotic normal distribution, and we would like to see whether the population mean of the statistic is 0. Pooling the z-statistics of the imputed samples is no different from pooling sample means or standard deviations in multiple imputation, which can easily be done in PROC MIANALYZE.

The biggest challenge in doing so is to ascertain the standard error of each z-statistic, which PROC MIANALYZE requires. At first I had no idea how to compute it, given that we have only one z-statistic per sample, so it is impossible to compute either a sample standard deviation or a sample standard error. But Licht's work enlightened me by pointing out that the z-statistics generated from Wilcoxon sum-of-rank tests essentially follow a standard normal distribution. When pooling the z-statistics, each sample yields only one z-statistic, so the population standard errors of the z-statistics are all 1/sqrt(1)=1. That problem is solved! We can simply write code instructing SAS to add a variable (column) of all 1s and use that variable as the standard errors.

Now we finally discuss the EDF= issue. Admittedly, I have not read any literature introducing the concept of effective degrees of freedom in multiple imputation aside from the material pertaining to SAS, and I found the explanation in SAS Help still not that clear. So I also wonder what the exact definition of EDF is and whether we should specify this option here. In my view, it is unnecessary to specify the EDF= option here, given that the EDF= option stands for the degrees of freedom of each and every statistic combined. Since (1) the z-statistics follow a standard normal distribution, to which the concept of degrees of freedom does not apply, and (2) the t distribution is asymptotically standard normal, perhaps we can deem each z-statistic as having infinite degrees of freedom, which is the default in PROC MIANALYZE. There is therefore no need to correct the effective degrees of freedom to a finite value.
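
For readers who want to see the arithmetic behind the three steps, here is a minimal Python sketch (not SAS, and not PROC MIANALYZE itself) of Rubin's rules applied to per-imputation z-statistics with every standard error set to 1. The function names `pool_z` and `t_two_sided_p` and the example z values are made up for illustration. The degrees of freedom use the classical large-sample Rubin formula, which I believe corresponds to the infinite-EDF default discussed above, and the t p-value is obtained by simple numerical integration, so treat the result as approximate.

```python
import math
from statistics import mean, variance


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value of a central t statistic via Simpson's rule."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    area = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3                      # integral of the density from 0 to |t|
    return max(0.0, 1.0 - 2.0 * area)


def pool_z(z_values, se=1.0):
    """Pool per-imputation z statistics with Rubin's rules (needs m >= 2)."""
    m = len(z_values)
    q_bar = mean(z_values)             # pooled estimate: simple mean of the z's
    within = se ** 2                   # within-imputation variance (1 when SE = 1)
    between = variance(z_values)       # between-imputation variance
    total = within + (1 + 1 / m) * between
    t = q_bar / math.sqrt(total)
    if between == 0:                   # no between-imputation spread: normal reference
        return q_bar, t, math.inf, math.erfc(abs(t) / math.sqrt(2))
    r = (1 + 1 / m) * between / within
    df = (m - 1) * (1 + 1 / r) ** 2    # classical Rubin degrees of freedom
    return q_bar, t, df, t_two_sided_p(t, df)


z = [1.5, 2.0, 2.5]                    # hypothetical z statistics from m = 3 imputations
q_bar, t, df, p = pool_z(z)
print(q_bar, t, df, p)
```

Because the total variance is 1 plus a non-negative between-imputation term, the pooled t statistic is never larger in absolute value than the mean of the z-statistics; the two coincide only when all imputations give the same z.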

StatDave · Posted 10-13-2023 04:28 PM

The formulas are in the Details section of the NPAR1WAY documentation. See the section titled "Simple Linear Rank Tests for Two-Sample Data". The statistic is the sum of scores minus its expectation plus (by default) a continuity correction. The necessary values are available in the ODS table named WilcoxonScores, which contains the sums of scores, the expected sums, and the standard deviations. You can use those to compute the statistic and then provide the variables containing the statistic and the standard deviations to PROC MIANALYZE. For example, assuming that there are two treatments (1 and 2) and that treatment 1 either has the smaller number of observations or is the first in the data set (if the treatments have equal numbers of observations), the following extracts the data, computes the statistic (as described in the documentation), and combines the imputations.

ods select none; *suppress printed output while the BY groups run;
proc npar1way wilcoxon data=imputed_data;
   by _imputation_;
   class trt;
   var y;
   *keep only the row for treatment 1 from each imputation;
   ods output wilcoxonscores=wscore(where=(class='1'));
   run;
ods select all;
data wstat;
   set wscore;
   w=(sumofscores-expectedsum); *sum of scores minus its expectation under H0;
   estimate=w + (w<0)*.5 - (w>0)*.5; *apply continuity correction;
   z=estimate/stddevofsum; *recompute the z statistic to check;
   run;
proc mianalyze data=wstat;
   modeleffects Estimate;
   stderr stddevofsum;
   run;
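
Since the original question asked for the formulas themselves, here is a minimal Python sketch (not SAS) of how I understand the expected sum and the standard deviation of the sum to be computed for Wilcoxon (midrank) scores under H0. With scores a_1..a_N, mean score a_bar, and group sizes n1 and n2 (N = n1 + n2), the expected sum for group 1 is n1 * a_bar and the variance is n1*n2/(N*(N-1)) times the sum of (a_j - a_bar)^2; without ties this reduces to n1*(N+1)/2 and n1*n2*(N+1)/12. The function names and the toy data are hypothetical, and the result should be checked against the WilcoxonScores table on real data.

```python
import math


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_scores(group1, group2, correct=True):
    """Sum of scores, expected sum, SD of the sum, and z for group1 under H0."""
    n1, n2 = len(group1), len(group2)
    n = n1 + n2
    scores = midranks(list(group1) + list(group2))
    mean_score = sum(scores) / n       # equals (n + 1) / 2 for rank scores
    s = sum(scores[:n1])               # observed sum of scores in group1
    expected = n1 * mean_score
    var = n1 * n2 / (n * (n - 1)) * sum((a - mean_score) ** 2 for a in scores)
    sd = math.sqrt(var)
    diff = s - expected
    if correct and diff != 0:          # continuity correction: shrink by 0.5 toward 0
        diff -= math.copysign(0.5, diff)
    return s, expected, sd, diff / sd


# Toy data without ties: n1 = 3, n2 = 4
print(wilcoxon_scores([1, 2, 3], [4, 5, 6, 7]))
```

The continuity correction here mirrors the `estimate` line in the DATA step above: 0.5 is subtracted from a positive difference and added to a negative one.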

xiaoshuxu · Posted 10-13-2023 07:03 PM

Hi StatDave:

Thanks so much for your reply.

I will try to find the formula.

With this approach, your estimate is the difference (sum of scores minus expected sum). The output then gives a p-value, Pr > |t|, for the estimate.

But our z statistic for the Wilcoxon test should be estimate/stddevofsum, as you wrote. So is that p-value the one for our test?

Thanks a lot!!

Xiaoshu

xiaoshuxu · Posted 10-15-2023 02:41 PM

How to combine results from Wilcoxon Rank Sum Test for multiple imputed data sets from proc MI in SAS

Endpoint information:

We have seizure counts collected every day, so counts will be missing for some days.

We compute the average seizure frequency per 28 days for an interval, that is, (seizure count for the interval) / (days with available seizure count during the interval) × 28. The intervals are, for example, the baseline period (28 days) and the DB period (99 days).
The endpoint is then the percent change from baseline in seizure frequency: (per-28-day seizure frequency during DB - per-28-day seizure frequency during baseline) / (per-28-day seizure frequency during baseline) × 100%.

We will impute the seizure count for each day on which it is missing, so we will have m (say 10) imputed seizure count data sets.

Q1. After imputation, we plan to calculate the endpoint for each imputed data set. Is this correct? Can we stack all 10 data sets and then calculate the endpoint?

Q2. Assume we calculate the endpoint for each imputed data set and then run the Wilcoxon rank sum test. We will have 10 p-values, 10 corresponding 'z' values, etc. How should we combine them to get one pooled p-value? How should we make inferences based on the 10 imputed data sets?
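
To make the endpoint definition concrete, here is a small Python sketch (not SAS) of the two calculations described above. The function names, the use of `None` for a day without a recorded count, and the toy diaries are all assumptions for illustration only.

```python
def freq_per_28_days(daily_counts):
    """Seizure frequency per 28 days; None marks a day with no recorded count."""
    observed = [c for c in daily_counts if c is not None]
    if not observed:
        raise ValueError("no observed days in the interval")
    # (seizure count for the interval) / (days with available count) * 28
    return sum(observed) / len(observed) * 28


def percent_change(baseline_counts, db_counts):
    """Percent change from baseline in per-28-day seizure frequency."""
    base = freq_per_28_days(baseline_counts)
    if base == 0:
        raise ValueError("baseline frequency is zero; percent change is undefined")
    db = freq_per_28_days(db_counts)
    return (db - base) / base * 100


# Hypothetical subject: 28-day baseline with 14 missing days, 99-day DB period
baseline = [1] * 14 + [None] * 14          # 14 seizures over 14 observed days
db = [1, 0] * 40 + [None] * 19             # 40 seizures over 80 observed days
print(freq_per_28_days(baseline), freq_per_28_days(db), percent_change(baseline, db))
```

After imputation there are no `None` entries left, so the same functions would be applied once per imputed data set.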

Thanks.

Janet

Thanks a lot for your clear explanation. First, I understand that I should not analyze the pooled (stacked) imputed data sets but analyze each one separately. Second, I have read a bit about Rubin's rule these days, and your summary is so clear that I understand it much better. Third, some 'exact' methods are still a research topic at this moment. But for Rubin's rule, which variable from the Wilcoxon rank sum test output should I put into PROC MIANALYZE: 'z', S, sumofscore, or sumofscore - expectofSum? I have not thought it through yet. Any suggestions? Thanks again.

Hi Season:

I read your two replies. They are so informative, and both you and Dave have so much knowledge. I benefited a lot from them and really appreciate it. I saved this discussion.

These days I have been searching and reading about this issue, that is, pooling results from the Wilcoxon rank sum test after analyzing m multiply imputed data sets. So far, there seems to be no consensus solution online.

1) I found an online post saying, "But you will also not be able to use MIANALYZE to combine the nonparametric test but instead will need to combine the actual Chi-Square test statistics", which referred to a macro from Allison, https://www.sas.upenn.edu/~allison/combchi.sas. This looks like one of the methods you mentioned. It is for chi-square statistics.

2) Maybe there are some R packages. But I have not identified a specific one yet.

3) I basically agree with you on "In your code, the variable you pooled via PROC MIANALYZE was sum-of-rank, which may violate the rationale of Rubin's rule of pooling estimands, since Rubin's rule was based upon asymptotic normal distribution of the pooled estimand. In Wilcoxon sum-of-rank test, it is the z-statistic rather than the sum of ranks that follow an asymptotic normal distribution. Therefore, we should pool the z-statistics instead."

4) I strongly believe that 'z' from the Wilcoxon rank-sum test follows the standard normal well: z ~ normal(0, 1), so, as you wrote, the sigma is just 1. If we had many imputed data sets (say 100), I have thought of running PROC UNIVARIATE to see whether 'z' follows a standard normal.

5) I have tried this method on my data, putting 'z' and a stderr of 1 into PROC MIANALYZE. Below is where I cannot completely agree with you.

In the output, the estimate of 'z' is, as everyone knows, just the simple arithmetic mean. There is a "t for H0, parameter = Theta0" column; the value under it is fairly close to the estimate of 'z'. There is also a p-value, P>|t|. So I sense that this p-value assumes the average of 'z' follows a non-central t distribution with non-centrality parameter Theta0 under H0? If my understanding is correct, then I doubt this p-value is the 'pooled' p-value we want, because what we want is a 'best' z that follows the normal. Our pooled p-value should come from the 'best' z and the normal distribution directly. I would think that using the average 'z' to get a p-value from the normal distribution is a reasonable solution.

6) Point 5 above takes me back to my initial thinking in my question. I was trying to get a 'pooled' statistic (later I realized that 'z' can be used directly, the same as your thought): a 'pooled sum of score' across the data sets, a new expected sum of score, a pooled std under H0, and so on. The idea is not mature.

Again, thanks a lot.

Season · Posted 10-21-2023 12:23 AM

@xiaoshuxu wrote:
Q1. After imputation, we plan calculate the endpoint for each imputed data, is this correct? Can we stack all the 10 data sets and then calculate the endpoint?

Of course you can stack the datasets, but the correct way of dealing with missing data via multiple imputation (MI) is to calculate the statistics separately in each imputed dataset and combine (pool) them in some way.

@xiaoshuxu wrote:

Q2. Assume we calculate the endpoint for each imputed data. Then do Wilcoxon Rank Sum test. We will have 10 p-values and 10 corresponding 'z' values, etc. How should we combine them together to get one pooled p-value? How should we make inferences based on the 10 imputed data sets?

Thanks.

Janet

I had explained the ways of doing so in my previous replies.

@xiaoshuxu wrote:

Third, some 'exact' method is a 'research' till this moment. But for Rubin's rule, from Wilcoxon Rank Sum Test output, which variable should I put into proc MIanalysis, 'z', S, or sumofscore, sumofscore - expectofSum, not think over yet, Any suggestion? Thanks again

To be exact, it is not the exact methods for the Wilcoxon sum-of-rank test that are in development, but rather the huge field of pooling point estimates of statistics from the individual MI-imputed datasets.

As I explained in previous replies, both z-statistics and P-values can be pooled, but in totally different ways. There seems to be no research comparing the validity of the results from the two approaches, but from a time-saving perspective you can pool the z-statistics as long as asymptotic normality is plausible. There is usually no exact cut-off for the sample size at which the assumption of asymptotic normality can be deemed to hold. A rule of thumb is 30: if your sample size is larger than 30, you can pool the z-statistics rather than the P-values.

@xiaoshuxu wrote:
4) I strongly believe that 'z' from Wilcoxon rank-sum test follows standard normal well. z ~ normal (0, 1), as you wrote, the sigma is just 1. If we have many imputed data (say 100), I have thought to run proc univariate to see if 'z' follows a standard normal.

Of course you can conduct a normality test to see whether the z-statistics from your samples follow a normal distribution. But I don't think it is necessary.

@xiaoshuxu wrote:
5) I have tried this method, put 'z' and stderr with '1' into PROC MIANALYZE on my data. Below is what I cannot completely agree with you for the above.

In the output, the estimate of the 'z' is, as everyone knows, just simple arithmetis mean. There is a "t for H0, parameter = Theta0"; under it, the value is kind of close to the estimate of 'z'. There is a p-value of P>|t|. So, I sensed, this p-value is assuming the average of 'z' follows a non-central t distribution with non-central parameter of Theta0 under H0? If my understanding is correct, then I doubt this p-value is the 'pooled' p-value we want. Because what we want is a 'best 'z', following normal. Our pooled p-value should from the 'best' z from normal distribution directly. I would think just using the average 'z' to get p-value from normal distribution is a reasonable solution.

You have noticed something I also noticed when I first delved into the field of multiple imputation. It is common in the field of missing data that the distribution of the parameters to be pooled differs from that of the pooled parameter. Consider the case of combining the regression coefficients of a logistic regression. All of the regression coefficients to be combined follow an asymptotic normal distribution, yet it is a t-test that ultimately decides whether the pooled regression coefficient of the population is 0, since the pooled regression coefficient follows a t rather than a normal distribution. You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution, regardless of the distribution of the original parameters to be pooled.

But please note that PROC MIANALYZE is not universal in dealing with missing data. So it is not true that the pooled parameters in the entire field of MI all follow a t distribution. The D2 method I mentioned is an example. The parameters to be pooled follow a Chi-square distribution, yet the pooled parameter follows an F distribution.

You may deem the change in distribution in the course of pooling parameters in MI odd (that is what I thought when I was learning it), but that is the case.

@xiaoshuxu wrote:
6) From #5 above, it goes back to my initial thinking in my question. I am trying to get a ‘pooled’ statistic (later I thought of, 'z' can be used directly, same as your thought.). a 'pooled sum of score' from each data set, a NEW expected sum of score, a pooled std under H0, etc. idea is not mature.

I don't think the sum of ranks (I don't quite understand what the word "score" in the phrase "pooled sum of score" refers to) can be pooled directly, because it does not follow an asymptotic normal distribution. Rather, the z-statistic, a transformed sum of ranks, follows a standard normal distribution given a reasonable sample size.

xiaoshuxu · Posted 10-21-2023 09:49 PM

Hi Season:

Thank you so much for your thorough reply to all of my questions. You are so knowledgeable. I read through all the replies again.

Let me try to do a simple summary of your thoughts, which is more related to my issues, and my thinking till now:

1) "both z-statistics and P-values can be pooled, but with totally different ways." "from a time-saving perspective, you can pool the z-statistics as long as the asymptotic normality makes sense." I would prefer to pool 'z' as I believe our 'z' would follow normal.

2) "that the distribution of the parameters to be pooled differs from that of the pooled parameter." "You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution, regardless of the distribution of the original parameters to be pooled." For my case, i.e., I am pooling my 'z' and my pooled 'z' follows a non-central t. So I still have a question: is the p-value from PROC MIANALYZE for my pooled 'z' what I should report? What I mean is: we have missing data; we did MI; we got m imputed data sets; we got m 'z' values from the Wilcoxon test; we got a pooled mean from PROC MIANALYZE, which is just the simple mean of all my 'z' values (but the pooled mean follows a non-central t); and I got a p-value from PROC MIANALYZE, which measures how far the t is from the non-central t (that is, H0). Then, in the end, should I report this p-value (from PROC MIANALYZE) as the 'best guess' p-value for our original data?

3) Explanation of my original thinking: I was thinking of computing a 'pooled' statistic on my own from the SAS output window, from either 'sumofscore' or 'expectunderH0' (the words from the SAS output window; they may differ a little from the names in the SAS output data set), also using Std Dev Under H0, etc. Later I thought I could pool 'z' directly. At this moment, I am still not sure which method is the 'best'.

4) About stacking data directly: "Of course you can stack the datasets, but the correct way of dealing with missing data via multiple imputation (MI) is to calculate the statistics separately in each imputed dataset and combine (pool) them in some way." I know the general way is not 'stacking'. But for our data, what we imputed is the missing seizure count for missing days. Our endpoint is the percent change in seizure frequency (averaged to per 28 days; there is some calculation after the imputation). I read the book "Flexible Imputation of Missing Data" by van Buuren; it has a few sentences about stacking data directly: "If the scientific interest is solely restricted to the point estimate, then the stacked imputed data can be validly used to obtain a quick unbiased estimate for linear models. Be aware that routine methods for calculating test statistics, confidence intervals, or p-values will provide invalid answers if applied to the stacked imputed data." I have had no time to think it over and have no mature idea. I just feel that simple 'stacking' and getting the 'averaged' seizure count for each day from the m imputed data sets seems to make some sense.

Again, greatly appreciate your help!!

Season · Posted 10-22-2023 12:03 AM

@xiaoshuxu wrote:
2) "that the distribution of the parameters to be pooled differs from that of the pooled parameter." "You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution, regardless of the distribution of the original parameters to be pooled." For my case, i.e., I am pooing my 'z' and my pooled 'z' follows a non-central t.

No, the last sentence is incorrect. Your z-statistics follow a standard normal distribution, which is not the same as a non-central t distribution.

@xiaoshuxu wrote:

So, I am still having question. The p-value from proc mianalyze for my pooled 'z' is what I should report?

Yes, that's correct.

@xiaoshuxu wrote:

What I meant is: we have missing data; we did MI; we got m imputed data sets; we got m 'z' from wilcoxon test; we got a pooled mean from proc mianalyze, which is just simple mean of all of my 'z' (but the pooled mean follows a non-central t); I got a p-value from proc mianalysis, which is for how the t is away from the non-central t (that is H0). Then, in the end, I should report this p-value (from proc mianalyze) as the 'best guess' p-value for our original data?

Your questions center on two topics: (1) point estimation of the central tendency of the populations; (2) hypothesis testing about the populations. Regarding the first topic, I don't think reporting the combined means is a good choice. Please check the normality of your epilepsy scores (via the complete cases). It is often because normality is violated that data analysts resort to the Wilcoxon sum-of-rank test to test intergroup differences. In that case, reporting the means of the two groups is inappropriate. Instead, you should report the medians. That is, you should calculate the medians in each imputed dataset and pool them as the eventual estimate of the central tendency of your populations. By the way, 1) according to the central limit theorem, sample means are asymptotically normally distributed, so your sentence "the pooled mean follows a non-central t" is incorrect; 2) there is currently no guideline on how to pool the medians of the imputed datasets. Not long ago, I asked another user in the Community about that question. Here's the link to his/her answer: Missing value imputation. You can have a look.

Regarding the second question, as I had said before, you can directly report the result PROC MIANALYZE presented to you.

In short, you should report the pooled medians as a measure of central tendency and the P value PROC MIANALYZE presented to you as the result of hypothesis testing of intergroup difference. That is, your P value is not calculated from something regarding the median.

@xiaoshuxu wrote:
3) Explanation about my original thinking: I was thinking to get a ‘pooled’ statistic on my own from SAS output window, either from 'sumofscore" or "expectunderH0' (words from sas output window; maybe a little differences from sas output data set), also use Std Dev Under H0, etc . Later I thought I can directly pool 'z'. Till this moment, I am still not sure which method is the 'best'.

Pooling the z-statistic is the best among the three statistics you mentioned. As I had stated before, the sum-of-ranks and expected sum-of-rank under H0 do not follow a normal distribution, which is what violates Rubin's rules of pooling the results of each imputed dataset.

@xiaoshuxu wrote:
4) About stacking data directly: "Of course you can stack the datasets, but the correct way of dealing with missing data via multiple imputation (MI) is to calculate the statistics separately in each imputed dataset and combine (pool) them in some way." I know the generay way is not 'stacking'. But, for our data, what we imputed is the missing seizure count for missing days. Our endpoint is percent change of seizure frequency (averaged to per 28-day; there is some calculation after the imputation.). I read the book of "flexible imputation of missing data" by Buuren; it said a few sentences about stacking data directly. "If the scientific interest is solely restricted to the point estimate, then the stacked imputed data can be validly used to obtain a quick unbiased estimate for linear models. Be aware that routine methods for calculating test statistics, confidence intervals, or p-values will provide invalid answers if applied to the stacked imputed data." No time to think over and no mature idea. I just feel that, simple 'stacking' and getting 'averaged' seizure count of each day from m imputed data sets seems making some sense.

There are three concerns regarding these words. First, regarding van Buuren's words: he said that stacking the datasets and reporting the point estimate from the stacked dataset is an option for linear models, but the Wilcoxon sum-of-rank test is a nonparametric method rather than a linear model. Second, I don't know whether the validity van Buuren stated for point estimates carries over to hypothesis testing. Third, you stated that your endpoint is the percent change from baseline in seizure frequency, so it would be better to stick to your endpoint and conduct both point estimation and hypothesis testing on it throughout the analysis. As long as reaching your goal directly is viable, there is no need to deviate from it, try a different method, and reach the goal by a detour.

xiaoshuxu · Posted 10-23-2023 06:36 PM

Hi Season:

Thanks for your thorough replies for all of questions!

To focus on my goal, my questions 3) and 4) of last time can be put aside, and I agree with your thoughts basically about them.

You mentioned the median issue, that is, how to pool the medians, and gave me a link. I appreciate it! That is another unsolved issue of mine, I know. I am reading those discussions and may post a question about it afterwards if I still have questions.

Now, only about the p-value. So, to get a pooled p-value after imputation, pooling 'z' is the correct way. In PROC MIANALYZE, the estimate of my pooled 'z' is just the simple mean (this is Rubin's rule, right?), which should follow the normal by the central limit theorem, correct? Then my question is: what quantity follows the non-central t? There seems to be a contradiction.

Maybe the answer is that following the normal is approximate, while following the t is precise?

I read your second reply: Erratum: A mistake was made here. The correct sentence is: "You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution when univariate instead of multivariate hypothesis testing is to be done, regardless of the distribution of the original parameters to be pooled".

And the p-value from the SAS output is for that 'some quantity' that follows the non-central t, correct?

Since my estimate for my pooled z follows normal, why should I report that p-value?

Thanks again.

Season · Posted 10-22-2023 09:13 AM

@Season wrote:

You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution, regardless of the distribution of the original parameters to be pooled.

Erratum: A mistake was made here. The correct sentence is: "You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution when univariate instead of multivariate hypothesis testing is to be done, regardless of the distribution of the original parameters to be pooled". Sorry for the mistake.
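
For reference, the univariate pooling rules being discussed can be written out as follows. This is a sketch of Rubin's rules as I understand them, using m for the number of imputations, \hat{Q}_i and W_i for the estimate and its squared standard error in imputation i, and \theta_0 for the null value (the "Theta0" in the PROC MIANALYZE output); the symbols are my own notation, not taken from the thread.

```latex
\bar{Q} = \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
\bar{W} = \frac{1}{m}\sum_{i=1}^{m} W_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^2, \qquad
T = \bar{W} + \Bigl(1+\frac{1}{m}\Bigr)B

\frac{\bar{Q}-\theta_0}{\sqrt{T}} \;\overset{\text{approx.}}{\sim}\; t_{\nu}
\quad\text{under } H_0, \qquad
\nu = (m-1)\left[1+\frac{\bar{W}}{(1+1/m)\,B}\right]^{2}
```

So the quantity referred to the t distribution is the standardized pooled estimate, and the reference distribution under H0 is an ordinary (central) t with \nu degrees of freedom. When the z-statistics are pooled with standard errors of 1, \bar{W} = 1, so T = 1 + (1 + 1/m)B.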

Season · Posted 10-15-2023 01:46 AM

Hello, I happened to run into your problem around a month ago. Admittedly, little research has paid attention to this issue. Dave's reply is one solution. There are also two other approaches (three methods) for you to choose from:

Approach 1: Cited on page 149 of van Buuren's Flexible Imputation of Missing Data, Second Edition, and in Table 2 of Combining estimates of interest in prognostic modelling studies after multiple imputation: current p.... The original work was done by Rubin. This method is also called the D2 method, named after the statistic it computes. Note that the D2 method is used to combine test statistics that follow a chi-square distribution, and calculating the D2 statistic involves taking the square root of each test statistic. That is not applicable to a z-statistic, which may well be negative (<0). So I think a potential way of using the D2 method to combine the z-statistics of Wilcoxon tests is: (1) square the z-statistic obtained from SAS to change the distribution of the test statistic from normal to chi-square; (2) use the D2 method to pool the squared z-statistics; (3) obtain the P-value of the pooled result.

Approach 2: It should be noted that the z-statistic in fact comes from a normal approximation. The exact Wilcoxon sum-of-rank test itself yields only P-values, as is the case with the Fisher exact test for contingency tables. The exact Wilcoxon sum-of-rank test can be done in SAS with the EXACT statement (please do not forget to append the WILCOXON keyword to the EXACT statement to save computation time!). The second approach focuses on combining the P-values themselves rather than the z-statistics.

Method 1: Reported on page 220 of Donald Rubin's Statistical analysis with missing data, 2nd Edition. Please note that this method applies only to one-sided tests.

Method 2: Reported in Licht, C. (2010). New methods for generating significance levels from multiply-imputed data. PhD thesis, University of Bamberg, Bamberg, Germany. Note that this method was also originally designed for one-sided tests. The author gave a way of handling two-sided tests with the proposed method: split the two-sided test into two one-sided tests. Please also note that the two-sided P-values of the Wilcoxon sum-of-rank test are themselves essentially sums of two one-sided P-values. Details of the exact Wilcoxon sum-of-rank test can be found in SAS Help.

Please note that exact Wilcoxon sum-of-rank tests are extremely computer-intensive! I am running such a test with around 600 samples that were imputed 100 times on my workstation as I type. It takes around 24 hours to finish. In many cases, SAS failed to return an exact test result because of insufficient memory.
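
The D2 pooling suggested under Approach 1 can be sketched in Python (not SAS) as below. The formula is the one I recall from the references cited there: with m chi-square statistics d_1..d_m on k degrees of freedom, r is (1 + 1/m) times the sample variance of the square roots, D2 = (mean(d)/k - (m+1)/(m-1) * r) / (1 + r), and D2 is referred to an F distribution with k and k^(-3/m) * (m-1) * (1 + 1/r)^2 degrees of freedom. Please verify it against the sources before relying on it. The function name `d2_pool` and the z values are hypothetical, and the sketch stops at the statistic and its degrees of freedom; the P-value would come from the upper tail of that F distribution.

```python
import math
from statistics import mean, variance


def d2_pool(chi2_values, k=1):
    """Pool m chi-square statistics (k df each) into D2 and its F degrees of freedom."""
    m = len(chi2_values)
    roots = [math.sqrt(d) for d in chi2_values]
    r = (1 + 1 / m) * variance(roots)          # relative increase in variance
    d2 = (mean(chi2_values) / k - (m + 1) / (m - 1) * r) / (1 + r)
    if r == 0:                                 # identical statistics in every imputation
        return d2, k, math.inf
    df2 = k ** (-3 / m) * (m - 1) * (1 + 1 / r) ** 2
    return d2, k, df2


z = [1.5, 2.0, 2.5]                            # hypothetical Wilcoxon z statistics
chi2 = [v * v for v in z]                      # step (1): square z to get chi-square(1)
d2, df1, df2 = d2_pool(chi2, k=1)              # step (2): pool with D2
print(d2, df1, df2)
```

Note that squaring discards the sign of each z-statistic, so this route only makes sense for a two-sided test.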

Good luck!

Wilcoxon rank sum test in SAS, how the expected sum and standard deviation of sum are calculated.
