☑ This topic is solved.
JanetXu
Fluorite | Level 6

Hi:

I would like to find documentation on how the 'expected sum' and 'StdDev of sum' in the SAS output are calculated. I want the formula; I cannot find it online so far.

 

Background information: I am doing Multiple Imputation for my study. After getting analysis results (Wilcoxon rank sum test on each imputed data set), how should I combine to get a 'whole' p-value?

 

My idea so far: get the mean of the 'sum of scores' over the m imputed data sets, then calculate a 'z' on my own (I need the formula here), then get the p-value from the normal distribution initially, and maybe from the t distribution too.

 

Is there any other formal/standard way to combine the results? It seems to me that PROC MIANALYZE in SAS is not applicable to my case?

I am eagerly waiting for your expertise on this. 

 

Thanks.

 

Xiaoshu


12 REPLIES
StatDave
SAS Super FREQ

The formulas are in the Details section of the NPAR1WAY documentation; see the section titled "Simple Linear Rank Tests for Two-Sample Data". The statistic is the sum of scores minus its expectation, plus (by default) a continuity correction. The necessary values are available in the ODS table named WilcoxonScores: it contains the sums of scores, the expected sums, and the standard deviations. You can use those to compute the statistic and then provide the variables containing the statistic and the standard deviations to PROC MIANALYZE. For example, assuming that there are two treatments (1 and 2) and that treatment 1 either has the smaller number of observations or is first in the data set (if the treatment groups are of equal size), the following extracts the data, computes the statistic (as described in the documentation), and combines the imputations.

ods select none; *suppress printed output from the per-imputation tests;
proc npar1way wilcoxon data=imputed_data;
   by _imputation_;
   class trt; var y;
   ods output wilcoxonscores=wscore(where=(class='1'));
   run;
ods select all;
data wstat; set wscore;
   w=(sumofscores-expectedsum);
   estimate=w + (w<0)*.5 - (w>0)*.5; *apply continuity correction;
   z=estimate/stddevofsum; *recompute the z statistic to check;
   run;
proc mianalyze data=wstat;
   modeleffects Estimate; *combine the statistic across imputations;
   stderr stddevofsum;
   run;
JanetXu
Fluorite | Level 6

Hi StatDave:

Thanks so much for your reply. 

I will try to find the formula. 

 

In this approach, your estimate is the difference. Then, in the output, there is a p-value, Pr > |t|, for the estimate.

 

But our z statistic for the Wilcoxon test should be estimate/stddevofsum, as you wrote; so is that p-value the one for our test?

 

Thanks a lot!!

 

Xiaoshu 

 

 

Season
Lapis Lazuli | Level 10

Hello, Dave. Despite my continuous effort on the very specific issue of pooling Wilcoxon test results in the past month, I found joining the conversation here still fruitful. It suddenly dawned upon me that the methods I mentioned may be too complicated; a small modification of your method may be a good choice. Still, I have some issues regarding your code.

(1) Combine the sum of ranks or the z-statistic? In your code, the variable you pooled via PROC MIANALYZE was the sum of ranks, which may violate the rationale of Rubin's rule of pooling estimands, since Rubin's rule is based upon the asymptotic normal distribution of the pooled estimand. In the Wilcoxon sum-of-rank test, it is the z-statistic rather than the sum of ranks that follows an asymptotic normal distribution. Therefore, we should pool the z-statistics instead.

(2) Potential necessity to specify the EDF= option in PROC MIANALYZE. I wonder if you forgot to specify the EDF= option to override the infinite degrees of freedom that PROC MIANALYZE assumes by default.

So, in conclusion, I think the most convenient way of pooling results of Wilcoxon sum-of-rank tests is as follows: (1) Obtain the z-statistic of each imputed sample; (2) Pool them via PROC MIANALYZE; (3) Obtain the results.

The rationale is as follows: the z-statistic corresponds to the departure from the null hypothesis in each sample, and a z-statistic of 0 stands for not rejecting the null hypothesis. Pooling the Wilcoxon test results therefore translates into a one-sample t-test problem. That is, we have sample values of a certain statistic (in this case the z-statistic) following an asymptotic normal distribution, and we would like to see whether the population mean of the statistic is 0. Pooling the imputed-sample z-statistics is no different from pooling imputed-sample means or standard deviations in multiple imputation, which can easily be done in PROC MIANALYZE.

The biggest challenge in doing so is to ascertain the standard error of each z-statistic, which is required by PROC MIANALYZE. I had no idea how to compute it in the first place given that we only have one z-statistic per sample, so it would be impossible to compute either the sample standard deviation or the sample standard error. But Licht's work enlightened me by pointing out that the z-statistics generated from Wilcoxon sum-of-rank tests essentially follow a standard normal distribution. In pooling the z-statistics, each sample contributes only one z-statistic, so the population standard error of each z-statistic is 1/sqrt(1)=1. That problem was solved! We can simply instruct SAS to add a variable of all 1s and use it as the standard errors.
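In SAS, that might look like the following minimal sketch, reusing the WSCORE data set produced by Dave's PROC NPAR1WAY step above (the continuity correction could be applied to the numerator as in his DATA step):

data zstat; set wscore; *WSCORE comes from Dave's PROC NPAR1WAY step;
   z=(sumofscores-expectedsum)/stddevofsum; *z-statistic of each imputed sample;
   se=1; *the standard error of a standard normal z-statistic is 1;
   run;
proc mianalyze data=zstat;
   modeleffects z; *pool the z-statistics by Rubin's rule;
   stderr se; *the variable of 1s serves as the standard errors;
   run;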

Now we finally discuss the EDF= issue. Admittedly, I have not read any literature introducing the concept of effective degrees of freedom in multiple imputation aside from that pertaining to SAS, and I found the explanation SAS Help provides still not that clear. So I also wonder about the exact definition of EDF and whether we should specify this option here. In my view, it is unnecessary to specify the EDF= option here, given that the EDF= option stands for the degrees of freedom of each and every statistic combined. Given that (1) the z-statistics follow a standard normal distribution, to which the concept of degrees of freedom does not apply, and (2) the distribution is also asymptotically standard normal, perhaps we can deem each z-statistic as having infinite degrees of freedom, which is the default of PROC MIANALYZE. There is therefore no need to correct the effective degrees of freedom to a finite value.

JanetXu
Fluorite | Level 6

 

How to combine results from the Wilcoxon Rank Sum Test for multiply imputed data sets from PROC MI in SAS

Endpoint information:

We have seizure counts collected for every day, and therefore counts will be missing for some days.

  1. We got the average seizure frequency per 28 days for an interval. That is, (seizure count for the interval)/(days with available seizure count during the interval) × 28. For example, baseline period (28 days), DB period (99 days).
  2. Then the endpoint is the percent change from baseline in seizure frequency: (per-28-day seizure frequency during DB − per-28-day seizure frequency during baseline)/(per-28-day seizure frequency during baseline) × 100% (see the sketch just below).
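A hypothetical sketch of that calculation (the data set and variable names here are illustrative, not from our actual study):

data endpoint; set seizure_summary; *one row per subject with summed counts and available days;
   base_freq=base_count/base_days*28; *per-28-day frequency, baseline period;
   db_freq=db_count/db_days*28; *per-28-day frequency, DB period;
   pct_change=(db_freq-base_freq)/base_freq*100; *percent change from baseline;
   run;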

We will impute seizure count for each day if it is missing. So we will have m (say 10) imputed seizure count data sets.

Q1. After imputation, we plan to calculate the endpoint for each imputed data set. Is this correct? Can we stack all 10 data sets and then calculate the endpoint?

Q2. Assume we calculate the endpoint for each imputed data set and then do the Wilcoxon Rank Sum test. We will have 10 p-values and 10 corresponding 'z' values, etc. How should we combine them to get one pooled p-value? How should we make inferences based on the 10 imputed data sets?

Thanks.

Janet

 

 

Thanks a lot for your clear explanation. So, firstly, I know I should not do the analysis on stacked imputed data sets but analyze each one separately. Secondly, I have read some things about Rubin's rule these days, but your summary is so clear that I understand much better. Third, the 'exact' methods are still a 'research' topic at this moment. But for Rubin's rule, from the Wilcoxon Rank Sum Test output, which variable should I put into PROC MIANALYZE: 'z', S, sumofscores, or sumofscores - expectedsum? I have not thought it over yet. Any suggestions? Thanks again

 

 

 

 

 

Hi Season:

 

I read your two replies. They are so informative; both you and Dave have so much knowledge. I benefit from them a lot. Really appreciate it. I saved this discussion.

These days, I have been searching and reading on this issue, that is, pooling results from the Wilcoxon Rank Sum test after getting analysis results from m multiply imputed data sets. So far, it seems there is no consensus solution online.

1) I found an online post, "But you will also not be able to use MIANALYZE to combine the nonparametric test but instead will need to combine the actual Chi-Square test statistics", which referred to a macro from Allison, https://www.sas.upenn.edu/~allison/combchi.sas. This looks like just one of the methods you mentioned. It is for chi-square.

2) Maybe there are some R packages. But I have not identified a specific one yet. 

 3) I basically agree with you on "In your code, the variable you pooled via PROC MIANALYZE was sum-of-rank, which may violate the rationale of Rubin's rule of pooling estimands, since Rubin's rule was based upon asymptotic normal distribution of the pooled estimand. In Wilcoxon sum-of-rank test, it is the z-statistic rather than the sum of ranks that follow an asymptotic normal distribution. Therefore, we should pool the z-statistics instead."

4) I strongly believe that 'z' from the Wilcoxon rank-sum test follows the standard normal well: z ~ normal(0, 1); as you wrote, the sigma is just 1. If we had many imputed data sets (say 100), I have thought of running proc univariate to see whether 'z' follows a standard normal.

 5) I have tried this method: put 'z' and a stderr of '1' into PROC MIANALYZE on my data. Below is where I cannot completely agree with you.

In the output, the estimate of 'z' is, as everyone knows, just the simple arithmetic mean. There is a "t for H0, parameter = Theta0"; under it, the value is fairly close to the estimate of 'z'. There is a p-value of Pr > |t|. So, I sensed that this p-value assumes the average 'z' follows a non-central t distribution with non-centrality parameter Theta0 under H0? If my understanding is correct, then I doubt this p-value is the 'pooled' p-value we want, because what we want is a 'best' z, following a normal distribution. Our pooled p-value should come from the 'best' z via the normal distribution directly. I would think just using the average 'z' to get a p-value from the normal distribution is a reasonable solution.

 6) From #5 above, it goes back to my initial thinking in my question. I am trying to get a 'pooled' statistic (later I thought 'z' can be used directly, the same as your thought): a 'pooled sum of scores' from each data set, a NEW expected sum of scores, a pooled std under H0, etc. The idea is not mature.

 

Again, thanks a lot.

 

 

Season
Lapis Lazuli | Level 10

@JanetXu wrote:

Q1. After imputation, we plan to calculate the endpoint for each imputed data set. Is this correct? Can we stack all 10 data sets and then calculate the endpoint?

Of course you can stack the datasets, but the correct way of dealing with missing data via multiple imputation (MI) is to calculate the statistics separately in each imputed dataset and combine (pool) them in some way.


@JanetXu wrote:

Q2. Assume we calculate the endpoint for each imputed data set and then do the Wilcoxon Rank Sum test. We will have 10 p-values and 10 corresponding 'z' values, etc. How should we combine them to get one pooled p-value? How should we make inferences based on the 10 imputed data sets?

Thanks.

Janet


I had explained the ways of doing so in my previous replies.


@JanetXu wrote:

Third, the 'exact' methods are still a 'research' topic at this moment. But for Rubin's rule, from the Wilcoxon Rank Sum Test output, which variable should I put into PROC MIANALYZE: 'z', S, sumofscores, or sumofscores - expectedsum? I have not thought it over yet. Any suggestions? Thanks again

 


To be exact, it is not only the exact methods of the Wilcoxon sum-of-rank test that are in development, but rather the huge field of pooling point estimates of statistics from each individual MI-imputed dataset.

As I had explained in previous replies, both z-statistics and P-values can be pooled, but in totally different ways. There seems to be no research comparing the validity of results computed with the two methods, but from a time-saving perspective, you can pool the z-statistics as long as the asymptotic normality makes sense. Usually, there is no exact cut-off for the sample size required to deem that the assumption of asymptotic normality holds. A rule of thumb for the cut-off may be 30. That is, if your sample size is larger than 30, then you can resort to pooling the z-statistics rather than the P-values.


@JanetXu wrote:

4) I strongly believe that 'z' from the Wilcoxon rank-sum test follows the standard normal well: z ~ normal(0, 1); as you wrote, the sigma is just 1. If we had many imputed data sets (say 100), I have thought of running proc univariate to see whether 'z' follows a standard normal.

 


Of course you can conduct a normality test to see if the z-statistics of your samples followed a normal distribution. But I don't think it necessary.


@JanetXu wrote:

 5) I have tried this method: put 'z' and a stderr of '1' into PROC MIANALYZE on my data. Below is where I cannot completely agree with you.

In the output, the estimate of 'z' is, as everyone knows, just the simple arithmetic mean. There is a "t for H0, parameter = Theta0"; under it, the value is fairly close to the estimate of 'z'. There is a p-value of Pr > |t|. So, I sensed that this p-value assumes the average 'z' follows a non-central t distribution with non-centrality parameter Theta0 under H0? If my understanding is correct, then I doubt this p-value is the 'pooled' p-value we want, because what we want is a 'best' z, following a normal distribution. Our pooled p-value should come from the 'best' z via the normal distribution directly. I would think just using the average 'z' to get a p-value from the normal distribution is a reasonable solution.

 

 


You have noticed something I noticed when I first delved deep into the field of multiple imputation. It is common in the field of missing data that the distribution of the parameters to be pooled differs from that of the pooled parameter. Consider the case of combining the regression coefficients of logistic regression. All of the regression coefficients to be combined follow an asymptotic normal distribution, yet it is a t-test that ultimately decides whether the pooled regression coefficient of the population is 0, since the pooled regression coefficient follows a t rather than a normal distribution. You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution, regardless of the distribution of the original parameters to be pooled.
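As a concrete illustration, here is a sketch of that workflow with PROC LOGISTIC (the data set and variable names imputed_data, outcome, x1, x2 are hypothetical):

proc logistic data=imputed_data;
   by _imputation_; *fit the model on each imputed dataset;
   model outcome(event='1')=x1 x2;
   ods output parameterestimates=lgsparms; *coefficients and standard errors;
   run;
proc mianalyze parms=lgsparms;
   modeleffects x1 x2; *the pooled coefficients are tested against t distributions;
   run;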

But please note that PROC MIANALYZE is not universal in dealing with missing data, so it is not true that the pooled parameters in the entire field of MI all follow a t distribution. The D2 method I mentioned is an example: the parameters to be pooled follow a Chi-square distribution, yet the pooled parameter follows an F distribution.

You may deem the change in distribution in the course of pooling parameters in MI odd (that is what I thought when I was learning it), but that is the case.


@JanetXu wrote:

 6) From #5 above, it goes back to my initial thinking in my question. I am trying to get a 'pooled' statistic (later I thought 'z' can be used directly, the same as your thought): a 'pooled sum of scores' from each data set, a NEW expected sum of scores, a pooled std under H0, etc. The idea is not mature.


I don't think the sum of ranks (I don't quite understand what the word "score" in the phrase "pooled sum of score" refers to) can be pooled directly, because it does not follow an asymptotic normal distribution. Rather, the z-statistic, a transformed sum of ranks, follows a standard normal distribution given a reasonable sample size.

JanetXu
Fluorite | Level 6

Hi Season:

 

Thank you so much for your thorough reply to all of my questions. You are so knowledgeable. I read through all the replies again.

 

Let me try to do a simple summary of your thoughts, which is more related to my issues, and my thinking till now:

1) "both z-statistics and P-values can be pooled, but with totally different ways."  "from a time-saving perspective, you can pool the z-statistics as long as the asymptotic normality makes sense."  I would prefer to pool 'z' as I believe our 'z' would follow normal. 

2) "that the distribution of the parameters to be pooled differs from that of the pooled parameter." "You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution, regardless of the distribution of the original parameters to be pooled."  For my case, i.e., I am pooing my 'z' and my pooled 'z' follows a non-central t.  So, I am still having question. The p-value from proc mianalyze for my pooled 'z' is what I should report? What I meant is: we have missing data; we did MI; we got m imputed data sets; we got m 'z' from wilcoxon test; we got a pooled mean from proc mianalyze, which is just simple mean of all of my 'z' (but the pooled mean follows a non-central t); I got a p-value from proc mianalysis, which is for how the t is away from the non-central t (that is H0).  Then, in the end, I should report this p-value (from proc mianalyze) as the 'best guess' p-value for our original data?  

 

3) Explanation of my original thinking: I was thinking of getting a 'pooled' statistic on my own from the SAS output window, either from 'sumofscore' or 'expectunderH0' (words from the SAS output window; maybe slightly different from the SAS output data set), and also using Std Dev Under H0, etc. Later I thought I could directly pool 'z'. At this moment, I am still not sure which method is the 'best'.

 

4) About stacking data directly: "Of course you can stack the datasets, but the correct way of dealing with missing data via multiple imputation (MI) is to calculate the statistics separately in each imputed dataset and combine (pool) them in some way." I know the general way is not 'stacking'. But, for our data, what we imputed is the missing seizure count for missing days, and our endpoint is the percent change in seizure frequency (averaged to per 28 days; there is some calculation after the imputation). I read the book Flexible Imputation of Missing Data by van Buuren; it says a few sentences about stacking data directly: "If the scientific interest is solely restricted to the point estimate, then the stacked imputed data can be validly used to obtain a quick unbiased estimate for linear models. Be aware that routine methods for calculating test statistics, confidence intervals, or p-values will provide invalid answers if applied to the stacked imputed data." I have had no time to think it over and no mature idea. I just feel that simple 'stacking' and getting an 'averaged' seizure count for each day from the m imputed data sets seems to make some sense.

 

Again, greatly appreciate your help!! 

Season
Lapis Lazuli | Level 10

@JanetXu wrote:

2) "that the distribution of the parameters to be pooled differs from that of the pooled parameter." "You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution, regardless of the distribution of the original parameters to be pooled."  For my case, i.e., I am pooing my 'z' and my pooled 'z' follows a non-central t. 


No, the last sentence is incorrect. Your z-statistics follow a standard normal distribution, which is not the same as a non-central t distribution.


@JanetXu wrote:

So I still have a question: is the p-value from PROC MIANALYZE for my pooled 'z' what I should report?


Yes, that's correct.


@JanetXu wrote:

What I meant is: we have missing data; we did MI; we got m imputed data sets; we got m 'z' values from the Wilcoxon test; we got a pooled mean from PROC MIANALYZE, which is just the simple mean of all of my 'z' values (but the pooled mean follows a non-central t); I got a p-value from PROC MIANALYZE, which measures how far the t is from the non-central t (that is, H0). Then, in the end, should I report this p-value (from PROC MIANALYZE) as the 'best guess' p-value for our original data?


Your question centers on two topics: (1) point estimation of the central tendency of the populations; (2) hypothesis testing of the populations. Regarding the first topic, I don't think reporting the combined means is a good choice. Please check the normality of your epilepsy scores (via the complete cases). It is often because of violation of normality that the data analyst resorts to the Wilcoxon sum-of-rank test to test intergroup differences. In this case, reporting the means of the two groups is inappropriate. Instead, you should report the medians. That is, you should calculate the medians of each imputed dataset and pool them as the eventual estimate of the central tendency of your populations. By the way, 1) according to the central limit theorem, the means of any sample never follow a distribution other than the normal distribution, so your sentence "the pooled mean follows a non-central t" is incorrect; 2) there is currently no existing guideline as to how to pool the medians of the imputed datasets. Not long ago, I asked another user in the Community about that question. Here's the link to his/her answer: Missing value imputation. You can have a look.

Regarding the second question, as I had said before, you can directly report the result PROC MIANALYZE presented to you.

In short, you should report the pooled medians as the measure of central tendency and the P-value PROC MIANALYZE presents to you as the result of hypothesis testing of the intergroup difference. Note that this P-value is not calculated from anything regarding the median.


@JanetXu wrote:

3) Explanation of my original thinking: I was thinking of getting a 'pooled' statistic on my own from the SAS output window, either from 'sumofscore' or 'expectunderH0' (words from the SAS output window; maybe slightly different from the SAS output data set), and also using Std Dev Under H0, etc. Later I thought I could directly pool 'z'. At this moment, I am still not sure which method is the 'best'.


Pooling the z-statistic is the best among the three statistics you mentioned. As I had stated before, the sum of ranks and the expected sum of ranks under H0 do not follow a normal distribution, which violates Rubin's rules of pooling the results of each imputed dataset.


@JanetXu wrote:

4) About stacking data directly: "Of course you can stack the datasets, but the correct way of dealing with missing data via multiple imputation (MI) is to calculate the statistics separately in each imputed dataset and combine (pool) them in some way." I know the general way is not 'stacking'. But, for our data, what we imputed is the missing seizure count for missing days, and our endpoint is the percent change in seizure frequency (averaged to per 28 days; there is some calculation after the imputation). I read the book Flexible Imputation of Missing Data by van Buuren; it says a few sentences about stacking data directly: "If the scientific interest is solely restricted to the point estimate, then the stacked imputed data can be validly used to obtain a quick unbiased estimate for linear models. Be aware that routine methods for calculating test statistics, confidence intervals, or p-values will provide invalid answers if applied to the stacked imputed data." I have had no time to think it over and no mature idea. I just feel that simple 'stacking' and getting an 'averaged' seizure count for each day from the m imputed data sets seems to make some sense.


There are three concerns regarding these words. The first is about van Buuren's words: he said that stacking the datasets and reporting the point estimate from the stacked dataset is a choice for linear models, but the Wilcoxon sum-of-rank test is a nonparametric method, not a linear model. Secondly, I don't know whether the validity van Buuren stated concerning point estimation carries over to hypothesis testing. Thirdly, you stated that your endpoint was the percent change from baseline in seizure frequency, so it would be better to stick to your endpoint and conduct point estimation as well as hypothesis testing on it throughout the data analysis, rather than deviating from it at the outset, trying a different method, and somehow reaching your goal eventually via a detour, when directly reaching your goal is a viable choice.

JanetXu
Fluorite | Level 6

Hi Season:

Thanks for your thorough replies to all of my questions!

To focus on my goal, my questions 3) and 4) from last time can be put aside, and I basically agree with your thoughts about them.

 

You mentioned the median thing, how to pool the 'median', and gave me a link. Appreciate it! That is another unsolved issue of mine, I know. I am reading those discussions. I may post a question about that after reading, if I still have question(s).

 

Now, only for the p-value. So, in order to get a pooled p-value after imputation, pooling 'z' is the correct way. In PROC MIANALYZE, the estimate of my pooled 'z' is just the simple mean (Rubin's rule is this, right?), which should follow a normal distribution by the central limit theorem, correct? Then my question will be: what quantity follows the non-central t? It seems there is a contradiction.

 

Maybe the answer is: following normal is approximate, following t is precise?

 

I read your second reply:  Erratum: A mistake was made here. The correct sentence is: "You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution when univariate instead of multivariate hypothesis testing is to be done, regardless of the distribution of the original parameters to be pooled".  

 

And for the p-value: what is the p-value from the SAS output for? For the 'some quantity' that follows the non-central t, correct?

Since my estimate of the pooled z follows a normal distribution, why should I report that p-value?

 

Thanks again. 

Season
Lapis Lazuli | Level 10

First of all, thank you for your compliments. It is a pleasure discussing statistical issues with you. What is more, I happened to log in to my SAS account tonight to download appendix materials for a SAS book I was reading. Surprisingly, I found that the view count of our discussions has reached a staggering level of some 2,000, outnumbering virtually any other thread on this Community. It seems the topics we have been discussing are of particular interest to a great number of people, which further motivates me to continue the discussions.

However, I would like to say a big "sorry" for responding to your latest questions so late. I was very, very, very busy last October, when our discussions took place. Now I have finished the projects I had been working on, so I have time to discuss the issues you mentioned in depth with you.


@JanetXu wrote:

Now only for the p-value. So, in order to get a pooled p-value after imputation, pooling 'z' is the correct way.  In the proc mianalysis, the estimate of my pooled 'z' is just the simple mean (Robin's rule is this, right), which should follow normal, by central limit theorem, correct? then my question will be:  'what quantity' follows the non-central t?   It seem there is a contradict.   


I am afraid that you were not yet quite familiar with imputation by the time you posted your replies. First of all, the paragraph contains several small mistakes. It is "Rubin's rule" instead of "Robin's rule" that is applied; this rule was named after Donald Rubin, who made remarkable contributions to the field of multiple imputation. In addition, the procedure handling the pooling process in SAS is called PROC MIANALYZE.

Second, let me explain the pooling process further. Rubin's rule refers to pooling the variables of interest after calculating their values separately on each and every one of the imputed datasets. Let m denote the number of imputed datasets (i.e., the number of times you impute), let Q denote the variable whose true value in the population you want to know but whose estimation is complicated by missing values, and let Q1, Q2, ..., Qm denote the values of that variable you calculate from each imputed dataset. To apply Rubin's rule, Q1, Q2, ..., Qm should follow t distributions (please note that these are not non-central t distributions) or have asymptotic normality. If that condition holds (i.e., is satisfied), then you estimate Q by U, which equals (Q1+Q2+...+Qm)/m. U, in turn, follows a t distribution. Hypothesis testing on Q can be substituted by testing on U, generating P-values, which are what you wanted in the first place. Now that U follows a t distribution, the P-values are calculated by referencing the value of U against a t distribution.
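For concreteness, here is a minimal sketch of the combining arithmetic that PROC MIANALYZE carries out, assuming a hypothetical data set EST_SE with one row per imputed dataset holding Q (the estimate) and SE (its standard error):

proc sql;
   create table rubin as
   select count(q) as m, /* number of imputed datasets */
          mean(q) as u, /* pooled estimate U=(Q1+Q2+...+Qm)/m */
          mean(se**2) as ubar, /* within-imputation variance */
          var(q) as b /* between-imputation variance */
   from est_se;
quit;
data rubin; set rubin;
   t=ubar+(1+1/m)*b; *total variance;
   df=(m-1)*(1+ubar/((1+1/m)*b))**2; *Rubin's degrees of freedom;
   tstat=u/sqrt(t); *referenced against a t distribution with df degrees of freedom;
   p=2*(1-probt(abs(tstat),df)); *two-sided P-value for H0: Q=0;
   run;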

Let us return to your specific question now. The central limit theorem concerns the distribution of means, which has little to do with U. You can of course argue that U is in fact a particular type of mean, so it should obey the central limit theorem as well. That is true, but in both theory and practice, we should reference U against t distributions instead of normal distributions.

By the time you read this line, you can skim through the passages I have written and try to find a place where a non-central t distribution appears. There is nowhere, right? So in the entire framework that a practitioner should master for handling missing data, no variable follows a non-central t distribution.


@JanetXu wrote:

Maybe the answer is: following normal is approximate, following t is precise?


No, not true. When we apply Rubin's rules, we always reference against t distributions.


@JanetXu wrote:

And for the p-value: what is the p-value from the SAS output for? For the 'some quantity' that follows the non-central t, correct?

Since my estimate of the pooled z follows a normal distribution, why should I report that p-value?


The answer to the first question is "not true" again. Please refer to my elaboration on Rubin's rules for details.

As for the second question, the pooled z's (which I denoted as U) follow a t distribution.

JanetXu
Fluorite | Level 6
Hi Season:
Thanks for your recent reply! I have been back on this topic these days. (In fact, I saved the previous discussion into a PDF file.) So I got to this discussion again. But I did not realize there was a NEW reply (June 7, 2024; date not shown yet)! It seemed I had not read it yet. I will read it carefully and then may reply again.
Thanks.

Xiaoshu
Season
Lapis Lazuli | Level 10

@Season wrote:

You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution, regardless of the distribution of the original parameters to be pooled.

Erratum: A mistake was made here. The correct sentence is: "You can safely conclude that all of the pooled parameters in PROC MIANALYZE follow a t distribution when univariate instead of multivariate hypothesis testing is to be done, regardless of the distribution of the original parameters to be pooled". Sorry for the mistake.

Season
Lapis Lazuli | Level 10

Hello, I happened to run into your problem around a month ago. Admittedly, little research has paid attention to this issue. Dave's reply is a solution. There are also two approaches (three methods) for you to choose from:

Approach 1: Cited on page 149 of van Buuren's Flexible Imputation of Missing Data, Second Edition and Table 2 of Combining estimates of interest in prognostic modelling studies after multiple imputation: current p.... The original work was done by Rubin. This method is also called the D2 method, whose nomenclature comes from the statistic it computes. Note that the D2 method is used to combine test statistics following a Chi-square distribution, and calculating the D2 statistic involves taking the square root of each test statistic. That is not directly applicable to the z-statistic, since it may be negative (<0). So I think a potential way of using the D2 method to combine z-statistics of Wilcoxon tests is: (1) square the z-statistic obtained by SAS to change the distribution of the test statistic from normal into Chi-square; (2) use the D2 method to pool the squared z-statistics; (3) obtain P-values of the pooled results.
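A sketch of my reading of that recipe follows (the data set CHISQ_STATS and variable D are hypothetical names for one squared z-statistic per imputation, each with k=1 degree of freedom; please verify the formula against the cited sources):

proc sql;
   create table d2 as
   select count(d) as m, /* number of imputations */
          mean(d) as dbar, /* average Chi-square statistic */
          (1+1/count(d))*var(sqrt(d)) as r /* relative increase in variance */
   from chisq_stats;
quit;
data d2; set d2;
   k=1; *degrees of freedom of each squared z-statistic;
   d2=(dbar/k-(m+1)/(m-1)*r)/(1+r); *pooled D2 statistic;
   v=k**(-3/m)*(m-1)*(1+1/r)**2; *denominator degrees of freedom;
   p=1-probf(d2,k,v); *referenced against an F(k,v) distribution;
   run;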

Approach 2: It should be noted that the z-statistic in fact comes from a normal approximation. The Wilcoxon sum-of-rank test itself yields only P-values, as is the case with the Fisher exact test of contingency tables. The exact Wilcoxon sum-of-rank test can be done in SAS via the EXACT statement (please do not forget to name WILCOXON in the EXACT statement to save computation time!). The second approach focuses on combining the P-values themselves rather than the z-statistics.

Method 1: Reported on page 220 of Donald Rubin's Statistical Analysis with Missing Data, 2nd Edition. Please note that this method applies only to one-sided tests.

Method 2: Reported in Licht, C. (2010), New methods for generating significance levels from multiply-imputed data, PhD thesis, University of Bamberg, Bamberg, Germany. Note that this method was also originally designed for one-sided tests. The author gave a method of tackling two-sided tests with the proposed method: segregate the two-sided test into two one-sided tests. Please also note that the two-sided P-values of the Wilcoxon sum-of-rank test itself are essentially sums of two one-sided tests. Details of exact tests in the Wilcoxon sum-of-rank test can be found in SAS Help.

Please note that exact Wilcoxon sum-of-rank tests are extremely compute-intensive! I am running such a test, with around 600 samples imputed 100 times, on my workstation right now as I type. It takes around 24 hours to finish. In many cases, SAS fails to return an exact test result due to lack of memory.
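For reference, requesting the exact test might look like this (a sketch reusing the data set names from Dave's code above; the MC option asks NPAR1WAY for a Monte Carlo estimate of the exact P-value, which can cut the run time considerably):

proc npar1way wilcoxon data=imputed_data;
   by _imputation_;
   class trt; var y;
   exact wilcoxon / mc; *Monte Carlo estimate of the exact P-value;
   run;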

Good luck!
