☑ This topic is solved.
JanetXu
Obsidian | Level 7

Study information:

Two parallel arms: an active group and a placebo group. The planned sample size is 168 = 84*2. Since our primary endpoint is not normally distributed (known from previous similar studies), the specified primary analysis is the Wilcoxon rank-sum test. There is a planned interim analysis (IA), occurring when ~50% of subjects have completed the DB period. The conditional power is planned to be calculated at the IA. The formula specified in the statistical analysis plan is as follows.

 

normal CDF [ Zn / sqrt(t(1-t)) - Z_{α/2} / sqrt(1-t) ].

CDF = Cumulative Distribution Function.

Zn is the z statistic from the Wilcoxon rank-sum test for the comparison of active vs. placebo.

t is the information factor, which is the actual number of the subjects in the study at the time of interim analysis divided by the planned number of subjects for the study (168 = 84*2), i.e., n/168.
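
As a worked illustration of the formula above, here is a minimal Python sketch. The function name `conditional_power`, the two-sided alpha of 0.05, and the interim value Zn = 1.5 are assumptions made up for the example; they are not taken from the study or its statistical analysis plan.

```python
from math import sqrt
from statistics import NormalDist


def conditional_power(z_n: float, t: float, alpha: float = 0.05) -> float:
    """Evaluate normal CDF[ Zn/sqrt(t(1-t)) - Z_{alpha/2}/sqrt(1-t) ].

    z_n   : interim z statistic (here: from the Wilcoxon rank-sum test)
    t     : information fraction, n / planned N, strictly between 0 and 1
    alpha : two-sided significance level (0.05 is an assumption)
    """
    if not 0.0 < t < 1.0:
        raise ValueError("t must be strictly between 0 and 1")
    std_normal = NormalDist()
    # Upper critical value Z_{alpha/2}, about 1.96 when alpha = 0.05.
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    return std_normal.cdf(z_n / sqrt(t * (1.0 - t)) - z_crit / sqrt(1.0 - t))


# Information fraction for an interim look at 78 of 168 planned subjects.
t_interim = 78 / 168
# Hypothetical interim z statistic, for illustration only.
cp = conditional_power(z_n=1.5, t=t_interim)
print(f"t = {t_interim:.3f}, conditional power = {cp:.3f}")
```

One property that is handy for checking an implementation: when Zn equals Z_{α/2} * sqrt(t), the two terms cancel and the formula returns exactly 0.5.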

 

It has now been decided that the IA will be performed a little early, when there are 78 or 80 subjects in total (39 or 40 per arm).

 

I have already searched online, read papers, and used ChatGPT, but I still have questions that I would like help with from the experts here.

1. If I plug the 'z' from the Wilcoxon rank-sum test into the above formula as Zn, as planned, to calculate the conditional power, will that be a valid approach with our sample size at the time of the IA?

2. Is this formula appropriate for our data, especially with such a sample size? If not, is there another formula?

3. My current idea is that, since my sample size is not large enough, I should do a simulation, at least as a back-up for the planned 'plug z in' approach, to see how much the results differ. But I am not sure how to do that. Can anyone please help me with that?
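
One part of question 1 can be checked by simulation without any model for the data: under the null hypothesis, and assuming no ties, the ranks in one arm are a random subset of 1..N, so the null distribution of the Wilcoxon z does not depend on the shape of the data. The Python sketch below estimates how often |z| exceeds the normal critical value at 39 per arm. The function name `null_rejection_rate`, the seed, and the number of replicates are arbitrary choices for illustration, and no continuity correction is applied. This checks only the normal approximation of the interim statistic under the null, not every assumption behind the conditional power formula.

```python
import random
from math import sqrt
from statistics import NormalDist


def null_rejection_rate(n1, n2, reps, rng, alpha=0.05):
    """Estimate how often |z| exceeds the normal critical value under H0."""
    n = n1 + n2
    mean_w = n1 * (n + 1) / 2.0            # E[W] under H0
    sd_w = sqrt(n1 * n2 * (n + 1) / 12.0)  # SD[W] under H0, assuming no ties
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    ranks = range(1, n + 1)
    rejections = 0
    for _ in range(reps):
        # Under H0 with no ties, arm 1's ranks are a random subset of 1..n.
        w = sum(rng.sample(ranks, n1))
        if abs((w - mean_w) / sd_w) > z_crit:
            rejections += 1
    return rejections / reps


rate = null_rejection_rate(39, 39, reps=20000, rng=random.Random(2024))
print(f"Estimated two-sided type I error at n = 39 per arm: {rate:.4f}")
```

An estimate close to the nominal 0.05 suggests the normal approximation to the rank-sum statistic is reasonable at this sample size.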

 

Thanks a lot in advance. 

 

Janet

ACCEPTED SOLUTION
Rick_SAS
SAS Super FREQ

You might find some of the links in this article useful:
The essential guide to bootstrapping in SAS - The DO Loop

For this problem, you might want to consider the "smooth bootstrap," which is a cross between a simulation and a bootstrap. Unlike the pure bootstrap, the resampling process for the smooth bootstrap generates synthetic observations that are NOT replicates of the real data. 
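
To make the idea concrete, here is a minimal Python sketch of a smooth bootstrap: each synthetic value is a resampled observation plus a small Gaussian perturbation, which is equivalent to sampling from a Gaussian kernel density estimate. The function names, the example data, and the use of Silverman's rule-of-thumb bandwidth are assumptions for illustration; this is not code from the linked article.

```python
import random
import statistics


def silverman_bandwidth(data):
    """Silverman's rule of thumb: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(sd, (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-0.2)


def smooth_bootstrap(data, size, rng, bandwidth=None):
    """Draw `size` synthetic values: resample with replacement, then add noise.

    With bandwidth 0 this reduces to the ordinary (non-smooth) bootstrap.
    """
    h = silverman_bandwidth(data) if bandwidth is None else bandwidth
    return [rng.choice(data) + rng.gauss(0.0, h) for _ in range(size)]


# Hypothetical skewed interim values for one arm (made-up numbers).
interim = [0.3, 0.5, 0.8, 1.1, 1.4, 2.0, 2.7, 3.9, 6.5, 11.2]
synthetic = smooth_bootstrap(interim, size=45, rng=random.Random(7))
print(f"bandwidth = {silverman_bandwidth(interim):.3f}")
print([round(v, 2) for v in synthetic[:5]])
```

Note that Gaussian noise can push values outside the natural range of the data (for example, below zero for a strictly positive endpoint), so the kernel or the scale on which the noise is added may need adjusting.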

7 REPLIES
JanetXu
Obsidian | Level 7

Thanks a lot for your reply. I will look into the links. But my question is mainly about the concepts: whether using 'z' is OK or not and, if doing a simulation, what the basic steps are conceptually. Maybe I should post it on some statistics forums as well.

 

Best regards,

 

Janet

Rick_SAS
SAS Super FREQ

The fact that you want to use the Wilcoxon rank-sum test suggests that you do not want to assume a distribution for the data in each group. But you won't be able to simulate the data if you don't have a model for the data-generating process. 

JanetXu
Obsidian | Level 7

Hi Rick:

 

That is true. What we know is that the data are NOT normal, but we do not know what distribution they follow. So I either need to figure out an approximate distribution for the data, or use the bootstrap, kernel density estimation, or something similar. I am in the process of learning those as well.

 

Thanks.

 

Janet

 

 

JanetXu
Obsidian | Level 7

Hi Rick:

Thanks a lot. This is helpful. I will look into it.

 

Best regards,

 

Janet

JanetXu
Obsidian | Level 7

Hi Rick:

 

Thanks again. I know the basics of the smooth bootstrap now. Also, I know that my sample size of n = 39 in each group is basically large enough to use 'z' in my conditional power formula. So the simulation will just be a back-up. I may just use the simple non-parametric bootstrap.
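
A back-up simulation along these lines could look like the Python sketch below. It fills in the not-yet-observed subjects by resampling with replacement from each arm's interim data, recomputes the Wilcoxon z on the completed data set, and reports the fraction of replicates that exceed the critical value. Everything specific here is an assumption for illustration: the names `wilcoxon_z` and `bootstrap_conditional_power`, the made-up log-normal interim data, the choice that larger values favor the active arm, and counting only z > Z_{α/2} to mirror the formula. Resampling from the interim arms corresponds to assuming the current trend continues.

```python
import random
from math import sqrt
from statistics import NormalDist


def wilcoxon_z(x, y):
    """Rank-sum z for arm x, using midranks and a tie-corrected variance."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    values = sorted(x + y)
    midrank = {}
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        # Sorted positions i..j-1 hold ranks i+1..j; their average is the midrank.
        midrank[values[i]] = (i + 1 + j) / 2.0
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    w = sum(midrank[v] for v in x)
    mean_w = n1 * (n + 1) / 2.0
    var_w = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    return 0.0 if var_w == 0 else (w - mean_w) / sqrt(var_w)


def bootstrap_conditional_power(active, placebo, n_per_arm, reps, rng, alpha=0.05):
    """Fraction of completed trials with z > Z_{alpha/2} under the current trend."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    hits = 0
    for _ in range(reps):
        # Keep the observed interim data and resample only the future subjects.
        full_a = active + rng.choices(active, k=n_per_arm - len(active))
        full_p = placebo + rng.choices(placebo, k=n_per_arm - len(placebo))
        if wilcoxon_z(full_a, full_p) > z_crit:
            hits += 1
    return hits / reps


# Made-up skewed interim data, 39 per arm (NOT real study data).
rng = random.Random(11)
active = [rng.lognormvariate(0.4, 1.0) for _ in range(39)]
placebo = [rng.lognormvariate(0.0, 1.0) for _ in range(39)]

z_interim = wilcoxon_z(active, placebo)
cp_boot = bootstrap_conditional_power(active, placebo, n_per_arm=84, reps=2000, rng=rng)
print(f"interim z = {z_interim:.3f}, bootstrap conditional power = {cp_boot:.3f}")
```

Comparing this estimate with the value from the 'plug z in' formula for the same interim z gives a rough sense of how much the two approaches differ.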

 

Janet
