Solved: Re: Stratified Newcombe method when zero responder in control arm

Avinash_biostat · Posted 07-09-2024 09:53 AM

Dear,

I am calculating confidence intervals using the stratified Newcombe method in PROC FREQ. However, the control arm has zero responders, and the study has two stratification factors. When I run the code, I get this message:
"NOTE: Newcombe confidence limits for the common risk difference cannot be computed for this table (Treatment by Responder
controlling for startum) due to zero-frequency rows, columns, or cells."

I checked the paper, and it appears the CIs can be calculated even when L2 and U2 are zero. Could you please help me understand why SAS does not show results in this particular case?

ballardw · Posted 07-09-2024 12:20 PM

You should, at a minimum, include the code you are currently running. Better still, include a data step that creates the data, or run the code on a SAS-supplied data set that shows the same behavior.

It would also help to include a link to a paper that describes how to calculate the CI when L2 and U2 are 0.

Mike_N · Posted 07-09-2024 04:00 PM

The formulas for the stratified Newcombe confidence limits are given here: https://go.documentation.sas.com/doc/en/statcdc/14.3/statug/statug_freq_details62.htm

You will see that zero frequency rows/columns/cells will lead to division by zero in the formula, which is why you get that note.
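
For anyone who wants to try this, here is a minimal sketch of the kind of call that runs into that situation. The data set name, variable names (sim, stratum, treatment, responder, count) and the counts are all made up for illustration; this is not the original poster's code or data.

```sas
/* Hypothetical data: two strata, no responders in the control arm. */
data sim;
   input stratum treatment $ responder $ count;
   datalines;
1 Active  Yes  8
1 Active  No  12
1 Control Yes  0
1 Control No  20
2 Active  Yes  5
2 Active  No  15
2 Control Yes  0
2 Control No  20
;
run;

proc freq data=sim order=data;
   weight count / zeros;   /* keep the zero-count cells in the table */
   /* Request stratified Newcombe limits for the common risk difference */
   tables stratum*treatment*responder / commonriskdiff(cl=newcombe);
run;
```

With zero-frequency cells like these, check the log for the note quoted at the top of this thread.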

Avinash_biostat · Posted 07-10-2024 10:27 AM

I want to understand where in the formula the issue arises. L2 and U2 are zero because there are zero responders in stratum 2, but we have values for L1 and U1, so plugging them into the expressions for L and U still yields lower and upper limits. I am not sure whether the result is reliable, though.

You could refer to the paper below (with a SAS macro), which provides stratified Newcombe confidence limits with zero responders in one of the strata. I am not sure how reliable it is, which could be the reason SAS does not show results in such cases (zero responders in one of the strata).

https://www.pharmasug.org/proceedings/2013/SP/PharmaSUG-2013-SP04.pdf

Mike_N · Posted 07-10-2024 10:59 AM

Can you post the code you are running, and the associated log? Beyond giving you a reference to the appropriate documentation, it's hard to speculate on what might be going on.

Avinash_biostat · Posted 07-11-2024 07:15 AM

Please see the attached SAS code. Suppose I have zero responders in stratum 2; then PROC FREQ is unable to provide Newcombe confidence limits and issues this note:

" NOTE: Newcombe confidence limits for the common risk difference cannot be computed for this table (Treatment by Responder
controlling for startum) due to zero-frequency rows, columns, or cells."

However, if I use the macro, I get values for the Newcombe CI.

Please help me understand why this happens. Why does PROC FREQ not give results? Is it because the CI is not reliable even though it is calculable?

Mike_N · Posted 07-11-2024 01:30 PM

I get an error when I run this macro, and it fails to generate Newcombe confidence intervals. In the log, I see a 'divide by zero' warning, and a resulting error in computing the Newcombe CI. In the paper you cite, look at the last term on the last line of page 2. In your simulated data, you have two strata and no responders in the placebo group in either stratum. Therefore, the denominator of that term is zero for the placebo group, which prevents the computation from proceeding.

Avinash_biostat · Posted 07-12-2024 02:59 AM

I apologize; I had made some changes to the macro, which is why it was giving an error. I have now attached the corrected one.

I agree it gives the warning "WARNING: Division by zero, result set to missing value."

Avinash_biostat · Posted 07-12-2024 03:07 AM

Sorry, please see the attached macro with simulation data, and ignore the previous one.

Mike_N · Posted 07-12-2024 10:44 AM

The macro is giving you the wrong answer. This is technical, but the macro doesn't properly handle missing values in the section that uses proc IML. In proc IML, as stated in the warning, division by zero produces a matrix of missing values. However, if you take the sum of a matrix of missing values, you will get zero, which is probably not what you expect. For example, try running the following code:

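proc iml;
	x = {. . , . . };   /* 2x2 matrix in which every element is missing */
	print x;
	y = x[+, ];         /* column sums via the subscript reduction operator */
	print y;
quit;                   /* QUIT, not RUN, ends PROC IML */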

The code stores the column sums of the matrix x in the matrix y. You'll note, however, that y is a matrix of zeros, even though the intuitive result would be a matrix of missing values. (The reasons for this behavior are beyond the scope of this post).

This comes into play in the following lines of the macro:

 newcombe_L2 =wilson_L2[+,];
 newcombe_U2 =wilson_U2[+,];

The wilson_L2 and wilson_U2 matrices contain only missing values, but newcombe_L2 and newcombe_U2 will contain zeros because of the behavior I described above.

Those zeros are not the correct values for the computation, and you get the wrong values for the confidence interval. Proc FREQ handles this situation correctly and gives you a note that the Newcombe confidence interval cannot be computed in this situation.
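
If you wanted the macro to fail safely instead, one option is to propagate the missing values explicitly after the sum. The following is only a sketch under my own assumptions: the names unguarded, guarded, nMiss and idx are hypothetical, wilson_L2 is filled with stand-in missing values, and the macro itself may need more changes than this.

```sas
proc iml;
   /* Stand-in for the macro's wilson_L2: every entry is missing */
   wilson_L2 = {. . , . . };

   /* Unguarded column sums, as in the macro */
   unguarded = wilson_L2[+, ];

   /* Guarded: set a column's sum to missing if any entry in it is missing */
   guarded = wilson_L2[+, ];
   nMiss = countmiss(wilson_L2, "col");   /* number of missing values per column */
   idx = loc(nMiss > 0);                  /* columns that contain a missing value */
   if ncol(idx) > 0 then guarded[idx] = .;

   print unguarded guarded;
quit;
```

With the guard in place, the downstream limits come out missing rather than being computed from spurious zeros, which is in line with PROC FREQ declining to report the interval.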
