Greeting!
I am learning the program code by Rick Wicklin and trying to use my own data to draw the funnel plot, which I think would be very useful to monitor the Site performance both by safety and efficacy.
https://blogs.sas.com/content/iml/2020/04/20/funnel-plot-covid-nc.html
I would like to thank Rick one more time for posting so many insightful and useful SAS graphics example, from which I benefit a lot already.
below is the program I changed based on Rick's example, which I ran successfully, but for some reason, my slightly updated code is not working:
Can someone take copy and paste the code in below and run, and let me know where is the issue.
THANK YOU SO MUCH!
/**************************************/
data Mortality;
length Sitename Label $27 ;
input Sitename Cases Deaths Proportion Label;
datalines;
Site1 20 2 0.1 Site1
Site2 12 3 0.25 Site2
Site3 2 . . Site3
Site4 7 1 0.14286 Site4
Site5 4 1 0.25 Site5
Site6 2 . . Site6
Site7 2 1 0.5 Site7
Site8 15 4 0.26667 Site8
Site9 6 4 0.66667 Site9
;
proc means data=Mortality P10 P25 P50 P75 P90 P95 P99 Max ndec=0;
var Cases;
run;
/*******************************/
%let DSName = Mortality;
%let Events = Deaths;
%let Trials = Cases;
/* compute overal rate in ESCAPE NEXT */
proc sql ;
select sum(&Events) / sum(&Trials) format=best6. into :AvgProp
from &DSName;
quit;
%put &=&AvgProp;
proc print data=Mortality ;
where Cases >= 2;
run;
/* Isolate the computation of the the control limits for the funnel plot
so it can be reused in multiple programs.
*/
proc iml;
/* Funnel plot for proportion:
https://blogs.sas.com/content/iml/2011/11/23/funnel-plots-for-proportions.html
Compute binomial limits as a function of the nunber of trials.
Adjust the binomial quantiles according to method in Appendix A.1.1
of Spiegelhalter (2005) p. 1197.
These limits do not depend on the data, although we assume that theta
(often the average proportion from the data) is the probability parameter.
Parameters:
theta (scalar): empirical probability of event = sum(Events) / sum(Trials)
nTrials (column vector): sequence of trials at which to evaluate the binomial limits.
For example, nTrials = T(1:100)
prob (row vector): probabilities at which to compute quantiles.
For example, prob = {0.025 0.975} for 95% (~ 2 StdDev) limits
prob = {0.001 0.025 0.975 0.999} for 2 and 3 StdDev limits
*/
start BinomialFunnel(theta, nTrials, prob);
n = round(colvec(nTrials));
n = n <> 1; /* must be integer > 0 */
p = rowvec(prob);
/* compute columns of matrix, one for each CL */
limits = j(nrow(n), ncol(p), .);
do i = 1 to ncol(p);
r = quantile("binom", p[i], theta, n);
numer = cdf("binom", r , theta, n) - p[i];
denom = cdf("binom", r , theta, n) -
cdf("binom", r-1, theta, n);
alpha = numer / denom;
limits[,i] = (r-alpha)/n;
end;
/* we can't have negative limits */
idx = loc(limits < 0);
if ncol(idx)>0 then limits[idx]=0;
/* return these control limits */
return ( n || limits );
finish;
store module=BinomialFunnel;
QUIT;
/* Compute the funnels for the COVID-19 data */
proc iml;
use Mortality;
read all var "&Events" into Events;
read all var "&Trials" into Trials;
close;
/* compute overall proportion */
theta = sum(Events) / sum(Trials);
/* compute confidence limits for range of sample size */
load module=BinomialFunnel;
/*nTrials = T(do(4,200,10)) // T( do(200, round(max(Trials)+20, 50), 50) );*/
nTrials = T(do(4,200,10)) // T( do(200, round(max(Trials)+20, 50), 50) );
prob = {0.025 0.975};
results = BinomialFunnel(theta, nTrials, prob);
labels = {"N" "L2sd" "U2sd"};
create Limits from results[colname=labels];
append from results;
close;
QUIT;
/* concatenate the data and the limits for the funnel plot */
data FunnelMortality;
set Mortality(keep=SITENAME Cases Deaths Proportion Label)
Limits;
run;
/* create funnel plot with data tips. See
https://blogs.sas.com/content/iml/2011/12/09/creating-tooltips-for-scatter-plots-with-proc-sgplot.ht...
*/
ods graphics / imagemap=ON TIPMAX=5000;
title "ESCAPE NEXT";
title "Fatility Rate: Deaths per Enrolled Subjects";
title2 "Sites with 2 or More Subjects";
footnote J=L "data cut off 15Apr2020";
proc sgplot data=FunnelMortality noautolegend;
where Cases=. OR Cases >= 2;
format Cases comma7.;
band x=N lower=L2sd upper=U2sd /
fill fillattrs=GraphConfidence outline lineattrs=GraphPrediction
transparency=0.5 legendlabel="95% limits" name="band95"
curvelabelloc=outside curvelabelupper="Upper Limit" curvelabellower="Lower Limit";
scatter x=Cases y=Proportion / legendlabel="County Proportion"
datalabel=Label tip=(SITENAME Cases Deaths);
refline &AvgProp / axis=y label="Study Proportion = &AvgProp"
SplitChar="/" name="ref"
lineattrs=GraphData2 label;
xaxis grid ;
yaxis grid Label="Fatality Rate";
run;
ods graphics / imagemap=OFF;
/**********************************************************************************************/
The example in the blog post had some counties for which the Trials was small and others for which Trials was as large as 1000. That is the reason for the convoluted definition for the nTrials vector. When the number of trials for each entity (county, school, hospital,....) are similar, you can use a regularly spaced vector of values beyween the minimum number of trials and the maximum number of trials. For your data, use
nTrials = T(2:20);
since Site 7 has N=2 cases and Site 1 has N=20 cases.
The example in the blog post had some counties for which the Trials was small and others for which Trials was as large as 1000. That is the reason for the convoluted definition for the nTrials vector. When the number of trials for each entity (county, school, hospital,....) are similar, you can use a regularly spaced vector of values beyween the minimum number of trials and the maximum number of trials. For your data, use
nTrials = T(2:20);
since Site 7 has N=2 cases and Site 1 has N=20 cases.
Hi Rick,
Thank you!
The code for study overall mortality works now. Now I am using the code of mortality to tweak and to make plot for Adverse Events and Serious Adverse Events.
I encountered the error again. The code example pasted in below.
Would you please have a look for me again, please!
Thank you!
/*************************/
data all_ae;
length Sitename Label $27 ;
input Sitename Cases AES Proportion Label;
datalines;
Site1 20 43 2.15 Site1
Site2 12 10 0.83333 Site2
Site3 2 3 1.5 Site3
Site4 7 5 0.71429 Site4
Site5 4 4 1 Site5
Site6 2 3 1.5 Site6
Site7 2 1 0.5 Site7
;
proc means data=all_ae P10 P25 P50 P75 P90 P95 P99 Max ndec=0;
var Cases;
run;
/*******************************/
%let DSName = all_ae;
%let Events = AES;
%let Trials = Cases;
proc sql ;
select sum(&Events) / sum(&Trials) format=best6. into :AvgProp
from &DSName;
quit;
%put &=&AvgProp;
proc print data=all_ae ;
where Cases >= 2;
run;
/* Isolate the computation of the the control limits for the funnel plot
so it can be reused in multiple programs.
*/
proc iml;
/* Funnel plot for proportion:
https://blogs.sas.com/content/iml/2011/11/23/funnel-plots-for-proportions.html
Compute binomial limits as a function of the nunber of trials.
Adjust the binomial quantiles according to method in Appendix A.1.1
of Spiegelhalter (2005) p. 1197.
These limits do not depend on the data, although we assume that theta
(often the average proportion from the data) is the probability parameter.
Parameters:
theta (scalar): empirical probability of event = sum(Events) / sum(Trials)
nTrials (column vector): sequence of trials at which to evaluate the binomial limits.
For example, nTrials = T(1:100)
prob (row vector): probabilities at which to compute quantiles.
For example, prob = {0.025 0.975} for 95% (~ 2 StdDev) limits
prob = {0.001 0.025 0.975 0.999} for 2 and 3 StdDev limits
*/
start BinomialFunnel(theta, nTrials, prob);
n = round(colvec(nTrials));
n = n <> 1; /* must be integer > 0 */
p = rowvec(prob);
/* compute columns of matrix, one for each CL */
limits = j(nrow(n), ncol(p), .);
do i = 1 to ncol(p);
r = quantile("binom", p[i], theta, n);
numer = cdf("binom", r , theta, n) - p[i];
denom = cdf("binom", r , theta, n) -
cdf("binom", r-1, theta, n);
alpha = numer / denom;
limits[,i] = (r-alpha)/n;
end;
/* we can't have negative limits */
idx = loc(limits < 0);
if ncol(idx)>0 then limits[idx]=0;
/* return these control limits */
return ( n || limits );
finish;
store module=BinomialFunnel;
QUIT;
/* Compute the funnels for the COVID-19 data */
proc iml;
use all_ae;
read all var "&Events" into Events;
read all var "&Trials" into Trials;
close;
/* compute overall proportion */
theta = sum(Events) / sum(Trials);
/* compute confidence limits for range of sample size */
load module=BinomialFunnel;
*nTrials = T(do(2,20,10)) // T( do(200, round(max(Trials)+20, 50), 50) );
nTrials = T(2:20);
prob = {0.025 0.975};
results = BinomialFunnel(theta, nTrials, prob);
labels = {"N" "L2sd" "U2sd"};
create Limits from results[colname=labels];
append from results;
close;
QUIT;
/* concatenate the data and the limits for the funnel plot */
data FunnelAE;
set all_ae(keep=SITENAME Cases AES Proportion Label)
Limits;
run;
ods graphics / imagemap=ON TIPMAX=5000;
title "ESCAPE NEXT";
title "Adverse Event Rate: Adverse Event per Enrolled Subjects";
footnote J=L "Data Cut off Date:" " &sysdate9..";
proc sgplot data=FunnelAE noautolegend;
where Cases=. OR Cases >= 2;
format Cases comma7.;
band x=N lower=L2sd upper=U2sd /
fill fillattrs=GraphConfidence outline lineattrs=GraphPrediction
transparency=0.5 legendlabel="95% limits" name="band95"
curvelabelloc=outside curvelabelupper="Upper Limit" curvelabellower="Lower Limit";
scatter x=Cases y=Proportion / legendlabel="Site Proportion"
datalabel=Label tip=(SITENAME Cases AES);
refline &AvgProp / axis=y label="Study Proportion = &AvgProp"
SplitChar="/" name="ref"
lineattrs=GraphData2 label;
xaxis grid ;
yaxis grid Label="Adverse Event Rate";
run;
ods graphics / imagemap=OFF;
The funnel plot is for binomial proportions, which are numbers between 0 and 1. In a binomial proportion, the number of events cannot be greater than the number of trials.
Your data has proportions that are greater than 1. For example, your first observation says there are 43 'events' out of 20 'trials'. This data cannot be modeled by a binomial proportion.
You can use a funnel plot for the MEAN number of AEs. The confidence intervals are computed differently, but the main ideas are the same:
https://blogs.sas.com/content/iml/2011/04/15/funnel-plots-an-alternative-to-ranking.html
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how to run multiple linear regression models with and without interactions, presented by SAS user Alex Chaplin.
Find more tutorials on the SAS Users YouTube channel.