******************************************************************************* * This program indicates how to construct a bivariate scatterplot with an * * overlay of the line of identity. * * * * This program also provides an IML module for calculating point and * * interval estimates of the Pearson correlation coefficient and the * * concordance correlation coefficient. * *******************************************************************************; data dice_baseline; input subject cort_auc1 cort_auc2; cards; 61001 5.28170 5.37851 61002 5.58796 5.33628 61003 6.47607 6.59770 61005 6.36019 6.39746 61007 5.81121 5.82528 61008 6.03036 6.21147 61009 5.84549 6.10434 61014 6.80349 6.90689 61015 6.28977 6.27369 61023 5.88446 5.94352 61024 5.79701 6.04876 61026 5.01302 4.72154 61027 6.48824 6.37891 61028 5.30862 5.53405 61033 5.70905 5.77803 61034 5.98545 5.77613 61035 6.19924 6.20880 61036 6.00639 5.98313 61037 6.27793 6.44342 61038 6.57390 6.62936 61039 5.69639 5.43509 61042 5.96588 6.01282 61044 6.04803 6.11529 61046 6.39423 6.03876 62002 7.44584 7.58421 62003 5.90813 5.87230 62004 6.05483 5.94695 62005 5.65735 5.64983 62006 6.44815 6.44280 62007 6.28611 6.45374 62009 6.40863 6.14994 62012 5.62564 5.58142 62013 6.68375 6.58815 62014 5.76951 5.84802 62015 5.94383 5.95489 62016 5.66024 5.73711 62017 4.77492 4.57465 62018 5.60468 5.43495 62019 6.82819 6.86652 62020 5.18986 5.05725 62021 6.48810 6.59655 62022 6.08867 5.76965 62023 5.91400 5.89672 62024 5.58217 5.57651 62026 6.32857 6.48921 62027 7.67703 7.76541 62028 5.92411 5.66689 62029 6.15313 6.16558 62030 5.20392 5.42481 62032 6.43962 6.46171 62033 6.20661 6.28542 63001 6.04767 5.82528 63002 6.46923 6.51728 63003 5.68370 5.79701 63004 5.11719 5.47363 63005 6.10993 6.13541 63006 4.91744 5.06968 63008 5.35972 5.56605 63010 6.73016 6.76285 63011 5.93700 5.94092 63012 6.07716 5.92548 63014 6.58185 6.52781 63015 5.84317 5.85030 63016 5.98144 6.22389 63017 5.77452 5.81662 63018 5.46142 5.84898 63021 5.44920 5.43688 63022 5.96519 6.01302 63026 6.29258 6.42339 63027 6.86608 6.92806 63028 5.47875 5.72634 63030 6.16190 6.16608 63032 5.66707 5.97114 63033 5.80634 5.63640 63034 6.37256 6.24416 63035 5.65755 5.98070 64001 6.07596 6.06010 64002 6.57898 6.54552 64003 6.60733 6.89724 64004 5.69611 5.82963 64005 6.60331 6.62972 64007 5.89963 5.83195 64008 6.19731 6.07044 64009 5.88875 6.03427 64010 5.64912 5.46135 64011 6.99962 7.10324 64012 6.61282 6.68520 64015 6.60477 6.76468 64016 5.35161 5.55307 64017 6.63249 6.77868 64020 5.37717 5.19760 64023 5.58781 5.87044 64025 5.74499 5.77090 64027 5.92655 6.09265 64028 5.01060 5.17112 64030 7.12400 7.17368 64031 5.79909 5.37603 64032 5.75609 5.85174 64033 6.79275 6.78255 64035 6.02198 6.03915 64036 5.43960 5.82229 64037 5.20163 5.12928 65001 6.18838 6.66593 65002 6.13860 6.26224 65003 6.98807 6.97658 65005 5.54628 5.50547 65006 4.47249 4.59256 65007 5.04034 5.07775 65008 5.42025 5.46227 65009 5.26772 5.41463 65011 6.43019 6.38438 65013 6.56323 6.46607 65014 5.06134 4.70619 65016 6.32666 6.31564 65017 5.90235 6.05890 65018 6.05800 5.99467 65020 5.96388 6.01204 65021 5.57324 5.61324 65022 6.21017 6.15262 65025 6.01934 6.00318 65026 5.52082 5.54575 65027 5.89237 5.67469 65028 6.24592 6.37106 65031 6.31524 6.44334 65032 6.19602 6.29576 65033 6.07305 6.07966 65037 5.60960 5.38492 65038 5.39806 5.18466 65040 5.95464 6.20802 66003 5.96880 5.86128 66004 5.89707 5.69116 66005 6.27067 6.35294 66006 5.39744 5.23236 66010 6.64408 6.64990 66011 6.29430 6.32724 66012 5.22507 5.28754 66013 5.15840 5.05580 66014 5.85385 5.54974 66018 6.49611 5.87795 66019 5.43285 5.50871 66021 6.32416 6.31612 66023 5.77253 5.74469 66024 5.89920 5.95774 ; run; proc means data=dice_baseline noprint; var cort_auc1 cort_auc2; output out=dice_baselinemin min=var1 var2; run; proc means data=dice_baseline noprint; var cort_auc1 cort_auc2; output out=dice_baselinemax max=var1 var2; run; data dice_baselineall; set dice_baselinemin dice_baselinemax; drop _type_ _freq_; if _N_=1 then var1=floor(min(var1,var2)); if _N_=1 then var2=floor(min(var1,var2)); if _N_=2 then var1=ceil(max(var1,var2)); if _N_=2 then var2=ceil(max(var1,var2)); run; data dice_baselineall; set dice_baseline dice_baselineall; run; proc gplot uniform data=dice_baselineall; plot cort_auc1*cort_auc2 var1*var2/overlay vaxis=axis1 haxis=axis2 nolegend frame; axis1 label=(a=90 'Cortisol Every Hour') minor=none; axis2 label=('Cortisol Every Two Hours') minor=none; symbol1 value=star color=black interpol=none; symbol2 value=none color=black interpol=join; title "Concordance Correlation Coefficient"; run; proc iml; ******************************************************************************* * Enter the appropriate SAS data set name in the use statement and enter the * * appropriate variable names in the read statements. * *******************************************************************************; use dice_baseline; read all var {cort_auc1} into var1; read all var {cort_auc2} into var2; ******************************************************************************* * The IML module, labeled concorr, starts next. * *******************************************************************************; start concorr; nonmiss=loc(var1#var2^=.); var1=var1[nonmiss]; var2=var2[nonmiss]; free nonmiss; n=nrow(var1); mu1=sum(var1)/n; mu1=round(mu1,0.0001); mu2=sum(var2)/n; mu2=round(mu2,0.0001); sigma11=ssq(var1-mu1)/(n-1); sigma11=round(sigma11,0.0001); sigma22=ssq(var2-mu2)/(n-1); sigma22=round(sigma22,0.0001); sigma12=sum((var1-mu1)#(var2-mu2))/(n-1); sigma12=round(sigma12,0.0001); lshift=(mu1-mu2)/((sigma11#sigma22)##0.25); rho=sigma12/sqrt(sigma11#sigma22); rho=round(rho,0.0001); z=log((1+rho)/(1-rho))/2; se_z=1/sqrt(n-3); t=tinv(0.975,n-3); z_low=z-(se_z#t); z_upp=z+(se_z#t); rho_low=(exp(2#z_low)-1)/(exp(2#z_low)+1); rho_low=round(rho_low,0.0001); rho_upp=(exp(2#z_upp)-1)/(exp(2#z_upp)+1); rho_upp=round(rho_upp,0.0001); crho=(2#sigma12)/((sigma11+sigma22)+((mu1-mu2)##2)); crho=round(crho,0.0001); z=log((1+crho)/(1-crho))/2; if sigma12^=0 then do; t1=((1-(rho##2))#(crho##2))/((1-(crho##2))#(rho##2)); t2=(2#(crho##3)#(1-crho)#(lshift##2))/(rho#((1-(crho##2))##2)); t3=((crho##4)#(lshift##4))/(2#(rho##2)#((1-(crho##2))##2)); se_z=sqrt((t1+t2-t3)/(n-2)); end; else se_z=sqrt(2#sigma11#sigma22)/((sigma11+sigma22+((mu1-mu2)##2))#(n-2)); t=tinv(0.975,n-2); z_low=z-(se_z#t); z_upp=z+(se_z#t); crho_low=(exp(2#z_low)-1)/(exp(2#z_low)+1); crho_low=round(crho_low,0.0001); crho_upp=(exp(2#z_upp)-1)/(exp(2#z_upp)+1); crho_upp=round(crho_upp,0.0001); Results=n//mu1//mu2//sigma11//sigma22//sigma12//rho_low//rho//rho_upp// crho_low//crho//crho_upp; r_name={'SampleSize' 'Mean_1' 'Mean_2' 'Variance_1' 'Variance_2' 'Covariance' 'Corr LowerCL' 'Corr' 'Corr UpperCL' 'ConcCorr LowerCL' 'ConcCorr' 'ConcCorr UpperCL'}; print 'The Estimated Correlation and Concordance Correlation (and 95% Confidence Limits)'; print Results [rowname=r_name]; finish concorr; ******************************************************************************* * The IML module, labeled concorr, is finished. * *******************************************************************************; run concorr;
The posted code looks old. It uses PROC GPLOT, and does not use the built-in descriptive statistics (MEAN, COV) in IML. More importantly, the IML code rounds the computed values to 1e-4, rather than computing with the full-precision estimates.
The following code is a more modern alternative:
title "Concordance Correlation Coefficient";
proc sgplot data=dice_baseline noautolegend;
scatter y=cort_auc1 x=cort_auc2;
lineparm x=0 y=0 slope=1 / clip;
yaxis label='Cortisol Every Hour' min=4 max=8;
xaxis label='Cortisol Every Two Hours' min=4 max=8;
run;
proc iml;
*******************************************************************************
* Enter the appropriate SAS data set name in the use statement and enter the *
* appropriate variable names in the read statements. *
*******************************************************************************;
use dice_baseline;
read all var {cort_auc1} into var1;
read all var {cort_auc2} into var2;
start concor(x1, x2);
nonmiss=loc(x1#x2^=.);
var1=x1[nonmiss];
var2=x2[nonmiss];
free nonmiss;
n=nrow(var1);
mu1=mean(var1);
mu2=mean(var2);
Sigma = cov( var1||var2 );
sigma11=Sigma[1,1];
sigma22=Sigma[2,2];
sigma12=Sigma[1,2];
lshift=(mu1-mu2)/((sigma11#sigma22)##0.25);
rho=sigma12/sqrt(sigma11#sigma22);
z=log((1+rho)/(1-rho))/2;
se_z=1/sqrt(n-3);
t=tinv(0.975,n-3);
z_low=z-(se_z#t);
z_upp=z+(se_z#t);
rho_low=(exp(2#z_low)-1)/(exp(2#z_low)+1);
rho_upp=(exp(2#z_upp)-1)/(exp(2#z_upp)+1);
crho=(2#sigma12)/((sigma11+sigma22)+((mu1-mu2)##2));
z=log((1+crho)/(1-crho))/2;
if sigma12^=0 then do;
t1=((1-(rho##2))#(crho##2))/((1-(crho##2))#(rho##2));
t2=(2#(crho##3)#(1-crho)#(lshift##2))/(rho#((1-(crho##2))##2));
t3=((crho##4)#(lshift##4))/(2#(rho##2)#((1-(crho##2))##2));
se_z=sqrt((t1+t2-t3)/(n-2));
end;
else se_z=sqrt(2#sigma11#sigma22)/((sigma11+sigma22+((mu1-mu2)##2))#(n-2));
t=tinv(0.975,n-2);
z_low=z-(se_z#t);
z_upp=z+(se_z#t);
crho_low=(exp(2#z_low)-1)/(exp(2#z_low)+1);
crho_upp=(exp(2#z_upp)-1)/(exp(2#z_upp)+1);
Results=n//mu1//mu2//sigma11//sigma22//sigma12//rho_low//rho//rho_upp//
crho_low//crho//crho_upp;
r_name={'SampleSize' 'Mean_1' 'Mean_2' 'Variance_1' 'Variance_2' 'Covariance' 'Corr LowerCL'
'Corr' 'Corr UpperCL' 'ConcCorr LowerCL' 'ConcCorr' 'ConcCorr UpperCL'};
print Results [rowname=r_name format=best6. L='Estimated Correlation and Concordance Correlation'];
finish concor;
run concor(var1, var2);
Most of this seems to work, however I'm getting an error message with this part:
proc gplot uniform data=dice_baselineall;
plot BI_Nex_MAP*BI_A_MAP var1*var2/overlay vaxis=axis1 haxis=axis2 nolegend frame;
axis1 label=(a=90 'BI Nex MAP') minor=none;
axis2 label=('BI A line MAP') minor=none;
symbol1 value=star color=black interpol=none;
symbol2 value=none color=black interpol=join;
title "Concordance Correlation Coefficient";
run;
The proc IML seems to work just fine though.
Are you running the code in the SAS University Edition? PROC GPLOT is an old plotting routine that is not supported in SAS UE. You can convert that steo to use PROC SGPLOT.
Look at PROC TIMESERIES and the OUTCORR option that looks at auto correlation and partial correlations.
PROC ARIMA / AUTOREG should also be helpful.
Thanks Reeza for your suggestions. I have had this through a statistician, since I am also a bit rusty on that point, and they suggested this analysis
I used the code outlined here:
https://onlinecourses.science.psu.edu/stat509/node/161/
However, one of my colleagues pointed out that whilst this would be OK if we had one measurement (Blood Pressure) per patient, the confidence intervals suggested by the above code would be incorrect of there was within subject correlation. Which is understandable as we are measuring BP's in different ways on the same patients. Similarly if we have repeated measurements, then bootstrapping with replacement would work.
Going to take a look at Efron's 1994 book on introduction to bootstrapping but is there a way to code for this or adjust the above code to make it work for repeated measures in the same individuals or different ways of measuring the same thing simultaneously in the same patients.
Any suggestions?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.