Re: Concordance Correlation Coefficient

Luke01 · Posted 12-07-2019 12:56 AM

Hi community,

I'm having trouble calculating the concordance correlation coefficient. How can I do this?

I am aware of: https://newonlinecourses.science.psu.edu/stat509/node/161/

but it seems to use old procedures that are not relevant for more recent SAS versions.

Thanks in advance.

Ksharp · Posted 12-07-2019 05:44 AM

Did you try the four kinds of correlation coefficient in PROC CORR?

proc corr data=sashelp.class kendall;
var weight height;
run;

Luke01 · Posted 12-07-2019 07:27 AM

I am after Lin's concordance correlation coefficient, which is not the same as Pearson's correlation etc.

I.e., agreement with the line y=x.

PaigeMiller · Posted 12-07-2019 06:50 AM

@Luke01 look at the link I gave you in your other thread ... it was meant as a reference for you to refer back to as needed, not as a one time thing.

--
Paige Miller

Luke01 · Posted 12-07-2019 07:22 AM

Yes, it refers to Kendall's; I am after Lin's.

FreelanceReinh · Posted 12-07-2019 08:16 AM

Hi @Luke01,

You can get all the terms used in the formula from PROC CORR: Just use the COV and OUTP= options. The rest is a simple calculation, e.g., in a DATA step.

Example (assuming a dataset HAVE with numeric variables X and Y with only non-missing values):

proc corr data=have cov outp=stats noprint;
var x y;
run;

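data want(keep=rc);
/* Read all rows of STATS in one pass and pick out the needed statistics. */
do until(last);
  set stats end=last;
  /* A comparison evaluates to 1 or 0, so each sum statement adds only the value from the matching row. */
  sxx+(_type_='COV')*(upcase(_name_)='X')*x;  /* variance of X */
  syy+(_type_='COV')*(upcase(_name_)='Y')*y;  /* variance of Y */
  sxy+(_type_='COV')*(upcase(_name_)='X')*y;  /* covariance of X and Y */
  mx +(_type_='MEAN')*x;                      /* mean of X */
  my +(_type_='MEAN')*y;                      /* mean of Y */
  n  +(_type_='N')*x;                         /* number of observations (not used below) */
end;
rc=2*sxy/(sxx+syy+(mx-my)**2);  /* concordance correlation coefficient */
run;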

RC is the concordance correlation coefficient as per the link you provided.
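
For reference, the last assignment in the DATA step evaluates the expression below. The symbols are simply shorthand for the DATA step variables (sxx, syy, sxy, mx, my) and are not copied from the linked page:

```latex
r_c = \frac{2\,S_{XY}}{S_{XX} + S_{YY} + \left(\bar{X} - \bar{Y}\right)^{2}}
```

One point worth checking against your reference formula: by default PROC CORR computes variances and covariances with the divisor n-1 (VARDEF=DF). If the reference defines them with the divisor n, the result will differ slightly in small samples, because the squared difference of the means is not rescaled along with the other terms.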

Luke01 · Posted 12-07-2019 05:50 PM

Sorry for my ignorance, I'm new to SAS and coding.

But what do sxx, my, etc. refer to?

Do I need to sub my x and y variables in?

Same for _name_

I ask because I am getting 0 for rc, which isn't right. Below is an example of what I did.

proc corr data=HAVE cov outp=stats noprint;
var x_variable y_variable;
run;

data want(keep=rc);
do until(last);
  set stats end=last;
  sxx+(_type_='COV')*(upcase(_name_)='X')*x_variable;
  syy+(_type_='COV')*(upcase(_name_)='Y')*y_variable;
  sxy+(_type_='COV')*(upcase(_name_)='X')*y_variable;
  mx +(_type_='MEAN')*x_variable;
  my +(_type_='MEAN')*y_variable;
  n  +(_type_='N')*x_variable;
end;
rc=2*sxy/(sxx+syy+(mx-my)*2);
run;

thank you

FreelanceReinh · Posted 12-07-2019 07:06 PM

@Luke01 wrote:

But what do sxx, my, etc. refer to?

These are just arbitrary variable names. I used names reflecting the terms in the formula (e.g. sxx for S_XX). These variables are not kept in the output dataset (see the KEEP= option in the DATA statement -- but you may change that if you like), so their names don't really matter.

So I need to sub my x and y variables in?

You need to replace the dataset name have in the PROC CORR step with the name of your dataset. If the variables in your dataset for which you want to compute the concordance correlation coefficient are not named x and y, you would ideally use a RENAME= dataset option in the PROC CORR step. So, using the example program (see the link on the web page you linked to in your initial post) the PROC CORR step would read:

proc corr data=dice_baseline(rename=(cort_auc1=x cort_auc2=y)) cov outp=stats noprint;
var x y;
run;

(where stats is an arbitrary dataset name -- in case of a name conflict just use a different name).

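Alternatively, let PROC CORR work with the original names and then replace "x" and "y" with these names in the DATA step in nine places, not only six as you did in your code: the character constants 'X' and 'Y' must also be replaced by 'X_VARIABLE' and 'Y_VARIABLE', respectively (upper case is mandatory because _name_ is passed through UPCASE before the comparison). With the constants left as 'X' and 'Y', no COV row ever matches, so sxx, syy and sxy stay 0 and rc comes out as 0. Also note that your last assignment uses (mx-my)*2, whereas squaring requires the ** operator: (mx-my)**2.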
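
A minimal sketch of this second approach, assuming (as in the posted attempt) a dataset HAVE with non-missing numeric variables named x_variable and y_variable. Those names are placeholders taken from the question, not from any real dataset:

```sas
/* Variances, covariance, means and N go to dataset STATS. */
proc corr data=have cov outp=stats noprint;
  var x_variable y_variable;
run;

data want(keep=rc);
  do until(last);
    set stats end=last;
    /* The constants must be the upper-case variable names,   */
    /* because _name_ is upcased before the comparison.       */
    sxx + (_type_='COV')*(upcase(_name_)='X_VARIABLE')*x_variable;
    syy + (_type_='COV')*(upcase(_name_)='Y_VARIABLE')*y_variable;
    sxy + (_type_='COV')*(upcase(_name_)='X_VARIABLE')*y_variable;
    mx  + (_type_='MEAN')*x_variable;
    my  + (_type_='MEAN')*y_variable;
  end;
  /* ** is the exponentiation operator; *2 would only double the difference. */
  rc = 2*sxy/(sxx+syy+(mx-my)**2);
run;
```

The RENAME= variant shown earlier avoids these substitutions altogether, which is why it is the less error-prone choice.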

The DATA step reads the output dataset stats produced by PROC CORR, computes the concordance correlation coefficient and stores it in a numeric variable named rc in a dataset want containing only one record. (Both rc and want are arbitrary names.)

Same for _name_

The output dataset from PROC CORR contains character variables _type_ and _name_ (these are default names) which contain names of statistics and variables, respectively, and are used in the DATA step. I wrote the DATA step looking at PROC PRINT output of dataset stats:

proc print data=stats;
run;

tka726 · Posted 01-17-2020 11:31 AM

This worked perfectly for me, thank you!! Can the confidence interval for the CCC be obtained from this output as well?

FreelanceReinh · Posted 01-17-2020 05:52 PM

Hello @tka726,

Yes, in principle this is possible (e.g., the Pearson correlation coefficient from the PROC CORR output dataset stats would be involved in the calculation; also the TANH and ARTANH functions could be applied to simplify some terms). However, I found a couple of discrepancies between the formula used in the IML code from the psu.edu website linked by @Luke01 and a formula I saw elsewhere -- whereas the formulas for the point estimate were equivalent. Therefore, I'm hesitant to publish SAS code for the confidence interval, unless you provide a (link to a) reliable formula for it in mathematical notation like the formula for the point estimate on the psu.edu website.