docfak
Fluorite | Level 6

Good morning to all,

As a beginner in SAS, I am having trouble understanding how to calculate Cohen's kappa directly from a table containing the observations. Let me explain: in my table, two observers (_1 and _2) have each assigned a numerical rating between 0 and 4 to 120 variables (X1, X2, X3, ...). So I have 240 columns: X1_1, X1_2, X2_1, X2_2, X3_1, etc. In this case, how do I calculate the kappa in SAS?

Thank you in advance for your answers,

 

Shmuel
Garnet | Level 18

1) Please post a sample of your data as a DATA step with an INPUT statement, using test data with real or semi-real values.

2) I'm not a statistician, but here is a link to documentation on kappa statistics. Can you transform your data to look like the "Initial PC-ICD9 Records" table presented on page 3? I believe you will understand the documentation better than I do. The link is:

https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/180-30.pdf
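
For readers unsure what such a post could look like, here is a minimal sketch. The ID variable and all values are invented for illustration and are not docfak's data; only the Xi_j naming follows the question, and only two of the 120 items are shown.

```sas
/* Hypothetical sample: 3 subjects, 2 items, 2 raters per item */
data have;
  input id X1_1 X1_2 X2_1 X2_2;
  datalines;
1 0 1 3 3
2 4 4 2 1
3 2 2 0 0
;
run;
```

Posting data this way lets others run the step and reproduce the table exactly, including the variable names and types.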

FreelanceReinh
Jade | Level 19

Hello @docfak and welcome to the SAS Support Communities!

 

Try this:

/* Create sample data for demonstration */

%macro vars;
%do i=1 %to 120;
  X&i._1 X&i._2
%end;
%mend vars;

data have;
call streaminit(27182818);
array v[240] %vars;
do _n_=1 to dim(v);
  v[_n_]=rand('integer',0,4);
end;
run;


/* Reshape the data */

proc transpose data=have out=_trans;
var x:;
run;

data _temp(drop=_:);
length item $4 rater $1;
set _trans;
item =scan(_name_,1,'_');
rater=scan(_name_,2,'_');
run;

proc sort data=_temp sortseq=linguistic(numeric_collation=on);
by item rater;
run;

proc transpose data=_temp out=want(drop=_:) prefix=rater;
by item;
var col1;
run;


/* Compute Cohen's kappa (and weighted kappa) */

proc freq data=want;
tables rater1*rater2 / agree;
ods select KappaStatistics;
run;

If the structure of your dataset is similar to that of the artificial dataset HAVE created above (in particular: the variables whose names start with "X" contain the ratings, the naming convention is as described in your post and there's only one observation), the subsequent steps (with "have" replaced by your dataset name) should be applicable to your data.
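
Before applying the reshaping steps to real data, it may be worth checking the naming convention. The following is only a sketch: it assumes the dataset is WORK.HAVE and lists every variable whose name starts with "X" but does not have the form Xi_1 or Xi_2.

```sas
/* List X-variables in WORK.HAVE that do not match Xi_1 / Xi_2 */
proc sql;
  select name
    from dictionary.columns
    where libname='WORK' and memname='HAVE'
      and upcase(name) like 'X%'
      and not prxmatch('/^X\d+_[12]\s*$/i', name);
quit;
```

If no rows are selected, all X-variables follow the expected pattern, so SCAN(_NAME_, ..., '_') should split them into item and rater as intended.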

 

EDIT: In the extreme case that, e.g., one of the observers did not use a particular rating, say, rating 4, for any of the 120 items, the code would need to be amended so that ratings that don't occur are included with weight zero.

 

EDIT 2: Here is the amended code for the more general case:

/* Amended code for the more general case that not all possible rating categories occur */

/* Create sample data for demonstration */

%macro vars;
%do i=1 %to 120;
  X&i._1 X&i._2
%end;
%mend vars;

data have;
call streaminit(27182818);
array v[240] %vars;
do _n_=1 to dim(v);
  if mod(_n_,2) then  v[_n_]=rand('integer',0,4);
  else v[_n_]=rand('integer',0,3); /* --> 4 doesn't occur for rater 2 */
end;
run;


/* Reshape the data */

proc transpose data=have out=_trans;
var x:;
run;

data _temp(drop=_:);
length item $4 rater $1;
set _trans;
item =scan(_name_,1,'_');
rater=scan(_name_,2,'_');
run;

proc sort data=_temp sortseq=linguistic(numeric_collation=on);
by item rater;
run;

proc transpose data=_temp out=_temp2(drop=_:) prefix=rater;
by item;
var col1;
run;

proc freq data=_temp2 noprint;
tables rater1*rater2 / out=_freqs(drop=percent);
run;

data _allcomb;
do rater1=0 to 4;
  do rater2=0 to 4;
    output;
  end;
end;
run;

data want;
merge _allcomb
      _freqs(in=f);
by rater1 rater2;
if ~f then count=0;
run;


/* Compute Cohen's kappa (and weighted kappa) */

proc freq data=want;
tables rater1*rater2 / agree;
weight count / zero;
ods select KappaStatistics;
run;
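
Why padding the table with zero-weight cells does not distort simple kappa can be seen from its definition. This is a sketch in standard notation (not taken from the thread): p_{ij} is the proportion of items rated i by rater 1 and j by rater 2, and p_{i\cdot}, p_{\cdot i} are the row and column marginals.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e}, \qquad
p_o = \sum_{i=0}^{4} p_{ii}, \qquad
p_e = \sum_{i=0}^{4} p_{i\cdot}\, p_{\cdot i}
```

If rater 2 never uses rating 4, then p_{\cdot 4} = 0 and p_{44} = 0, so the added column contributes nothing to p_o or p_e. The padding only makes the table square, which the AGREE option needs.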
docfak
Fluorite | Level 6

Many thanks!!!

Well it is the case for some observations (there are around 70). Moreover, some variables are rated 0-1, some 0-3 and some 0-4...

 

Thanks again!

FreelanceReinh
Jade | Level 19

@docfak wrote:

Well it is the case for some observations (there are around 70). Moreover, some variables are rated 0-1, some 0-3 and some 0-4...


Are you saying that "around 70" subjects (or objects) were rated with regard to 120 characteristics? In this case you would primarily want to compute Cohen's kappa separately for each characteristic, wouldn't you? Then different scales (like 0-1 vs. 0-3) would not be mixed. Or are you after a measure of some sort of "overall agreement" across all 120 variables? If so, how would this be defined (cf. the section "Tests and Measures of Agreement" in the PROC FREQ documentation)?

 

I think my code (both versions) would be most appropriate if 120 subjects were rated with regard to one characteristic so that dataset HAVE has one observation with variable Xi_j containing the rating of rater j (j=1, 2) for subject i (i=1, ..., 120). It could be generalized further to handle more than one characteristic (separately), leading to a row-wise evaluation. However, your latest description rather seems to ask for a column-wise evaluation.

FreelanceReinh
Jade | Level 19

Just in case you (or a later reader of this thread) need it: here's an example of what I meant by "column-wise evaluation." The final dataset STATS contains one observation per item.

/* Code for the case of one subject per row and two columns per characteristic */

%let n=70;  /* number of subjects */
%let c=120; /* number of characteristics ("items") */
%let m=4;   /* ratings 0, 1, ..., m (or fewer) per characteristic */

/* Create sample data for demonstration */

%macro vars;
%do i=1 %to &c;
  X&i._1 X&i._2
%end;
%mend vars;

data have(drop=_:);
call streaminit(27182818);
length id 8;
array c[&c] _temporary_; /* c[i] = highest rating (#categories - 1) of the i-th characteristic */
array v[&c,2] %vars;
if _n_=1 then do _i=1 to &c;
  c[_i]=rand('integer',&m); /* ratings can vary from 0-1 to 0-&m */
end;
do id=1 to &n;
  do _i=1 to &c;
    do _r=1 to 2;
      v[_i,_r]=rand('integer',0,c[_i]);
    end;
  end;
  output;
end;
run;


/* Reshape the data */

proc transpose data=have out=_trans;
by id;
var x:;
run;

data _temp(drop=_:);
retain id;
length item $8 rater $1;
set _trans;
item =scan(_name_,1,'_');
rater=scan(_name_,2,'_');
run;

proc sort data=_temp sortseq=linguistic(numeric_collation=on);
by id item rater;
run;

proc transpose data=_temp out=_temp2(drop=_:) prefix=rater;
by id item;
var col1;
run;

proc sort data=_temp2;
by item id;
run;

proc freq data=_temp2 noprint;
by item;
tables rater1*rater2 / out=_freqs(drop=percent);
run;

data _allcomb(drop=_:);
length item $8;
do _item=1 to &c;
  item=cats('X',_item);
  do rater1=0 to &m;
    do rater2=0 to &m;
      output;
    end;
  end;
end;
run;

proc sort data=_allcomb;
by item rater1 rater2;
run;

data want;
merge _allcomb
      _freqs(in=f);
by item rater1 rater2;
if ~f then count=0;
run;

proc sort data=want sortseq=linguistic(numeric_collation=on);
by item rater1 rater2;
run;


/* Compute Cohen's kappa (weighted kappa will be dropped) */

ods select none;
ods output KappaStatistics=kappa(where=(Statistic=:'S'));
proc freq data=want;
by item notsorted;
tables rater1*rater2 / agree;
weight count / zero;
run;
ods select all;

data stats;
set kappa(drop=table statistic rename=(value=Kappa));
label kappa="Cohen's kappa";
run;

proc print data=stats(obs=3) label noobs;
run;

Result (first 3 of 120 obs.):

                                95% Lower     95% Upper
         Cohen's    Standard    Confidence    Confidence
item       kappa     Error        Limit         Limit

 X1      -0.0300      0.0673      -0.1619        0.1019
 X2       0.0373      0.0850      -0.1294        0.2039
 X3       0.0628      0.0634      -0.0615        0.1871
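
As a plausibility check, simple kappa for a single item can be recomputed by hand from the cell counts in WANT. The sketch below assumes that WANT and the macro variable &m from the code above still exist; the dataset name _CHECK and the variables N, AGREE, PO and PE are introduced here for illustration only.

```sas
/* Recompute simple kappa for item X1 from the cell counts in WANT */
data _check(keep=item n po pe kappa);
  set want(where=(item='X1')) end=last;
  array r[0:&m] _temporary_;  /* row totals (rater 1)    */
  array c[0:&m] _temporary_;  /* column totals (rater 2) */
  r[rater1] + count;
  c[rater2] + count;
  n + count;                               /* total number of subjects */
  if rater1 = rater2 then agree + count;   /* diagonal cells           */
  if last then do;
    po = agree / n;                        /* observed agreement       */
    pe = 0;
    do _k = 0 to &m;
      pe = pe + r[_k] * c[_k] / n**2;      /* chance agreement         */
    end;
    kappa = (po - pe) / (1 - pe);
    output;
  end;
run;

proc print data=_check noobs;
run;
```

The value of KAPPA should agree with the entry for X1 in STATS, up to rounding. The step does not guard against pe = 1 (all ratings in a single category), where kappa is undefined.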

 

docfak
Fluorite | Level 6
Many thanks!
