Good morning to all,
As a SAS beginner, I'm having a bit of trouble understanding how to calculate Cohen's kappa directly from a table containing the observations... Let me explain: in my table, I have two observers (_1 and _2) who have each rated a numerical value between 0 and 4 for 120 variables (X1, X2, X3, ...). So I have 240 columns: X1_1, X1_2, X2_1, X2_2, X3_1, etc. In this case, how do I proceed in SAS to calculate the kappa?
Thank you in advance for your answers,
1) Please post a sample of your data as a DATA step with an INPUT statement, i.e., test data with real or semi-real values.
2) I'm not a statistician. Below is a link to documentation of the kappa statistic.
Can you transform your data to look like the "Initial PC-ICD9 Records" table
presented on page 3? I believe you can understand the documentation better
than me. The link is:
https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/180-30.pdf
Hello @docfak and welcome to the SAS Support Communities!
Try this:
/* Create sample data for demonstration */
%macro vars;
%do i=1 %to 120;
X&i._1 X&i._2
%end;
%mend vars;
data have;
call streaminit(27182818);
array v[240] %vars;
do _n_=1 to dim(v);
v[_n_]=rand('integer',0,4);
end;
run;
/* Reshape the data */
proc transpose data=have out=_trans;
var x:;
run;
data _temp(drop=_:);
length item $4 rater $1;
set _trans;
item =scan(_name_,1,'_');
rater=scan(_name_,2,'_');
run;
proc sort data=_temp sortseq=linguistic(numeric_collation=on);
by item rater;
run;
proc transpose data=_temp out=want(drop=_:) prefix=rater;
by item;
var col1;
run;
/* Compute Cohen's kappa (and weighted kappa) */
proc freq data=want;
tables rater1*rater2 / agree;
ods select KappaStatistics;
run;
If the structure of your dataset is similar to that of the artificial dataset HAVE created above (in particular: the variables whose names start with "X" contain the ratings, the naming convention is as described in your post, and there is only one observation), then the subsequent steps (with "have" replaced by your dataset name) should be applicable to your data.
EDIT: In the extreme case that, e.g., one of the observers did not use a particular rating, say, rating 4, for any of the 120 items, the code would need to be amended so that ratings that don't occur are included with weight zero.
EDIT 2: Here is the amended code for the more general case:
/* Amended code for the more general case that not all possible rating categories occur */
/* Create sample data for demonstration */
%macro vars;
%do i=1 %to 120;
X&i._1 X&i._2
%end;
%mend vars;
data have;
call streaminit(27182818);
array v[240] %vars;
do _n_=1 to dim(v);
if mod(_n_,2) then v[_n_]=rand('integer',0,4);
else v[_n_]=rand('integer',0,3); /* --> 4 doesn't occur for rater 2 */
end;
run;
/* Reshape the data */
proc transpose data=have out=_trans;
var x:;
run;
data _temp(drop=_:);
length item $4 rater $1;
set _trans;
item =scan(_name_,1,'_');
rater=scan(_name_,2,'_');
run;
proc sort data=_temp sortseq=linguistic(numeric_collation=on);
by item rater;
run;
proc transpose data=_temp out=_temp2(drop=_:) prefix=rater;
by item;
var col1;
run;
proc freq data=_temp2 noprint;
tables rater1*rater2 / out=_freqs(drop=percent);
run;
data _allcomb;
do rater1=0 to 4;
do rater2=0 to 4;
output;
end;
end;
run;
data want;
merge _allcomb
_freqs(in=f);
by rater1 rater2;
if ~f then count=0;
run;
/* Compute Cohen's kappa (and weighted kappa) */
proc freq data=want;
tables rater1*rater2 / agree;
weight count / zero;
ods select KappaStatistics;
run;
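A note on why the zero-count padding matters: PROC FREQ computes kappa only for square tables, so if rater 2 never uses rating 4, the observed 5x4 crosstab must be padded to 5x5 before the AGREE statistics can be produced. The padding does not alter the unweighted kappa value itself, because all-zero rows and columns contribute nothing to either the observed or the expected agreement. A quick Python illustration of that invariance (purely illustrative, with made-up counts; not part of the SAS solution):

```python
def kappa(t):
    """Unweighted Cohen's kappa from a square contingency table."""
    n = sum(map(sum, t))
    k = len(t)
    p_o = sum(t[i][i] for i in range(k)) / n            # observed agreement
    p_e = sum(sum(t[i]) * sum(r[i] for r in t)          # chance agreement
              for i in range(k)) / n**2
    return (p_o - p_e) / (1 - p_e)

# hypothetical 4x4 crosstab (ratings 0-3 only; rating 4 never used)
t4 = [[10, 2, 1, 0], [3, 8, 2, 1], [0, 2, 9, 2], [1, 0, 3, 6]]
# pad with an all-zero fifth row and column for the unused rating 4
t5 = [row + [0] for row in t4] + [[0] * 5]
print(kappa(t4) == kappa(t5))  # → True
```

(Weighted kappa, in contrast, can change after padding, because the agreement weights depend on the range of categories.)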
Many thanks!!!
Well it is the case for some observations (there are around 70). Moreover, some variables are rated 0-1, some 0-3 and some 0-4...
Thanks again!
@docfak wrote:
Well it is the case for some observations (there are around 70). Moreover, some variables are rated 0-1, some 0-3 and some 0-4...
Are you saying that "around 70" subjects (or objects) were rated with regard to 120 characteristics? In this case you would primarily want to compute Cohen's kappa separately for each characteristic, wouldn't you? Then different scales (like 0-1 vs. 0-3) would not be mixed. Or are you after a measure of some sort of "overall agreement" across all 120 variables? If so, how would this be defined (cf. documentation Tests and Measures of Agreement)?
I think my code (both versions) would be most appropriate if 120 subjects were rated with regard to one characteristic, so that dataset HAVE has one observation with variable Xi_j containing the rating of rater j (j=1, 2) for subject i (i=1, ..., 120). It could be generalized further to handle more than one characteristic (separately), leading to a row-wise evaluation. However, your latest description seems rather to call for a column-wise evaluation.
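As background for what PROC FREQ computes per item in a column-wise evaluation: unweighted Cohen's kappa for one characteristic is (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement (the diagonal of the rater1*rater2 crosstab) and p_e is the agreement expected by chance from the marginals. A minimal Python sketch of that formula (purely illustrative, with a made-up toy table; the SAS workflow below does not need it):

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa from a square contingency table
    (rows = rater 1 categories, columns = rater 2 categories)."""
    n = sum(sum(row) for row in table)
    k = len(table)
    # observed agreement: proportion of counts on the diagonal
    p_o = sum(table[i][i] for i in range(k)) / n
    # expected (chance) agreement from the row/column marginals
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / n**2
    return (p_o - p_e) / (1 - p_e)

# toy 2x2 example: 50 agreements on 0, 40 on 1, 10 disagreements
print(round(cohens_kappa([[50, 5], [5, 40]]), 4))  # → 0.798
```

On this toy table the raw agreement is 90%, but the chance-corrected kappa is about 0.798.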
Just in case you (or a later reader of this thread) will need it: Here's an example of what I meant by "column-wise evaluation." The final dataset STATS contains one observation per item.
/* Code for the case of one subject per row and two columns per characteristic */
%let n=70; /* number of subjects */
%let c=120; /* number of characteristics ("items") */
%let m=4; /* ratings 0, 1, ..., m (or fewer) per characteristic */
/* Create sample data for demonstration */
%macro vars;
%do i=1 %to &c;
X&i._1 X&i._2
%end;
%mend vars;
data have(drop=_:);
call streaminit(27182818);
length id 8;
array c[&c] _temporary_; /* #categories - 1 for &c-th characteristic */
array v[&c,2] %vars;
if _n_=1 then do _i=1 to &c;
c[_i]=rand('integer',&m); /* ratings can vary from 0-1 to 0-&m */
end;
do id=1 to &n;
do _i=1 to &c;
do _r=1 to 2;
v[_i,_r]=rand('integer',0,c[_i]);
end;
end;
output;
end;
run;
/* Reshape the data */
proc transpose data=have out=_trans;
by id;
var x:;
run;
data _temp(drop=_:);
retain id;
length item $8 rater $1;
set _trans;
item =scan(_name_,1,'_');
rater=scan(_name_,2,'_');
run;
proc sort data=_temp sortseq=linguistic(numeric_collation=on);
by id item rater;
run;
proc transpose data=_temp out=_temp2(drop=_:) prefix=rater;
by id item;
var col1;
run;
proc sort data=_temp2;
by item id;
run;
proc freq data=_temp2 noprint;
by item;
tables rater1*rater2 / out=_freqs(drop=percent);
run;
data _allcomb(drop=_:);
length item $8;
do _item=1 to &c;
item=cats('X',_item);
do rater1=0 to &m;
do rater2=0 to &m;
output;
end;
end;
end;
run;
proc sort data=_allcomb;
by item rater1 rater2;
run;
data want;
merge _allcomb
_freqs(in=f);
by item rater1 rater2;
if ~f then count=0;
run;
proc sort data=want sortseq=linguistic(numeric_collation=on);
by item rater1 rater2;
run;
/* Compute Cohen's kappa (weighted kappa will be dropped) */
ods select none;
ods output KappaStatistics=kappa(where=(Statistic=:'S'));
proc freq data=want;
by item notsorted;
tables rater1*rater2 / agree;
weight count / zero;
run;
ods select all;
data stats;
set kappa(drop=table statistic rename=(value=Kappa));
label kappa="Cohen's kappa";
run;
proc print data=stats(obs=3) label noobs;
run;
Result (first 3 of 120 obs.):
                                95% Lower    95% Upper
           Cohen's   Standard   Confidence   Confidence
item       kappa     Error      Limit        Limit

X1        -0.0300    0.0673     -0.1619       0.1019
X2         0.0373    0.0850     -0.1294       0.2039
X3         0.0628    0.0634     -0.0615       0.1871