Statistic to measure percentage of agreement between dichotomous variables

New Contributor
Posts: 4
Hi friends,

I have a dataset that contains 5 dichotomous (yes/no) variables: we will call it SUPER_DATA:

t1 t2 t3 t4 t5
1  0  1  0  1
1  0  0  1  1
...
etc.

I am trying to write some macro code that will produce a comparison between each of the variables.

The comparison will give a percentage of how often the two variables "agree."

So if we are comparing t1 and t2, I wish to compute the following:

s = (count(t1 = 1 and t2 = 1) + count(t1 = 0 and t2 = 0)) / total number of observations

(i.e. the proportion of observations in which both variables are 1 or both are 0)

for each pair of variables.

The macro will compute this statistic for all 25 comparisons (5*5).

The approach I have tried so far is to create a dataset for each comparison that contains only the observations where the two comparison variables agree. I was then planning to count the observations in each dataset and divide by the size of the original dataset. However, I couldn't get this to work, and I feel like there must be a better way to do this.

 

Any help is much appreciated.

Cheers.


Accepted Solutions
Solution
‎05-18-2016 10:36 PM
Super Contributor
Posts: 308

Re: Statistic to measure percentage of agreement between dichotomous variables

Hello,

 

No macro needed. This may be a good start:

 

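data have;
input t1 t2 t3 t4 t5;
datalines;
1  0  1  0  1
1  0  0  1  1
;

data want;
/* nobs= puts the number of observations in HAVE into totalobs */
set have nobs=totalobs;
array t{*} t1--t5;
/*some automation needed here */
array countt{*} t1t2 t1t3 t1t4 t1t5 t2t3 t2t4 t2t5 t3t4 t3t5 t4t5;

/* counterperc indexes the next pair variable in countt */
counterperc=1;

do i=1 to dim(t);
/* countervars indexes the variable that t{i} is compared with */
countervars=i+1;
 do j=counterperc to dim(countt) while (countervars le dim(t)) ;
  /* 1/totalobs if the pair agrees on this row, else 0; */
  /* summing over all rows gives the agreement proportion */
  countt{j} = (t{i}=t{countervars}) / totalobs;
  countervars=countervars+1;
  counterperc=counterperc+1;
 end;
end;

drop countervars counterperc j i;
run;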
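
The "some automation needed here" comment could be addressed with a small macro that writes out the pair names. The following is only a sketch: the macro name pairnames is hypothetical, and it assumes the HAVE dataset from this post with variables named t1 to t5.

```sas
/* Hypothetical helper: emits t1t2 t1t3 ... for variables t1..t&n */
%macro pairnames(n);
  %local i j;
  %do i = 1 %to %eval(&n - 1);
    %do j = %eval(&i + 1) %to &n;
      t&i.t&j
    %end;
  %end;
%mend pairnames;

data want;
  set have nobs=totalobs;
  array t{*} t1-t5;
  /* the macro call expands to the ten pair names in loop order */
  array countt{*} %pairnames(5);
  k = 0;
  do i = 1 to dim(t) - 1;
    do j = i + 1 to dim(t);
      k = k + 1;
      /* 1/totalobs when the pair agrees on this row, else 0 */
      countt{k} = (t{i} = t{j}) / totalobs;
    end;
  end;
  drop i j k;
run;
```

The macro emits the names in the same order in which the nested loops visit the pairs, so countt{k} always lines up with the pair (i, j). As with the original step, each row only holds its own contribution, so the columns still need to be summed over all rows.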


All Replies
New Contributor
Posts: 4

Re: Statistic to measure percentage of agreement between dichotomous variables

Thanks for this, mate. Using this together with a PROC SUMMARY, as suggested by @FreelanceReinhard, I've got it working perfectly.

Cheers.

Super User
Posts: 19,815

Re: Statistic to measure percentage of agreement between dichotomous variables

You have quite a few comparisons. The data step code is more efficient, but I thought this may be a useful read, since it sounds like an agreement statistic, which PROC FREQ can compute.

 

http://support.sas.com/kb/24/170.html
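
For a single pair, a PROC FREQ call along these lines might look as follows. This is a minimal sketch, not taken from the linked note: the DEMO dataset and its rows are made up purely for illustration, since the AGREE statistics need a square table, i.e. both variables must take both values.

```sas
/* Made-up rows for illustration only */
data demo;
  input t1 t2;
  datalines;
1 1
1 0
0 0
0 1
1 1
0 0
;

proc freq data=demo;
  /* AGREE requests agreement statistics for the square table */
  tables t1*t2 / agree norow nocol;
run;
```

Note that the kappa coefficient reported by AGREE is chance-corrected, so it is not the same as the raw percentage asked for here. The raw agreement is the sum of the two diagonal cell percentages in the crosstabulation.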

Super User
Posts: 10,030

Re: Statistic to measure percentage of agreement between dichotomous variables

This is more convenient to do in IML code.

What kind of output do you want ?
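
As a rough idea of what an IML version could look like, here is a sketch that builds the full matrix of pairwise agreement proportions. It assumes SAS/IML is licensed, that HAVE is the dataset from the earlier posts, and that every value is 0 or 1 with no missing values. The names x, vnames, and agree are chosen only for this example.

```sas
proc iml;
  use have;
  read all var _NUM_ into x[colname=vnames];
  close have;

  n = nrow(x);
  /* t(x)*x counts rows where both are 1,            */
  /* t(1-x)*(1-x) counts rows where both are 0       */
  agree = (t(x) * x + t(1 - x) * (1 - x)) / n;

  print agree[rowname=vnames colname=vnames];
quit;
```

The result is a symmetric 5x5 matrix with 1 on the diagonal, so only the entries above the diagonal are distinct comparisons.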

New Contributor
Posts: 4

Re: Statistic to measure percentage of agreement between dichotomous variables

Something like this:

t1t2 t1t3 t1t4 .....
0.6 0.7 0.8 ......

So just one row with each column representing a comparison between two of
the variables.

Trusted Advisor
Posts: 1,117

Re: Statistic to measure percentage of agreement between dichotomous variables

That's what @Loko's data step was made for. Just summarize the WANT dataset:

 

proc summary data=want;
var t1t2--t4t5;
output out=agreement(drop=_:) sum=;
run;

 

 

Super User
Posts: 10,030

Re: Statistic to measure percentage of agreement between dichotomous variables

OK. Here it is.

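data have;
input t1 t2 t3 t4 t5;
datalines;
1  0  1  0  1
1  0  0  1  1
;
run;
/* collect the variable names of HAVE: one row per variable in _name_ */
proc transpose data=have(obs=0) out=temp1;run;
/* turn the names into a single row of character columns */
proc transpose data=temp1 out=temp2(drop=_name_ _label_);
 var _name_;
run;
/* generate a PROC SQL step with one agreement column per pair */
data _null_;
set temp2 end=last;
array x{*} $ _character_;
if _n_=1 then call execute('proc sql;create table want as select ');
do i=1 to dim(x)-1;
 do j=i+1 to dim(x);
  /* e.g. sum( t1 = t2 )/count(*) as t1_t2 */
  call execute(catx(' ','sum(',x{i},'=',x{j},')/count(*) as ',cats(x{i},'_',x{j})));
  /* comma after every pair except the last one */
  if not (i=dim(x)-1 and j=dim(x)) then call execute(',');
 end;
end;
if last then call execute('from have;quit;');
run;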


