Solved
New Contributor
Posts: 4

# Statistic to measure percentage of agreement between dichotomous variables

Hi friends,

I have a dataset, which we will call SUPER_DATA, that contains 5 dichotomous (yes/no) variables:

t1 t2 t3 t4 t5

1  0  1  0  1

1  0  0  1  1

(and so on for the remaining observations)

I am trying to write some macro code that will produce a comparison between each of the variables.

The comparison will give a percentage of how often the two variables "agree."

So if we are comparing t1 and t2, I wish to compute the following:

s = (count(t1 = 1 and t2 = 1) + count(t1 = 0 and t2 = 0)) / total number of observations

(i.e. the proportion of observations in which both variables are 1 or both are 0)

for each pair of variables.

The macro will compute this statistic for all 25 comparisons (5*5).
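Written as a formula, the statistic described above is the simple proportion of agreement. The notation here is an assumption for illustration and is not from the original post: $n$ is the number of observations, $x_{ki}$ is the value of variable $t_i$ in observation $k$, and $n_{11}$, $n_{00}$ count the rows where both variables are 1 or both are 0.

$$
s_{ij} \;=\; \frac{1}{n}\sum_{k=1}^{n} \mathbf{1}\left[x_{ki} = x_{kj}\right] \;=\; \frac{n_{11} + n_{00}}{n}
$$

Since $s_{ii} = 1$ and $s_{ij} = s_{ji}$, only $\binom{5}{2} = 10$ of the 25 comparisons are distinct and non-trivial. For the two sample rows shown above, t1 and t3 agree in the first row and disagree in the second, so $s_{13} = 1/2$.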

The approach I have tried so far is to create a dataset for each comparison that contains only the observations in which the two comparison variables agree. I was then planning to count the observations in each dataset and divide by the size of the original dataset. However, I couldn't get this to work, and I feel like there must be a better way to do this.

Any help is much appreciated.

Cheers.

Accepted Solutions
Solution
‎05-18-2016 10:36 PM
Super Contributor
Posts: 319

## Re: Statistic to measure percentage of agreement between dichotomous variables


Hello,

No macro needed. This may be a good start:

```
data have;
  input t1 t2 t3 t4 t5;
  datalines;
1  0  1  0  1
1  0  0  1  1
;

data want;
  set have nobs=totalobs;          /* totalobs = number of observations in HAVE */
  array t{*} t1--t5;
  /* some automation needed here */
  array countt{*} t1t2 t1t3 t1t4 t1t5 t2t3 t2t4 t2t5 t3t4 t3t5 t4t5;

  counterperc=1;                   /* index of the next pair in countt */

  do i=1 to dim(t);
    countervars=i+1;               /* index of the variable compared with t{i} */
    do j=counterperc to dim(countt) while (countervars le dim(t));
      /* 1/totalobs if the pair agrees in this row, otherwise 0 */
      countt{j} = (t{i}=t{countervars}) / totalobs;
      countervars=countervars+1;
      counterperc=counterperc+1;
    end;
  end;
  drop countervars counterperc j i;
run;
```
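The "some automation needed here" comment refers to the hand-typed list of pair names. One possible way to generate that list is a small macro; this is only a sketch, and the macro name `%pairnames` and its parameters `prefix=` and `n=` are hypothetical, not part of the original answer. It assumes the variables are named with a common prefix followed by 1 to n.

```sas
/* Hypothetical helper: emits t1t2 t1t3 ... t4t5 in the same order */
/* as the hand-written array list in the data step above.          */
%macro pairnames(prefix=t, n=5);
  %local i j;
  %do i = 1 %to %eval(&n - 1);
    %do j = %eval(&i + 1) %to &n;
      &prefix.&i.&prefix.&j
    %end;
  %end;
%mend pairnames;

/* Write the generated list to the log to check it */
%put %pairnames(prefix=t, n=5);
```

Inside the data step, the array statement could then be written as `array countt{*} %pairnames(prefix=t, n=5);`.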

All Replies
New Contributor
Posts: 4

## Re: Statistic to measure percentage of agreement between dichotomous variables

Thanks for this, mate. Using this along with a PROC SUMMARY step, as suggested by @FreelanceReinhard, I've got it working perfectly.

Cheers.

Super User
Posts: 23,700

## Re: Statistic to measure percentage of agreement between dichotomous variables

You have quite a few comparisons. The data step code is more efficient, but I thought this might be a useful read, since it sounds like you want an agreement statistic, which PROC FREQ can compute.

http://support.sas.com/kb/24/170.html
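As a rough illustration of the PROC FREQ route for a single pair, the sketch below uses the `AGREE` option of the `TABLES` statement, which requests the simple kappa coefficient and McNemar's test for a 2x2 table. It assumes a dataset named HAVE with the variables t1 and t2, as in the accepted solution. Note that kappa is a chance-corrected measure, so it is not the same number as the raw percentage of agreement asked for in the question; the raw agreement is the sum of the two diagonal cell percentages in the crosstab.

```sas
/* Assumes dataset HAVE with 0/1 variables t1 and t2 */
proc freq data=have;
  tables t1*t2 / agree;   /* simple kappa and McNemar's test for the 2x2 table */
run;
```

The agreement statistics need a square table, so both variables must take both values in the data. With only the two sample rows, t1 and t2 are each constant and the statistics would not be produced.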

Super User
Posts: 10,778

## Re: Statistic to measure percentage of agreement between dichotomous variables

This is more convenient to do in IML code.

What kind of output do you want?
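For reference, a minimal sketch of what an IML version might look like. It assumes SAS/IML is licensed, that HAVE contains only the 0/1 variables of interest, and that there are no missing values; the matrix names `x`, `varNames` and `agree` are illustrative. It produces the full symmetric matrix of agreement proportions rather than a single row.

```sas
proc iml;
  use have;
  read all var _NUM_ into x[colname=varNames];
  close have;

  n = nrow(x);
  /* t(x)*x counts rows where both variables are 1;     */
  /* t(1-x)*(1-x) counts rows where both variables are 0 */
  agree = (t(x)*x + t(1-x)*(1-x)) / n;

  print agree[rowname=varNames colname=varNames];
quit;
```

The diagonal of `agree` is always 1, and each off-diagonal entry is the agreement proportion for that pair of variables.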

New Contributor
Posts: 4

## Re: Statistic to measure percentage of agreement between dichotomous variables

Something like this:

t1t2 t1t3 t1t4 .....
0.6 0.7 0.8 ......

So just one row, with each column representing a comparison between two of the variables.

Posts: 1,248

## Re: Statistic to measure percentage of agreement between dichotomous variables


That's what @Loko's data step was made for. Just summarize the WANT dataset:

```
proc summary data=want;
  var t1t2--t4t5;
  output out=agreement(drop=_:) sum=;
run;
```

Super User
Posts: 10,778

## Re: Statistic to measure percentage of agreement between dichotomous variables

OK. Here it is.
```
data have;
  input t1 t2 t3 t4 t5;
  datalines;
1  0  1  0  1
1  0  0  1  1
;
run;

/* Collect the variable names of HAVE into one row of character variables */
proc transpose data=have(obs=0) out=temp1; run;
proc transpose data=temp1 out=temp2(drop=_name_ _label_);
  var _name_;
run;

/* Generate a PROC SQL step that computes the agreement proportion for every pair */
data _null_;
  set temp2 end=last;
  array x{*} $ _character_;
  if _n_=1 then call execute('proc sql;create table want as select ');
  do i=1 to dim(x)-1;
    do j=i+1 to dim(x);
      call execute(catx(' ','sum(',x{i},'=',x{j},')/count(*) as ',cats(x{i},'_',x{j})));
      if not (i=dim(x)-1 and j=dim(x)) then call execute(',');
    end;
  end;
  if last then call execute('from have;quit;');
run;
```
🔒 This topic is solved and locked.