Comparing all pairs of records / observations in a dataset

Kimani — Thu, 08 Feb 2024 04:57:20 GMT

Hi,

I have a dataset with, say, 295 patient observations. I would like to compare all pairs of patients on a set of continuous variables; for simplicity, let's say on one continuous variable called ASCORE. So the dataset has two variables, PATIENTID and ASCORE.

If patient 1 has a larger value for ASCORE than patient 2 then patient 1 earns a value of 1 for that comparison, while patient 2 earns a -1. However if both patients tie, then both earn a 0.

Next I would compare patient 1 to patient 3 and so on.

Eventually I would like to calculate the total score for each patient, defined as the sum of points from all their comparisons against the other n-1 patients.

I think the solution lies in using the POINT= option of the SET statement in a DATA step, but there could be other ideas.

Any help provided would be great.

Thanks

Re: Comparing all pairs of records / observations in a dataset

FreelanceReinh — Thu, 08 Feb 2024 12:34:24 GMT

Hi @Kimani and welcome to the SAS Support Communities!

You could also use PROC SQL:

proc sql;
create table want as
select a.patientid, sum(sign(a.ascore-b.ascore)) as total
from have a, have b
where a.patientid ne b.patientid
group by a.patientid;
quit;
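
For comparison, the POINT= idea from the original question could be sketched roughly as follows. This is only a sketch, not tested on the real data: it assumes the dataset is named HAVE as in the SQL step, that ASCORE has no missing values, and the names J and OTHER are made up for illustration.

```sas
/* Sketch only: brute-force pairwise comparison using direct access (POINT=). */
/* Assumes HAVE contains PATIENTID and ASCORE, with no missing ASCORE values. */
data want(keep=patientid total);
  set have nobs=n;               /* outer read: the current patient */
  total=0;
  do j=1 to n;                   /* inner loop: every observation in HAVE */
    set have(keep=ascore rename=(ascore=other)) point=j;
    total=total+sign(ascore-other);  /* +1, -1 or 0; comparing a patient with itself adds 0 */
  end;
run;
```

Like the SQL join, this performs n*n comparisons, which is harmless for 295 patients but grows quickly for larger datasets.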

But I think a more efficient solution would use PROC RANK:

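proc rank data=have out=rks(drop=a:); /* drop=a: drops variables whose names start with A (here: ASCORE) */
var ascore;
ranks r; /* r = rank of ASCORE; tied values receive the mean rank (default TIES=MEAN) */
run;

data want(drop=r);
set rks nobs=n; /* n = number of observations in RKS */
total=2*r-n-1;
run;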

I have checked these suggestions with simulated data (not containing missing values, though).

Edit: Now I have also completed a mathematical proof of the formula total=2*r-n-1.
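
For anyone wondering where the formula comes from, here is one possible derivation (a sketch, and not necessarily the proof mentioned above). It assumes no missing ASCORE values and the default TIES=MEAN of PROC RANK. The symbols L_i, E_i and G_i are introduced here only for illustration: for patient i, let L_i be the number of patients with a smaller ASCORE, G_i the number with a larger ASCORE, and E_i the number with the same ASCORE, counting patient i itself, so that n = L_i + E_i + G_i.

\[
\begin{aligned}
r_i &= L_i + \frac{E_i + 1}{2} \quad \text{(mean of the ranks } L_i + 1, \dots, L_i + E_i \text{)} \\
2 r_i - n - 1 &= (2 L_i + E_i + 1) - (L_i + E_i + G_i) - 1 = L_i - G_i = \text{total}_i
\end{aligned}
\]

As a quick check with four patients scoring 10, 20, 20 and 30: the mean ranks are 1, 2.5, 2.5 and 4, so 2*r-n-1 gives -3, 0, 0 and 3, which matches counting wins minus losses by hand.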
