Hi,
I have a dataset with say 295 patient observations, I would like to compare all patient pairs on a set of continuous variables, for simplicity let say on one continuous variable, let's call it ASCORE. So the dataset has 2 variables, PATIENTID and ASCORE
If patient 1 has a larger value for ASCORE than patient 2 then patient 1 earns a value of 1 for that comparison, while patient 2 earns a -1. However if both patients tie, then both earn a 0.
Next I would compare patient 1 to patient 3 and so on.
Eventually I would like to calculate the total score for each patient, defined as the sum of points from all their comparisons against the other n-1 patients?
I think the solution, lies in using the point option in a DATA SET step but there could be other ideas.
Any help provided would be great.
Thanks
Hi @Kimani and welcome to the SAS Support Communities!
You could also use PROC SQL:
proc sql;
create table want as
select a.patientid, sum(sign(a.ascore-b.ascore)) as total
from have a, have b
where a.patientid ne b.patientid
group by a.patientid;
quit;
But I think a more efficient solution would use PROC RANK:
proc rank data=have out=rks(drop=a:);
var ascore;
ranks r;
run;
data want(drop=r);
set rks nobs=n;
total=2*r-n-1;
run;
I have checked these suggestions with simulated data (not containing missing values, though).
Edit: Now I have also completed a mathematical proof of the formula total=2*r-n-1.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.