Hi,
I have a dataset with say 295 patient observations, I would like to compare all patient pairs on a set of continuous variables, for simplicity let say on one continuous variable, let's call it ASCORE. So the dataset has 2 variables, PATIENTID and ASCORE
If patient 1 has a larger value for ASCORE than patient 2 then patient 1 earns a value of 1 for that comparison, while patient 2 earns a -1. However if both patients tie, then both earn a 0.
Next I would compare patient 1 to patient 3 and so on.
Eventually I would like to calculate the total score for each patient, defined as the sum of points from all their comparisons against the other n-1 patients?
I think the solution, lies in using the point option in a DATA SET step but there could be other ideas.
Any help provided would be great.
Thanks
Hi @Kimani and welcome to the SAS Support Communities!
You could also use PROC SQL:
proc sql;
create table want as
select a.patientid, sum(sign(a.ascore-b.ascore)) as total
from have a, have b
where a.patientid ne b.patientid
group by a.patientid;
quit;
But I think a more efficient solution would use PROC RANK:
proc rank data=have out=rks(drop=a:);
var ascore;
ranks r;
run;
data want(drop=r);
set rks nobs=n;
total=2*r-n-1;
run;
I have checked these suggestions with simulated data (not containing missing values, though).
Edit: Now I have also completed a mathematical proof of the formula total=2*r-n-1.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.