Hello,
I'm having an issue with how to write code for this. I have a data set that mirrors the sample set provided below. I am trying to calculate a risk score, whereby certain codes are 'binned' into groups which then have a certain weight assigned to them. By patient, each group is weighted, and the groups are added to create the final score (Charlson_score in this case). My issue is that the codes that fit within each group are not mutually exclusive (ie - two different codes which could be grouped similarly can be found for the same patient). Therefore if I use an accumulating column to add up the row_scores to calculate the final score, I am over-counting.
Here is the code I've written so far:
data want;
set have;
retain Charlson_score;
if first.sPatientID then do;
Charlson_score=0;
Row_score=0;
end;
if dxCd in: ("I21","I22","I23","I24") then CC_GRP1=1;
else if dxCd in: ("I43","I50","I51","I52") then CC_GRP2=1;
row_score=(CC_GRP1)*2 + (CC_GRP2)*3
Charlson_score=sum(Charlson_score+Row_score)
if last.sPatientID then output;
run;
Sample dataset:
sPatient ID dxCd
1 I21
1 E53
1 I22
1 I50
2 B43
2 C87
2 I52
2 I51
I would like the Charlson_score to be 5 for sPatientID 1, and 3 for sPatientID 2. Ie - if a multiple dxCds fit into CC_GRP1, I would only like them to be counted once. The issue is that by calculating the Charlson_score using an accumulating column this obviously becomes problematic.
Any advice would be much appreciated!
Thanks!