Hello,
I'm having an issue with how to write code for this. I have a data set that mirrors the sample set provided below. I am trying to calculate a risk score, whereby certain codes are 'binned' into groups which then have a certain weight assigned to them. By patient, each group is weighted, and the groups are added to create the final score (Charlson_score in this case). My issue is that the codes that fit within each group are not mutually exclusive (ie - two different codes which could be grouped similarly can be found for the same patient). Therefore if I use an accumulating column to add up the row_scores to calculate the final score, I am over-counting.
Here is the code I've written so far:
data want;
set have;
retain Charlson_score;
if first.sPatientID then do;
Charlson_score=0;
Row_score=0;
end;
if dxCd in: ("I21","I22","I23","I24") then CC_GRP1=1;
else if dxCd in: ("I43","I50","I51","I52") then CC_GRP2=1;
row_score=(CC_GRP1)*2 + (CC_GRP2)*3
Charlson_score=sum(Charlson_score+Row_score)
if last.sPatientID then output;
run;
Sample dataset:
sPatient ID dxCd
1 I21
1 E53
1 I22
1 I50
2 B43
2 C87
2 I52
2 I51
I would like the Charlson_score to be 5 for sPatientID 1, and 3 for sPatientID 2. Ie - if a multiple dxCds fit into CC_GRP1, I would only like them to be counted once. The issue is that by calculating the Charlson_score using an accumulating column this obviously becomes problematic.
Any advice would be much appreciated!
Thanks!
If I understood your post correctly, the logic below should accomplish what you are looking for with your code:
/*********************/
/** SAMPLE DATASET **/
/*********************/
DATA have;
FORMAT sPatientID 8. dxCd $3.;
INFORMAT sPatientID 8. dxCd $3.;
INPUT sPatientID dxCd;
CARDS;
1 I21
1 E53
1 I22
1 I50
2 B43
2 C87
2 I52
2 I51
;
PROC SORT DATA=WORK.Have; BY sPatientID; RUN;
DATA WORK.Want (DROP=CC_GRP1 CC_GRP2);
SET WORK.Have;
BY sPatientID;
RETAIN CC_GRP1
CC_GRP2;
IF FIRST.sPatientID THEN DO;
Charlson_score=0;
CC_GRP1=0;
CC_GRP2=0;
END;
IF dxCd IN: ("I21","I22","I23","I24") THEN CC_GRP1=1;
ELSE IF dxCd IN: ("I43","I50","I51","I52") THEN CC_GRP2=1;
IF LAST.sPatientID THEN DO;
Charlson_score=(CC_GRP1)*2 + (CC_GRP2)*3;
OUTPUT;
END;
RUN;
This is the output that was produced using my sample dataset:
sPatientID dxCd Charlson_score
1 I50 5
2 I51 3
Hope this helps.
If I understood your post correctly, the logic below should accomplish what you are looking for with your code:
/*********************/
/** SAMPLE DATASET **/
/*********************/
DATA have;
FORMAT sPatientID 8. dxCd $3.;
INFORMAT sPatientID 8. dxCd $3.;
INPUT sPatientID dxCd;
CARDS;
1 I21
1 E53
1 I22
1 I50
2 B43
2 C87
2 I52
2 I51
;
PROC SORT DATA=WORK.Have; BY sPatientID; RUN;
DATA WORK.Want (DROP=CC_GRP1 CC_GRP2);
SET WORK.Have;
BY sPatientID;
RETAIN CC_GRP1
CC_GRP2;
IF FIRST.sPatientID THEN DO;
Charlson_score=0;
CC_GRP1=0;
CC_GRP2=0;
END;
IF dxCd IN: ("I21","I22","I23","I24") THEN CC_GRP1=1;
ELSE IF dxCd IN: ("I43","I50","I51","I52") THEN CC_GRP2=1;
IF LAST.sPatientID THEN DO;
Charlson_score=(CC_GRP1)*2 + (CC_GRP2)*3;
OUTPUT;
END;
RUN;
This is the output that was produced using my sample dataset:
sPatientID dxCd Charlson_score
1 I50 5
2 I51 3
Hope this helps.
Nearly 200 sessions are now available on demand with the SAS Innovate Digital Pass.
Explore Now →SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.