BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
bretthouston
Obsidian | Level 7

Hello,

 

I'm having an issue with how to write code for this. I have a data set that mirrors the sample set provided below. I am trying to calculate a risk score, whereby certain codes are 'binned' into groups which then have a certain weight assigned to them. By patient, each group is weighted, and the groups are added to create the final score (Charlson_score in this case). My issue is that the codes that fit within each group are not mutually exclusive (ie - two different codes which could be grouped similarly can be found for the same patient). Therefore if I use an accumulating column to add up the row_scores to calculate the final score, I am over-counting.

 

Here is the code I've written so far:

data want;

set have;

 

retain Charlson_score;

if first.sPatientID then do;

      Charlson_score=0;

      Row_score=0;

end;

 

if dxCd in: ("I21","I22","I23","I24") then CC_GRP1=1;

else if dxCd in: ("I43","I50","I51","I52") then CC_GRP2=1;

 

row_score=(CC_GRP1)*2 + (CC_GRP2)*3

Charlson_score=sum(Charlson_score+Row_score)

 

if last.sPatientID then output;

run;

 

Sample dataset:

sPatient ID     dxCd 

1                     I21

1                     E53

1                     I22

1                     I50

2                     B43

2                     C87

2                     I52

2                     I51

 

I would like the Charlson_score to be 5 for sPatientID 1, and 3 for sPatientID 2. Ie - if a multiple dxCds fit into CC_GRP1, I would only like them to be counted once. The issue is that by calculating the Charlson_score using an accumulating column this obviously becomes problematic. 

 

Any advice would be much appreciated!

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
tsap
Pyrite | Level 9

If I understood your post correctly, the logic below should accomplish what you are looking for with your code:

/*********************/
/**	SAMPLE DATASET	**/
/*********************/
DATA have;
FORMAT 		sPatientID 8. dxCd $3.; 
INFORMAT	sPatientID 8. dxCd $3.; 
INPUT  		sPatientID    dxCd; 
CARDS;      
1                     I21
1                     E53
1                     I22
1                     I50
2                     B43
2                     C87
2                     I52
2                     I51
;


PROC SORT DATA=WORK.Have;	BY sPatientID; RUN;


DATA WORK.Want	(DROP=CC_GRP1 CC_GRP2);
	SET WORK.Have;
	BY sPatientID;
	RETAIN CC_GRP1
		   CC_GRP2;
	IF FIRST.sPatientID THEN DO;
	      Charlson_score=0;
	      CC_GRP1=0;
		  CC_GRP2=0;
	END;

		 IF dxCd IN: ("I21","I22","I23","I24") THEN CC_GRP1=1;
	ELSE IF dxCd IN: ("I43","I50","I51","I52") THEN CC_GRP2=1;
	IF LAST.sPatientID THEN DO;
		Charlson_score=(CC_GRP1)*2 + (CC_GRP2)*3;
		OUTPUT;
	END;
RUN;

 

 

This is the output that was produced using my sample dataset:

sPatientID	dxCd	Charlson_score
         1	I50	             5
         2	I51	             3

 

Hope this helps.

View solution in original post

1 REPLY 1
tsap
Pyrite | Level 9

If I understood your post correctly, the logic below should accomplish what you are looking for with your code:

/*********************/
/**	SAMPLE DATASET	**/
/*********************/
DATA have;
FORMAT 		sPatientID 8. dxCd $3.; 
INFORMAT	sPatientID 8. dxCd $3.; 
INPUT  		sPatientID    dxCd; 
CARDS;      
1                     I21
1                     E53
1                     I22
1                     I50
2                     B43
2                     C87
2                     I52
2                     I51
;


PROC SORT DATA=WORK.Have;	BY sPatientID; RUN;


DATA WORK.Want	(DROP=CC_GRP1 CC_GRP2);
	SET WORK.Have;
	BY sPatientID;
	RETAIN CC_GRP1
		   CC_GRP2;
	IF FIRST.sPatientID THEN DO;
	      Charlson_score=0;
	      CC_GRP1=0;
		  CC_GRP2=0;
	END;

		 IF dxCd IN: ("I21","I22","I23","I24") THEN CC_GRP1=1;
	ELSE IF dxCd IN: ("I43","I50","I51","I52") THEN CC_GRP2=1;
	IF LAST.sPatientID THEN DO;
		Charlson_score=(CC_GRP1)*2 + (CC_GRP2)*3;
		OUTPUT;
	END;
RUN;

 

 

This is the output that was produced using my sample dataset:

sPatientID	dxCd	Charlson_score
         1	I50	             5
         2	I51	             3

 

Hope this helps.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 1 reply
  • 577 views
  • 0 likes
  • 2 in conversation