BookmarkSubscribeRSS Feed
Statsconsultancy
Fluorite | Level 6
I have data in the following format: It consists of patients and the numbers of diseases each patient experienced.

patient disease1 disease2 disease3;
1 I21 E102 C80
2 B20 C77 G81
3 B20 K27 G450
run;

I then assign weight to each disease using the following codes; The intention is to come up with total burden of the diseases. For example in patient1 I will add the weights of I21, E102 and C80. In partient2 I will add the weights of B20 C77 G81, and etc. I used the following codes to create the three columns and then add across the new three columns.

If disease1 = 'I21' then do;
weight1 = 1;
end;
else if disease1 = 'B20' then do;
weight1 = 6;
end;
else if disease2 = 'I71' then do;
weight2 = 1;
end;
else if disease2 = 'E102' then do;
weight2 = 2;
end;
else if disease2 = 'C77' then do;
weight2 = 3;
end;
else if disease2 = 'K27' then do;
weight2 = 1;
end;
else if disease3 = 'C80' then do;
weight3 = 3;
end;
else if disease3 = 'G81' then do;
weight3 = 2;
end;
else if disease3 = 'G450' then do;
weight3 = 1;
end;
RUN;
WEIGHT = weight1 + weight + weight3;
run;

Finally the data will be of the following format

Patient Disease1 disease2 disease3 weight1 weight2 weight3 weight
1 I21 E102 C80 1 2 3 6
2 B20 C77 G81 6 3 2 11
3 B20 K27 G450 6 1 1 8
run;
I was wondering if there is any other way of creating the new columns beside using the procedure that I have used. The problem with the procedure that I have used is that it can be a lot of work if patients have many diseases involved. Can someone help me.
3 REPLIES 3
Flip
Fluorite | Level 6
Several.
1. Create a format then weight1= input(put(disease1, $newfmt.), best.);
2. Create a lookup table with disease, and weight then do 3 set statements with a "KEY".
3. Create a lookup table then with SQL create columns as a.weight as weight1, b.weight as weight2, c.weight as weight3 from olddata o,
left join lookup a on o.disease1 = a.disease
left join lookup b on o.disease2 = b.disease
left join lookup c on o.disease3 = c.disease
data_null__
Jade | Level 19
To me the data structure not optimal. I would expect the patients to have varying number of diseases so a verticle structure is preferred. You can flattin it at any time if need.

An INFORMAT makes a nice lookup table in this situation where you want to read a code and create a numeric value.

[pre]
proc format;
invalue disease(upcase just)
'I21','I71','K27','G450' = 1
'E102','G81' = 2
'C80','C77' = 3
'B20' = 6
;
run;
data disease;
infile cards missover column=c;
input patient @;
do while(1);
_iorc_ = c;
input disease :$4. @;
if missing(disease) then leave;
input +(-(c-_iorc_)) weight :disease. @;
output;
end;
cards;
1 I21 E102 C80
2 B20 C77 G81 E102
3 B20 K27 G450
;;;;
run;
proc print;
run;

proc summary data=disease nway;
class patient;
output out=wide(drop=_:)
sum(weight)=
idgroup(out[4](disease weight)=)
;
run;
proc print;
run;
[/pre]
Ksharp
Super User
Hi. data null;'s code is too difficulty to me .
But I agree with Flip's suggestion.
You can create three format for three disease variables.





Ksharp Message was edited by: Ksharp

sas-innovate-white.png

Missed SAS Innovate in Orlando?

Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.

 

Register now

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1282 views
  • 0 likes
  • 4 in conversation