BookmarkSubscribeRSS Feed
Statsconsultancy
Fluorite | Level 6
I have data in the following format: It consists of patients and the numbers of diseases each patient experienced.

patient disease1 disease2 disease3;
1 I21 E102 C80
2 B20 C77 G81
3 B20 K27 G450
run;

I then assign weight to each disease using the following codes; The intention is to come up with total burden of the diseases. For example in patient1 I will add the weights of I21, E102 and C80. In partient2 I will add the weights of B20 C77 G81, and etc. I used the following codes to create the three columns and then add across the new three columns.

If disease1 = 'I21' then do;
weight1 = 1;
end;
else if disease1 = 'B20' then do;
weight1 = 6;
end;
else if disease2 = 'I71' then do;
weight2 = 1;
end;
else if disease2 = 'E102' then do;
weight2 = 2;
end;
else if disease2 = 'C77' then do;
weight2 = 3;
end;
else if disease2 = 'K27' then do;
weight2 = 1;
end;
else if disease3 = 'C80' then do;
weight3 = 3;
end;
else if disease3 = 'G81' then do;
weight3 = 2;
end;
else if disease3 = 'G450' then do;
weight3 = 1;
end;
RUN;
WEIGHT = weight1 + weight + weight3;
run;

Finally the data will be of the following format

Patient Disease1 disease2 disease3 weight1 weight2 weight3 weight
1 I21 E102 C80 1 2 3 6
2 B20 C77 G81 6 3 2 11
3 B20 K27 G450 6 1 1 8
run;
I was wondering if there is any other way of creating the new columns beside using the procedure that I have used. The problem with the procedure that I have used is that it can be a lot of work if patients have many diseases involved. Can someone help me.
3 REPLIES 3
Flip
Fluorite | Level 6
Several.
1. Create a format then weight1= input(put(disease1, $newfmt.), best.);
2. Create a lookup table with disease, and weight then do 3 set statements with a "KEY".
3. Create a lookup table then with SQL create columns as a.weight as weight1, b.weight as weight2, c.weight as weight3 from olddata o,
left join lookup a on o.disease1 = a.disease
left join lookup b on o.disease2 = b.disease
left join lookup c on o.disease3 = c.disease
data_null__
Jade | Level 19
To me the data structure not optimal. I would expect the patients to have varying number of diseases so a verticle structure is preferred. You can flattin it at any time if need.

An INFORMAT makes a nice lookup table in this situation where you want to read a code and create a numeric value.

[pre]
proc format;
invalue disease(upcase just)
'I21','I71','K27','G450' = 1
'E102','G81' = 2
'C80','C77' = 3
'B20' = 6
;
run;
data disease;
infile cards missover column=c;
input patient @;
do while(1);
_iorc_ = c;
input disease :$4. @;
if missing(disease) then leave;
input +(-(c-_iorc_)) weight :disease. @;
output;
end;
cards;
1 I21 E102 C80
2 B20 C77 G81 E102
3 B20 K27 G450
;;;;
run;
proc print;
run;

proc summary data=disease nway;
class patient;
output out=wide(drop=_:)
sum(weight)=
idgroup(out[4](disease weight)=)
;
run;
proc print;
run;
[/pre]
Ksharp
Super User
Hi. data null;'s code is too difficulty to me .
But I agree with Flip's suggestion.
You can create three format for three disease variables.





Ksharp Message was edited by: Ksharp

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 752 views
  • 0 likes
  • 4 in conversation