BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Boon_Noob
Calcite | Level 5

If I have a dataset like this: One patient can have multiple diagnosis codes, but it is formatted in a way that only one diagnosis code for one row.

Patient_IDDiagnosis_code
11
12
14
22
23
24
35
36
37

If I want to define a new variable called disease so that if a person's diagnosis codes contain either 2 or 3, then disease=1 otherwise disease=0. The result would be like this:

Patient_IDDisease
11
21
30

. How should I code it? Thank you. 

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Thanks @r_behata for posting the data in usable form.

 

@Boon_Noob Simple by-group processing does the job:

data want;
   set have;
   by Patient_ID;
   
   retain Disease;
   
   if first.Patient_ID then do;
      Disease = 0;
   end;
   
   Disease = Disease or (Diagnosis_Code in ('2', '3'));
   
   if last.Patient_ID then do;
      output;
   end;
   
   drop Diagnosis_Code;
run;

View solution in original post

4 REPLIES 4
r_behata
Barite | Level 11
data have;
input Patient_ID $	Diagnosis_code $;
infile cards delimiter='09'x;
cards;
1	1
1	2
1	4
2	2
2	3
2	4
3	5
3	6
3	7
run;


proc sql;
create table want
as 
select distinct a.Patient_ID ,
	   case when b.Patient_ID ne ''
	   	  then 1
	   	else 0 end as  Disease
 from have a
 left join (select distinct 
			Patient_ID from have
			where Diagnosis_code in ('2','3')
			) b
on a.Patient_ID=b.Patient_ID;

quit;
andreas_lds
Jade | Level 19

Thanks @r_behata for posting the data in usable form.

 

@Boon_Noob Simple by-group processing does the job:

data want;
   set have;
   by Patient_ID;
   
   retain Disease;
   
   if first.Patient_ID then do;
      Disease = 0;
   end;
   
   Disease = Disease or (Diagnosis_Code in ('2', '3'));
   
   if last.Patient_ID then do;
      output;
   end;
   
   drop Diagnosis_Code;
run;
Kurt_Bremser
Super User

A different method of BY processing, using a DO loop to work through a group:

data want;
disease = 0;
do until (last.patient_id);
  set have;
  by patient_id;
  if diagnosis_code in ("2","3") then disease = 1;
end;
keep patient_id disease;
run;
Ksharp
Super User
data have;
input Patient_ID Diagnosis_code;
cards;
1 1
1 2
1 4
2 2
2 3
2 4
3 5
3 6
3 7
;

proc sql;
create table want as
select Patient_ID,max(Diagnosis_code in (2 3)) as Disease
from have
group by Patient_ID;
quit;

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 2221 views
  • 5 likes
  • 5 in conversation