BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Boon_Noob
Calcite | Level 5

If I have a dataset like this: One patient can have multiple diagnosis codes, but it is formatted in a way that only one diagnosis code for one row.

Patient_IDDiagnosis_code
11
12
14
22
23
24
35
36
37

If I want to define a new variable called disease so that if a person's diagnosis codes contain either 2 or 3, then disease=1 otherwise disease=0. The result would be like this:

Patient_IDDisease
11
21
30

. How should I code it? Thank you. 

1 ACCEPTED SOLUTION

Accepted Solutions
andreas_lds
Jade | Level 19

Thanks @r_behata for posting the data in usable form.

 

@Boon_Noob Simple by-group processing does the job:

data want;
   set have;
   by Patient_ID;
   
   retain Disease;
   
   if first.Patient_ID then do;
      Disease = 0;
   end;
   
   Disease = Disease or (Diagnosis_Code in ('2', '3'));
   
   if last.Patient_ID then do;
      output;
   end;
   
   drop Diagnosis_Code;
run;

View solution in original post

4 REPLIES 4
r_behata
Barite | Level 11
data have;
input Patient_ID $	Diagnosis_code $;
infile cards delimiter='09'x;
cards;
1	1
1	2
1	4
2	2
2	3
2	4
3	5
3	6
3	7
run;


proc sql;
create table want
as 
select distinct a.Patient_ID ,
	   case when b.Patient_ID ne ''
	   	  then 1
	   	else 0 end as  Disease
 from have a
 left join (select distinct 
			Patient_ID from have
			where Diagnosis_code in ('2','3')
			) b
on a.Patient_ID=b.Patient_ID;

quit;
andreas_lds
Jade | Level 19

Thanks @r_behata for posting the data in usable form.

 

@Boon_Noob Simple by-group processing does the job:

data want;
   set have;
   by Patient_ID;
   
   retain Disease;
   
   if first.Patient_ID then do;
      Disease = 0;
   end;
   
   Disease = Disease or (Diagnosis_Code in ('2', '3'));
   
   if last.Patient_ID then do;
      output;
   end;
   
   drop Diagnosis_Code;
run;
Kurt_Bremser
Super User

A different method of BY processing, using a DO loop to work through a group:

data want;
disease = 0;
do until (last.patient_id);
  set have;
  by patient_id;
  if diagnosis_code in ("2","3") then disease = 1;
end;
keep patient_id disease;
run;
Ksharp
Super User
data have;
input Patient_ID Diagnosis_code;
cards;
1 1
1 2
1 4
2 2
2 3
2 4
3 5
3 6
3 7
;

proc sql;
create table want as
select Patient_ID,max(Diagnosis_code in (2 3)) as Disease
from have
group by Patient_ID;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1437 views
  • 5 likes
  • 5 in conversation