If I have a dataset like this: One patient can have multiple diagnosis codes, but it is formatted in a way that only one diagnosis code for one row.
| Patient_ID | Diagnosis_code | 
| 1 | 1 | 
| 1 | 2 | 
| 1 | 4 | 
| 2 | 2 | 
| 2 | 3 | 
| 2 | 4 | 
| 3 | 5 | 
| 3 | 6 | 
| 3 | 7 | 
If I want to define a new variable called disease so that if a person's diagnosis codes contain either 2 or 3, then disease=1 otherwise disease=0. The result would be like this:
| Patient_ID | Disease | 
| 1 | 1 | 
| 2 | 1 | 
| 3 | 0 | 
. How should I code it? Thank you.
Thanks @r_behata for posting the data in usable form.
@Boon_Noob Simple by-group processing does the job:
data want;
   set have;
   by Patient_ID;
   
   retain Disease;
   
   if first.Patient_ID then do;
      Disease = 0;
   end;
   
   Disease = Disease or (Diagnosis_Code in ('2', '3'));
   
   if last.Patient_ID then do;
      output;
   end;
   
   drop Diagnosis_Code;
run;data have;
input Patient_ID $	Diagnosis_code $;
infile cards delimiter='09'x;
cards;
1	1
1	2
1	4
2	2
2	3
2	4
3	5
3	6
3	7
run;
proc sql;
create table want
as 
select distinct a.Patient_ID ,
	   case when b.Patient_ID ne ''
	   	  then 1
	   	else 0 end as  Disease
 from have a
 left join (select distinct 
			Patient_ID from have
			where Diagnosis_code in ('2','3')
			) b
on a.Patient_ID=b.Patient_ID;
quit;Thanks @r_behata for posting the data in usable form.
@Boon_Noob Simple by-group processing does the job:
data want;
   set have;
   by Patient_ID;
   
   retain Disease;
   
   if first.Patient_ID then do;
      Disease = 0;
   end;
   
   Disease = Disease or (Diagnosis_Code in ('2', '3'));
   
   if last.Patient_ID then do;
      output;
   end;
   
   drop Diagnosis_Code;
run;A different method of BY processing, using a DO loop to work through a group:
data want;
disease = 0;
do until (last.patient_id);
  set have;
  by patient_id;
  if diagnosis_code in ("2","3") then disease = 1;
end;
keep patient_id disease;
run;data have;
input Patient_ID Diagnosis_code;
cards;
1 1
1 2
1 4
2 2
2 3
2 4
3 5
3 6
3 7
;
proc sql;
create table want as
select Patient_ID,max(Diagnosis_code in (2 3)) as Disease
from have
group by Patient_ID;
quit;It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
