If I have a dataset like this: One patient can have multiple diagnosis codes, but it is formatted in a way that only one diagnosis code for one row.
Patient_ID | Diagnosis_code |
1 | 1 |
1 | 2 |
1 | 4 |
2 | 2 |
2 | 3 |
2 | 4 |
3 | 5 |
3 | 6 |
3 | 7 |
If I want to define a new variable called disease so that if a person's diagnosis codes contain either 2 or 3, then disease=1 otherwise disease=0. The result would be like this:
Patient_ID | Disease |
1 | 1 |
2 | 1 |
3 | 0 |
. How should I code it? Thank you.
Thanks @r_behata for posting the data in usable form.
@Boon_Noob Simple by-group processing does the job:
data want;
set have;
by Patient_ID;
retain Disease;
if first.Patient_ID then do;
Disease = 0;
end;
Disease = Disease or (Diagnosis_Code in ('2', '3'));
if last.Patient_ID then do;
output;
end;
drop Diagnosis_Code;
run;
data have;
input Patient_ID $ Diagnosis_code $;
infile cards delimiter='09'x;
cards;
1 1
1 2
1 4
2 2
2 3
2 4
3 5
3 6
3 7
run;
proc sql;
create table want
as
select distinct a.Patient_ID ,
case when b.Patient_ID ne ''
then 1
else 0 end as Disease
from have a
left join (select distinct
Patient_ID from have
where Diagnosis_code in ('2','3')
) b
on a.Patient_ID=b.Patient_ID;
quit;
Thanks @r_behata for posting the data in usable form.
@Boon_Noob Simple by-group processing does the job:
data want;
set have;
by Patient_ID;
retain Disease;
if first.Patient_ID then do;
Disease = 0;
end;
Disease = Disease or (Diagnosis_Code in ('2', '3'));
if last.Patient_ID then do;
output;
end;
drop Diagnosis_Code;
run;
A different method of BY processing, using a DO loop to work through a group:
data want;
disease = 0;
do until (last.patient_id);
set have;
by patient_id;
if diagnosis_code in ("2","3") then disease = 1;
end;
keep patient_id disease;
run;
data have;
input Patient_ID Diagnosis_code;
cards;
1 1
1 2
1 4
2 2
2 3
2 4
3 5
3 6
3 7
;
proc sql;
create table want as
select Patient_ID,max(Diagnosis_code in (2 3)) as Disease
from have
group by Patient_ID;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.