Dear Experts,
I need a patient count for the condition based on the cancer occurrence. I'll share the sample dataset and the expected result for your reference.
data have;
input ID Date mmddyy10. Inf Cancer;
cards;
123 05/05/2000 1 0
123 08/07/2001 0 1
123 06/07/2002 1 0
159 01/03/2001 1 1
159 02/08/2002 0 1
618 07/07/2005 0 0
618 05/03/2006 1 0
789 06/06/2000 1 0
789 04/02/2001 0 1
789 03/03/2002 1 0
789 03/03/2002 0 0
;
run;
I need 2 different outputs, based on 2 conditions, to get the patient IDs.
Kindly suggest code that produces the list of patient IDs as the final result.
Note: Only the first occurrence of Cancer matters, not any later occurrence.
This code gets your intended result; check whether it also fits more complicated input data:
data have;
input ID Date mmddyy10. Inf Cancer;
format date MMDDYY10.;
cards;
123 05/05/2000 1 0
123 08/07/2001 0 1
123 06/07/2002 1 0
159 01/03/2001 1 1
159 02/08/2002 0 1
618 07/07/2005 0 0
618 05/03/2006 1 0
789 06/06/2000 1 0
789 04/02/2001 0 1
789 03/03/2002 1 0
789 03/03/2002 0 0
;
proc sql;
/* IDs with an infection on or after the first cancer date */
create table after as
  select distinct a.id
  from
    have (where=(inf = 1)) a,
    (select id, min(date) as date from have (where=(cancer = 1)) group by id) b
  where a.id = b.id and a.date >= b.date;
/* IDs with an infection before the first cancer date */
create table before as
  select distinct a.id
  from
    have (where=(inf = 1)) a,
    (select id, min(date) as date from have (where=(cancer = 1)) group by id) b
  where a.id = b.id and a.date < b.date;
quit;
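Since the original question asks for a patient count, here is a minimal follow-up sketch. It assumes the tables after and before were created by the step above; the column names n_after and n_before are only illustrative.

```sas
proc sql;
  /* each table already holds one row per ID (select distinct),
     so count(*) is the number of patients */
  select count(*) as n_after  from after;
  select count(*) as n_before from before;
quit;
```

With the sample data this should report 3 patients in after (123, 159, 789) and 2 in before (123, 789).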
data have;
input ID Date mmddyy10. Inf Cancer;
format date mmddyy10.;
cards;
123 05/05/2000 1 0
123 08/07/2001 0 1
123 06/07/2002 1 0
159 01/03/2001 1 1
159 02/08/2002 0 1
618 07/07/2005 0 0
618 05/03/2006 1 0
789 06/06/2000 1 0
789 04/02/2001 0 1
789 03/03/2002 1 0
789 03/03/2002 0 0
;
proc sql;
/*Report 1 On/After*/
create table report1 as
select *
from have
group by id
having max(cancer=1);
/*Report 2 Before*/
create table report2 as
select *
from have
group by id
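/* the "is not missing" test keeps out IDs with no infection, since a missing value compares lower than any date */
having min(ifn(inf=1,date,.)) is not missing
   and min(ifn(inf=1,date,.)) < min(ifn(cancer=1,date,.)) ;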
quit;
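To see what the two HAVING clauses compare, it may help to print the per-ID summary values. This is only an illustrative sketch; the column names has_cancer, first_inf and first_cancer are made up for the example. In SAS a comparison such as cancer=1 evaluates to 1 or 0, so max(cancer=1) is 1 when the ID has at least one cancer record, and ifn(...,date,.) turns non-matching rows into missing values, which min() ignores.

```sas
proc sql;
  select id,
         max(cancer=1)             as has_cancer,
         min(ifn(inf=1,date,.))    as first_inf    format=mmddyy10.,
         min(ifn(cancer=1,date,.)) as first_cancer format=mmddyy10.
  from have
  group by id;
quit;
```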
Since your data appears to be sorted by ID and Date, a DATA step seems a lot easier, IMHO:
data have;
input ID Date mmddyy10. Inf Cancer;
format date mmddyy10.;
cards;
123 05/05/2000 1 0
123 08/07/2001 0 1
123 06/07/2002 1 0
159 01/03/2001 1 1
159 02/08/2002 0 1
618 07/07/2005 0 0
618 05/03/2006 1 0
789 06/06/2000 1 0
789 04/02/2001 0 1
789 03/03/2002 1 0
789 03/03/2002 0 0
;
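data report1 report2;
  /* First pass over the ID: capture the first cancer date (_d1)
     and the first infection date (_d) */
  do _n_=1 by 1 until(last.id);
    set have;
    by id;
    if nmiss(_d,_d1)=0 then continue;
    if not _d1 and Cancer=1 then _d1=date;
    if not _d and inf=1 then _d=date;
  end;
  /* Second pass: re-read the same rows and route them to the reports */
  do _n_=1 to _n_;
    set have;
    if _d1 then output report1;
    /* guard against missing _d: a missing value compares lower than any date */
    if not missing(_d) and _d<_d1 then output report2;
  end;
  drop _:;
run;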
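Both report1 and report2 above contain every row for the qualifying IDs. If only the list of patient IDs is needed, one possible follow-up is sketched here. It assumes those two datasets exist; ids1 and ids2 are hypothetical output names.

```sas
/* keep one row per ID from each report */
proc sort data=report1(keep=id) out=ids1 nodupkey;
  by id;
run;

proc sort data=report2(keep=id) out=ids2 nodupkey;
  by id;
run;
```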
Thank you for always stepping in to help when I need you most, @novinosrin @Kurt_Bremser. It really means a lot.