Solved: Re: Patient_count based on cancer occurrence column

Sathish_jammy · Posted 02-04-2020 07:07 AM

Dear Experts,

I need a patient count for the condition based on the cancer occurrence. I'll share the sample dataset and the expected result for your reference.

data have;
input ID Date mmddyy10. Inf Cancer;
cards;
123 05/05/2000 1 0
123 08/07/2001 0 1
123 06/07/2002 1 0
159 01/03/2001 1 1
159 02/08/2002 0 1
618 07/07/2005 0 0
618 05/03/2006 1 0
789 06/06/2000 1 0
789 04/02/2001 0 1
789 03/03/2002 1 0
789 03/03/2002 0 0
run;

I need two different outputs, based on two conditions, to get the patient IDs.

Patients who are infected (Inf=1) on or after the date of cancer (Cancer=1). Expected output: IDs 123, 159, 789.
Patients who are infected (Inf=1) before cancer (Cancer=1). Expected output: IDs 123, 789.

Kindly suggest code that gives the list of patient IDs as the final result.

Note: Only the first occurrence of cancer matters, not any later occurrences.

Kurt_Bremser · Posted 02-04-2020 07:24 AM

This code gets your intended result; see if it also fits more complicated input data:

data have;
input ID Date mmddyy10. Inf Cancer;
format date MMDDYY10.;
cards;
123 05/05/2000 1 0
123 08/07/2001 0 1
123 06/07/2002 1 0
159 01/03/2001 1 1
159 02/08/2002 0 1
618 07/07/2005 0 0
618 05/03/2006 1 0
789 06/06/2000 1 0
789 04/02/2001 0 1
789 03/03/2002 1 0
789 03/03/2002 0 0
;

proc sql;
create table after as
select distinct a.id
from
  have (where=(inf = 1)) a,
  (select id, min(date) as date from have (where=(cancer = 1)) group by id) b
where a.id = b.id and a.date >= b.date;
create table before as
select distinct a.id
from
  have (where=(inf = 1)) a,
  (select id, min(date) as date from have (where=(cancer = 1)) group by id) b
where a.id = b.id and a.date < b.date;
quit;
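
If an actual patient count per condition is wanted rather than the ID lists, a minimal sketch along the same lines might look like the following. The table name first_cancer and the column names first_cancer_date, n_on_or_after and n_before are illustrative assumptions, not names from the thread; the sample data is repeated so the step runs on its own.

```sas
/* Sample data from the question; the colon modifier reads Date with list input */
data have;
  input ID Date :mmddyy10. Inf Cancer;
  format Date mmddyy10.;
  datalines;
123 05/05/2000 1 0
123 08/07/2001 0 1
123 06/07/2002 1 0
159 01/03/2001 1 1
159 02/08/2002 0 1
618 07/07/2005 0 0
618 05/03/2006 1 0
789 06/06/2000 1 0
789 04/02/2001 0 1
789 03/03/2002 1 0
789 03/03/2002 0 0
;

proc sql;
  /* First cancer date per patient (hypothetical helper table) */
  create table first_cancer as
  select id, min(date) as first_cancer_date format=mmddyy10.
  from have
  where cancer = 1
  group by id;

  /* Count distinct patients per condition; a CASE without ELSE yields
     a missing value, which COUNT(DISTINCT ...) does not count */
  select
    count(distinct case when h.date >= c.first_cancer_date then h.id end) as n_on_or_after,
    count(distinct case when h.date <  c.first_cancer_date then h.id end) as n_before
  from have as h inner join first_cancer as c
    on h.id = c.id
  where h.inf = 1;
quit;
```

The inner join drops patients who never have Cancer=1, so they are counted in neither column, which mirrors the join condition a.id = b.id in the queries above.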

novinosrin · Posted 02-04-2020 07:24 AM


data have;
input ID Date mmddyy10. Inf Cancer;
format date mmddyy10.;
cards;
123 05/05/2000 1 0
123 08/07/2001 0 1
123 06/07/2002 1 0
159 01/03/2001 1 1
159 02/08/2002 0 1
618 07/07/2005 0 0
618 05/03/2006 1 0
789 06/06/2000 1 0
789 04/02/2001 0 1
789 03/03/2002 1 0
789 03/03/2002 0 0
;
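proc sql;
/*Report 1 On/After: ID has cancer and an infection on/after the first cancer date*/
create table report1 as
select *
from have
group by id
having max(cancer=1)
   and max(ifn(inf=1,date,.)) >= min(ifn(cancer=1,date,.));
/*Report 2 Before: ID has an infection before the first cancer date*/
/*max(inf=1) guards against a missing infection date comparing as "less than"*/
create table report2 as
select *
from have
group by id
having max(inf=1)
   and min(ifn(inf=1,date,.)) < min(ifn(cancer=1,date,.)) ;
quit;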

novinosrin · Posted 02-04-2020 07:53 AM

Since your data appears to be sorted by ID and Date, a DATA step seems a lot easier, IMHO:

data have;
input ID Date mmddyy10. Inf Cancer;
format date mmddyy10.;
cards;
123 05/05/2000 1 0
123 08/07/2001 0 1
123 06/07/2002 1 0
159 01/03/2001 1 1
159 02/08/2002 0 1
618 07/07/2005 0 0
618 05/03/2006 1 0
789 06/06/2000 1 0
789 04/02/2001 0 1
789 03/03/2002 1 0
789 03/03/2002 0 0
;
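data report1 report2;
 /* Pass 1: per ID, find the first cancer date (_d1) and the first (_d) and last (_d2) infection dates */
 do _n_=1 by 1 until(last.id);
  set have;
  by id;
  if Cancer=1 then _d1=min(_d1,date);
  if inf=1 then do;
   _d=min(_d,date);
   _d2=max(_d2,date);
  end;
 end;
 /* Pass 2: re-read the same _n_ rows of this ID and route them to the reports */
 do _n_=1 to _n_;
  set have;
  /* missing-value guards: a missing date sorts lower than any real date */
  if _d1>. and _d2>=_d1 then output report1;
  if .<_d<_d1 then output report2;
 end;
 drop _:;
run;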

Sathish_jammy · Posted 02-04-2020 08:59 AM

Thank you for always stepping in to help when I need you most, @novinosrin and @Kurt_Bremser. It really means a lot.
