DATA Step, Macro, Functions and more

Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple entrie

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 9
Accepted Solution

Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple entrie

Hey there,

 

So I have a large patient dataset where patients often have multiple entries under the same patient ID. Each entry has a column for patient ID, year of admission, whether they are a smoker (0 = no, 1 = yes), and whether they were admitted to deliver a child (0=no, 1 =yes) so the data looks like this

 

PatientYear admittedSmoking FlagDelivery Flag
Patient x200200
Patient x200310
Patient x200401
Patient x200601
Patient x200811
Patient y200401
Patient z200110
Patient z200101

 

How can I return a list of patients who smoked the year of or the year before delivering a child (as well as the year that it occured).


Accepted Solutions
Solution
‎06-16-2018 10:23 PM
PROC Star
Posts: 1,799

Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

data have;
input Patient& $20.	Yearadmitted	SmokingFlag	DeliveryFlag ;
cards;
Patient x	2002	0	0
Patient x	2003	1	0
Patient x	2004	0	1
Patient x	2006	0	1
Patient x	2008	1	1
Patient y	2004	0	1
Patient z	2001	1	0
Patient z	2001	0	1
;
data want;
set have;
by patient;
retain t;
if first.patient then call missing(t);
if 	DeliveryFlag and t then if Yearadmitted-t in (0,1) then output;
if 	SmokingFlag=1 then t=Yearadmitted;
run;

I am not sure why you didn't include 2008 in your expected output though

View solution in original post


All Replies
Super User
Posts: 23,704

Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

Are these actually years or full dates? 

 

Assuming your data is sorted correctly:

 

data want; 
set have;

by patient year_admitted;

prev_smoke = lag(smoke);
prev_year = lag(year);

if first.patient then call missing(prev_smoke, prev_year);
smoke_prev_year = .;

if delivery_flag=1  /*if delivered*/
    and year-prev_year=1  /*if previous record is the previous year*/
     and (smoking_flag=1 or prev_smoke=1) /*smoked in current or previous year*/ 
           then smoke_prev_year = 1;
else if delivery_flag=1 then smoke_prev_year=0; /*only set flag if delivered*/


run;

@jsimmo02 wrote:

Hey there,

 

So I have a large patient dataset where patients often have multiple entries under the same patient ID. Each entry has a column for patient ID, year of admission, whether they are a smoker (0 = no, 1 = yes), and whether they were admitted to deliver a child (0=no, 1 =yes) so the data looks like this

 

Patient Year admitted Smoking Flag Delivery Flag
Patient x 2002 0 0
Patient x 2003 1 0
Patient x 2004 0 1
Patient x 2006 0 1
Patient x 2008 1 1
Patient y 2004 0 1
Patient z 2001 1 0
Patient z 2001 0 1

 

How can I return a list of patients who smoked the year of or the year before delivering a child (as well as the year that it occured).


 

Super User
Posts: 6,765

Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

A couple of questions ...

 

First, there are some years that don't appear.  For example, Patient X has no data for 2005.  Can we assume that smoker=no for those years?  (If not, how do we determine smoking status?)

 

Clearly, you want to flag this combination:  delivery=yes, smoker=yes, and smoker=yes for the previous year.

 

Did you also want to flag:  delivery=yes, smoker=no, but smoker=yes for the previous year?

 

Did you also want to flag:  delivery=yes, smoker=yes, but smoker=no for the previous year?

Occasional Contributor
Posts: 9

Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

[ Edited ]
Posted in reply to Astounding

First, there are some years that don't appear.  For example, Patient X has no data for 2005.  Can we assume that smoker=no for those years?  (If not, how do we determine smoking status?) Yes, you can assume smoker = no for those years.

 

"Clearly, you want to flag this combination:  delivery=yes, smoker=yes, and smoker=yes for the previous year."

 

I am hoping for delivery = yes and (smoker=yes OR smoker for previous year = yes)

 

Did you also want to flag:  delivery=yes, smoker=no, but smoker=yes for the previous year? YES

 

Did you also want to flag:  delivery=yes, smoker=yes, but smoker=no for the previous year? YES

PROC Star
Posts: 1,799

Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

@jsimmo02  Can you please post a sample of your expect output for your input sample? Thank you

Occasional Contributor
Posts: 9

Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

Posted in reply to novinosrin

The hopeful output would be

PatientAllconditionsmetYear
Patient xY2004
Patient zY2001

 

but if I can just create a flag for patients who meet all the criteria, I would be happy. Having the year return would be a bonus

Solution
‎06-16-2018 10:23 PM
PROC Star
Posts: 1,799

Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

data have;
input Patient& $20.	Yearadmitted	SmokingFlag	DeliveryFlag ;
cards;
Patient x	2002	0	0
Patient x	2003	1	0
Patient x	2004	0	1
Patient x	2006	0	1
Patient x	2008	1	1
Patient y	2004	0	1
Patient z	2001	1	0
Patient z	2001	0	1
;
data want;
set have;
by patient;
retain t;
if first.patient then call missing(t);
if 	DeliveryFlag and t then if Yearadmitted-t in (0,1) then output;
if 	SmokingFlag=1 then t=Yearadmitted;
run;

I am not sure why you didn't include 2008 in your expected output though

Occasional Contributor
Posts: 9

Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

Posted in reply to novinosrin

i should have

Occasional Contributor
Posts: 9

Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

Posted in reply to novinosrin

Yes! YES! Nailed it

Respected Advisor
Posts: 4,736

Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

@jsimmo02

Posting below even though things have moved on while I've been working out the code. 

It's just another coding option for what you've got already.

data have;
  infile datalines dlm='09'x truncover;
  input Patient $	Year_admitted	Smoking_Flag	Delivery_Flag;
datalines;
x	2002	0	0
x	2003	1	0
x	2004	0	1
x	2006	0	1
x	2008	1	1
y	2004	0	1
z	2001	1	0
z	2001	0	1
;
run;

data want;
  set have;
  by Patient notsorted Year_admitted;

  _r_sflg=lag(Smoking_Flag);

  if first.patient then return;
  
  /* also include delivery year for smoking flag */
  AllConditionsMet=  ( Delivery_Flag=1 and (Smoking_Flag=1 or _r_sflg=1) );

  if AllConditionsMet=1 then output;
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 147 views
  • 0 likes
  • 5 in conversation