## Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple entrie

Solved
Occasional Contributor
Posts: 9

# Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple entrie

Hey there,

So I have a large patient dataset where patients often have multiple entries under the same patient ID. Each entry has a column for patient ID, year of admission, whether they are a smoker (0 = no, 1 = yes), and whether they were admitted to deliver a child (0=no, 1 =yes) so the data looks like this

 Patient Year admitted Smoking Flag Delivery Flag Patient x 2002 0 0 Patient x 2003 1 0 Patient x 2004 0 1 Patient x 2006 0 1 Patient x 2008 1 1 Patient y 2004 0 1 Patient z 2001 1 0 Patient z 2001 0 1

How can I return a list of patients who smoked the year of or the year before delivering a child (as well as the year that it occured).

Accepted Solutions
Solution
‎06-16-2018 10:23 PM
PROC Star
Posts: 1,799

## Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

``````data have;
input Patient& \$20.	Yearadmitted	SmokingFlag	DeliveryFlag ;
cards;
Patient x	2002	0	0
Patient x	2003	1	0
Patient x	2004	0	1
Patient x	2006	0	1
Patient x	2008	1	1
Patient y	2004	0	1
Patient z	2001	1	0
Patient z	2001	0	1
;
data want;
set have;
by patient;
retain t;
if first.patient then call missing(t);
if 	DeliveryFlag and t then if Yearadmitted-t in (0,1) then output;
run;``````

I am not sure why you didn't include 2008 in your expected output though

All Replies
Super User
Posts: 23,704

## Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

Are these actually years or full dates?

Assuming your data is sorted correctly:

``````data want;
set have;

prev_smoke = lag(smoke);
prev_year = lag(year);

if first.patient then call missing(prev_smoke, prev_year);
smoke_prev_year = .;

if delivery_flag=1  /*if delivered*/
and year-prev_year=1  /*if previous record is the previous year*/
and (smoking_flag=1 or prev_smoke=1) /*smoked in current or previous year*/
then smoke_prev_year = 1;
else if delivery_flag=1 then smoke_prev_year=0; /*only set flag if delivered*/

run;``````

@jsimmo02 wrote:

Hey there,

So I have a large patient dataset where patients often have multiple entries under the same patient ID. Each entry has a column for patient ID, year of admission, whether they are a smoker (0 = no, 1 = yes), and whether they were admitted to deliver a child (0=no, 1 =yes) so the data looks like this

 Patient Year admitted Smoking Flag Delivery Flag Patient x 2002 0 0 Patient x 2003 1 0 Patient x 2004 0 1 Patient x 2006 0 1 Patient x 2008 1 1 Patient y 2004 0 1 Patient z 2001 1 0 Patient z 2001 0 1

How can I return a list of patients who smoked the year of or the year before delivering a child (as well as the year that it occured).

Super User
Posts: 6,765

## Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

A couple of questions ...

First, there are some years that don't appear.  For example, Patient X has no data for 2005.  Can we assume that smoker=no for those years?  (If not, how do we determine smoking status?)

Clearly, you want to flag this combination:  delivery=yes, smoker=yes, and smoker=yes for the previous year.

Did you also want to flag:  delivery=yes, smoker=no, but smoker=yes for the previous year?

Did you also want to flag:  delivery=yes, smoker=yes, but smoker=no for the previous year?

Occasional Contributor
Posts: 9

## Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

[ Edited ]

First, there are some years that don't appear.  For example, Patient X has no data for 2005.  Can we assume that smoker=no for those years?  (If not, how do we determine smoking status?) Yes, you can assume smoker = no for those years.

"Clearly, you want to flag this combination:  delivery=yes, smoker=yes, and smoker=yes for the previous year."

I am hoping for delivery = yes and (smoker=yes OR smoker for previous year = yes)

Did you also want to flag:  delivery=yes, smoker=no, but smoker=yes for the previous year? YES

Did you also want to flag:  delivery=yes, smoker=yes, but smoker=no for the previous year? YES

PROC Star
Posts: 1,799

## Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

@jsimmo02  Can you please post a sample of your expect output for your input sample? Thank you

Occasional Contributor
Posts: 9

## Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

The hopeful output would be

 Patient Allconditionsmet Year Patient x Y 2004 Patient z Y 2001

but if I can just create a flag for patients who meet all the criteria, I would be happy. Having the year return would be a bonus

Solution
‎06-16-2018 10:23 PM
PROC Star
Posts: 1,799

## Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

``````data have;
input Patient& \$20.	Yearadmitted	SmokingFlag	DeliveryFlag ;
cards;
Patient x	2002	0	0
Patient x	2003	1	0
Patient x	2004	0	1
Patient x	2006	0	1
Patient x	2008	1	1
Patient y	2004	0	1
Patient z	2001	1	0
Patient z	2001	0	1
;
data want;
set have;
by patient;
retain t;
if first.patient then call missing(t);
if 	DeliveryFlag and t then if Yearadmitted-t in (0,1) then output;
run;``````

I am not sure why you didn't include 2008 in your expected output though

Occasional Contributor
Posts: 9

## Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

i should have

Occasional Contributor
Posts: 9

## Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

Yes! YES! Nailed it

Posts: 4,736

## Re: Find patient a diagnosis within a year of delivering a child in dataset where pt has multiple en

@jsimmo02

Posting below even though things have moved on while I've been working out the code.

It's just another coding option for what you've got already.

``````data have;
infile datalines dlm='09'x truncover;
input Patient \$	Year_admitted	Smoking_Flag	Delivery_Flag;
datalines;
x	2002	0	0
x	2003	1	0
x	2004	0	1
x	2006	0	1
x	2008	1	1
y	2004	0	1
z	2001	1	0
z	2001	0	1
;
run;

data want;
set have;

_r_sflg=lag(Smoking_Flag);

if first.patient then return;

/* also include delivery year for smoking flag */
AllConditionsMet=  ( Delivery_Flag=1 and (Smoking_Flag=1 or _r_sflg=1) );

if AllConditionsMet=1 then output;
run;``````
☑ This topic is solved.