DATA Step, Macro, Functions and more

Use Retain to correct values based on the previous record

Reply
Frequent Contributor
Posts: 101

Use Retain to correct values based on the previous record

I have vertical data recording a person 's in and out of hospital and reason for leaving. I'm trying to fix people who have multiple deaths. The data is vertical: A person (PID) moved from one hospital to another (HospID), and within the same hospital, multiple records to show people's leaving and returning. See below for a sample of 5 cases. Also the actual data is date, not year, and there is no date comflict). 

 

Naturally,
1. a person can only die once (so records for ID 3 & 4 are wrong)
2. if there is subsequent records after death, death should reset as unknown (DK). (so record for #5 is wrong).

 

So ID 1 and 2 are OK. The ^1st record of ID 3^, the ^2nd record of ID 4^ and the ^2nd record of ID 5^ should be set as DK.
I was thinking to use lag function but it doesn't do the work (or can be messy). 
It seems I should use retain to retain value from the previous record (aftert sorting the data and set discharge year missing to be say 2019) and then reset the value as I move from one record to the next. But how do I execute?

 

Thanks in advance for any help you can give me.

 

data have; input PID 1 HospID $ 3 admission_year 5-8 discharge_year 10-13 reason_to_leave $ 15-21;
datalines;
1 A 2012 2013 tempout
1 A 2013 2014 moved
1 B 2015 2016 death
2 A 2012 2013 tempout
2 A 2013 2014 left
2 B 2015
3 A 2012 2013 death
3 A 2013 2014 moved
3 B 2015 2016 death
4 A 2012 2013 moved
4 B 2013 2014 death
4 B 2014 2015 tempout
4 B 2015 2016 death
5 A 2012 2014 tempout
5 A 2015 2016 death
5 B 2013 2014 moved
5 C 2015
;
run;
proc print noobs; run;

 

Super Contributor
Posts: 251

Re: Use Retain to correct values based on the previous record

Well, this does what you are asking for:

data want;
set have;
by pid;
if not last.pid and reason_to_leave = 'death' then
   reason_to_leave = 'DK';
run;

Is it as simple as that?

 

Super User
Posts: 9,662

Re: Use Retain to correct values based on the previous record

Better post the output too.

 

 
data have; input PID 1 HospID $ 3 admission_year 5-8 discharge_year 10-13 reason_to_leave $ 15-21;
datalines;
1 A 2012 2013 tempout
1 A 2013 2014 moved
1 B 2015 2016 death
2 A 2012 2013 tempout
2 A 2013 2014 left
2 B 2015
3 A 2012 2013 death
3 A 2013 2014 moved
3 B 2015 2016 death
4 A 2012 2013 moved
4 B 2013 2014 death
4 B 2014 2015 tempout
4 B 2015 2016 death
5 A 2012 2014 tempout
5 A 2015 2016 death
5 B 2013 2014 moved
5 C 2015
;
run;
proc sort data=have;
 by pid descending discharge_year;
run;
data want;
 set have;
 by pid;
 retain found;
 if first.pid then found=0;
 if reason_to_leave='death' and found then reason_to_leave='DK';
 if reason_to_leave='death' then found=1;
 drop found;
run;
proc sort data=want;
 by pid discharge_year;
run;
PROC Star
Posts: 7,356

Re: Use Retain to correct values based on the previous record

I'd approach the problem as follows:

 

data have;
  input PID 1 HospID $ 3 admission_year 5-8 discharge_year 10-13 reason_to_leave $ 15-21;
  datalines;
1 A 2012 2013 tempout
1 A 2013 2014 moved
1 B 2015 2016 death
2 A 2012 2013 tempout
2 A 2013 2014 left
2 B 2015 .    .
3 A 2012 2013 death
3 A 2013 2014 moved
3 B 2015 2016 death
4 A 2012 2013 moved
4 B 2013 2014 death
4 B 2014 2015 tempout
4 B 2015 2016 death
5 A 2012 2014 tempout
5 A 2015 2016 death
5 B 2013 2014 moved
5 C 2015 .    .
;
run;

proc sort data=have;
  by pid discharge_year;
run;

data want;
  do until (last.pid);
    set have;
    by pid;
    if reason_to_leave eq 'death' then latest_death=discharge_year;
  end;
  do until (last.pid);
    set have;
    by pid;
    if reason_to_leave eq 'death' and discharge_year ne latest_death then reason_to_leave='DK';
    output;
  end;
run;

HTH,

Art, CEO, AnalystFinder.com

 

Valued Guide
Posts: 797

Re: Use Retain to correct values based on the previous record



Solph wrote:

I have vertical data recording a person 's in and out of hospital and reason for leaving. I'm trying to fix people who have multiple deaths.


 

and later wrote


 

Naturally,
1. a person can only die once (so records for ID 3 & 4 are wrong)
2. if there is subsequent records after death, death should reset as unknown (DK). (so record for #5 is wrong).

  


Occurance of multiple deaths are just subsets of rule 2 above.  It seems to me that @LaurieF's response fits the requirements exactly.  I understand the problem to mean that a death record is only acceptable as the final record. All other death records should be flagged.

 

c

x

Super User
Posts: 9,662

Re: Use Retain to correct values based on the previous record

Or use DOW + index skill, something like Arthur did.


data have;
  input PID 1 HospID $ 3 admission_year 5-8 discharge_year 10-13 reason_to_leave $ 15-21;
  datalines;
1 A 2012 2013 tempout
1 A 2013 2014 moved
1 B 2015 2016 death
2 A 2012 2013 tempout
2 A 2013 2014 left
2 B 2015 .    .
3 A 2012 2013 death
3 A 2013 2014 moved
3 B 2015 2016 death
4 A 2012 2013 moved
4 B 2013 2014 death
4 B 2014 2015 tempout
4 B 2015 2016 death
5 A 2012 2014 tempout
5 A 2015 2016 death
5 B 2013 2014 moved
5 C 2015 .    .
;
run;


data want;
  do i=1 by 1 until (last.pid);
    set have;
    by pid;
    if reason_to_leave eq 'death' then n=i;
  end;
  do j=1 by 1 until (last.pid);
    set have;
    by pid;
    if reason_to_leave eq 'death' and j lt n then reason_to_leave='DK';
    output;
  end;
  drop i j n;
run;

Super Contributor
Posts: 251

Re: Use Retain to correct values based on the previous record

But - but - why make it so hard?

Super User
Posts: 9,662

Re: Use Retain to correct values based on the previous record


Maybe OP can clarify more about what he want.

PID=5 is different between these code.

Ask a Question
Discussion stats
  • 7 replies
  • 224 views
  • 1 like
  • 5 in conversation