BookmarkSubscribeRSS Feed
Solph
Pyrite | Level 9

I have vertical data recording a person 's in and out of hospital and reason for leaving. I'm trying to fix people who have multiple deaths. The data is vertical: A person (PID) moved from one hospital to another (HospID), and within the same hospital, multiple records to show people's leaving and returning. See below for a sample of 5 cases. Also the actual data is date, not year, and there is no date comflict). 

 

Naturally,
1. a person can only die once (so records for ID 3 & 4 are wrong)
2. if there is subsequent records after death, death should reset as unknown (DK). (so record for #5 is wrong).

 

So ID 1 and 2 are OK. The ^1st record of ID 3^, the ^2nd record of ID 4^ and the ^2nd record of ID 5^ should be set as DK.
I was thinking to use lag function but it doesn't do the work (or can be messy). 
It seems I should use retain to retain value from the previous record (aftert sorting the data and set discharge year missing to be say 2019) and then reset the value as I move from one record to the next. But how do I execute?

 

Thanks in advance for any help you can give me.

 

data have; input PID 1 HospID $ 3 admission_year 5-8 discharge_year 10-13 reason_to_leave $ 15-21;
datalines;
1 A 2012 2013 tempout
1 A 2013 2014 moved
1 B 2015 2016 death
2 A 2012 2013 tempout
2 A 2013 2014 left
2 B 2015
3 A 2012 2013 death
3 A 2013 2014 moved
3 B 2015 2016 death
4 A 2012 2013 moved
4 B 2013 2014 death
4 B 2014 2015 tempout
4 B 2015 2016 death
5 A 2012 2014 tempout
5 A 2015 2016 death
5 B 2013 2014 moved
5 C 2015
;
run;
proc print noobs; run;

 

7 REPLIES 7
LaurieF
Barite | Level 11

Well, this does what you are asking for:

data want;
set have;
by pid;
if not last.pid and reason_to_leave = 'death' then
   reason_to_leave = 'DK';
run;

Is it as simple as that?

 

Ksharp
Super User

Better post the output too.

 

 
data have; input PID 1 HospID $ 3 admission_year 5-8 discharge_year 10-13 reason_to_leave $ 15-21;
datalines;
1 A 2012 2013 tempout
1 A 2013 2014 moved
1 B 2015 2016 death
2 A 2012 2013 tempout
2 A 2013 2014 left
2 B 2015
3 A 2012 2013 death
3 A 2013 2014 moved
3 B 2015 2016 death
4 A 2012 2013 moved
4 B 2013 2014 death
4 B 2014 2015 tempout
4 B 2015 2016 death
5 A 2012 2014 tempout
5 A 2015 2016 death
5 B 2013 2014 moved
5 C 2015
;
run;
proc sort data=have;
 by pid descending discharge_year;
run;
data want;
 set have;
 by pid;
 retain found;
 if first.pid then found=0;
 if reason_to_leave='death' and found then reason_to_leave='DK';
 if reason_to_leave='death' then found=1;
 drop found;
run;
proc sort data=want;
 by pid discharge_year;
run;
art297
Opal | Level 21

I'd approach the problem as follows:

 

data have;
  input PID 1 HospID $ 3 admission_year 5-8 discharge_year 10-13 reason_to_leave $ 15-21;
  datalines;
1 A 2012 2013 tempout
1 A 2013 2014 moved
1 B 2015 2016 death
2 A 2012 2013 tempout
2 A 2013 2014 left
2 B 2015 .    .
3 A 2012 2013 death
3 A 2013 2014 moved
3 B 2015 2016 death
4 A 2012 2013 moved
4 B 2013 2014 death
4 B 2014 2015 tempout
4 B 2015 2016 death
5 A 2012 2014 tempout
5 A 2015 2016 death
5 B 2013 2014 moved
5 C 2015 .    .
;
run;

proc sort data=have;
  by pid discharge_year;
run;

data want;
  do until (last.pid);
    set have;
    by pid;
    if reason_to_leave eq 'death' then latest_death=discharge_year;
  end;
  do until (last.pid);
    set have;
    by pid;
    if reason_to_leave eq 'death' and discharge_year ne latest_death then reason_to_leave='DK';
    output;
  end;
run;

HTH,

Art, CEO, AnalystFinder.com

 

mkeintz
PROC Star


@Solph wrote:

I have vertical data recording a person 's in and out of hospital and reason for leaving. I'm trying to fix people who have multiple deaths.


 

and later wrote


 

Naturally,
1. a person can only die once (so records for ID 3 & 4 are wrong)
2. if there is subsequent records after death, death should reset as unknown (DK). (so record for #5 is wrong).

  


Occurance of multiple deaths are just subsets of rule 2 above.  It seems to me that @LaurieF's response fits the requirements exactly.  I understand the problem to mean that a death record is only acceptable as the final record. All other death records should be flagged.

 

c

x

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
Or use DOW + index skill, something like Arthur did.


data have;
  input PID 1 HospID $ 3 admission_year 5-8 discharge_year 10-13 reason_to_leave $ 15-21;
  datalines;
1 A 2012 2013 tempout
1 A 2013 2014 moved
1 B 2015 2016 death
2 A 2012 2013 tempout
2 A 2013 2014 left
2 B 2015 .    .
3 A 2012 2013 death
3 A 2013 2014 moved
3 B 2015 2016 death
4 A 2012 2013 moved
4 B 2013 2014 death
4 B 2014 2015 tempout
4 B 2015 2016 death
5 A 2012 2014 tempout
5 A 2015 2016 death
5 B 2013 2014 moved
5 C 2015 .    .
;
run;


data want;
  do i=1 by 1 until (last.pid);
    set have;
    by pid;
    if reason_to_leave eq 'death' then n=i;
  end;
  do j=1 by 1 until (last.pid);
    set have;
    by pid;
    if reason_to_leave eq 'death' and j lt n then reason_to_leave='DK';
    output;
  end;
  drop i j n;
run;

LaurieF
Barite | Level 11

But - but - why make it so hard?

Ksharp
Super User

Maybe OP can clarify more about what he want.

PID=5 is different between these code.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1537 views
  • 1 like
  • 5 in conversation