Hello,
Please see below is an example of the messy data I have. A member has records for mutiple hospitalizations; for seemingly one hospitalization, there were multiple rows of records with slightly different admission dates and discharge dates.
I hope to clean the data and get one record for one hospitalization. When there are mutiple rows for seemingly one hospitalization because of the overlaping time, create one record with earliest admission date and latest discharge date.
Raw data:
MemberID admission_date discharge_date
aaaa 9/11/2007 9/22/2007
aaaa 9/11/2007 9/26/2007
aaaa 10/11/2007 10/26/2007
aaaa 4/1/2008 4/14/2008
aaaa 4/7/2008 5/8/2008
aaaa 4/17/2008 4/28/2008
Hope to clean the data as
MemberID admission_date discharge_date
aaaa 9/11/2007 9/26/2007
aaaa 10/11/2007 10/26/2007
aaaa 4/1/2008 5/8/2008
I couldn't figure out a way to program this. Does anybody know? Thanks a lot.
... View more