Help using Base SAS procedures

a question on cleaning longitudinal data

Reply
N/A
Posts: 1

a question on cleaning longitudinal data

Hello,
Please see below is an example of the messy data I have. A member has records for mutiple hospitalizations; for seemingly one hospitalization, there were multiple rows of records with slightly different admission dates and discharge dates.

I hope to clean the data and get one record for one hospitalization. When there are mutiple rows for seemingly one hospitalization because of the overlaping time, create one record with earliest admission date and latest discharge date.

Raw data:

MemberID admission_date discharge_date
aaaa 9/11/2007 9/22/2007
aaaa 9/11/2007 9/26/2007

aaaa 10/11/2007 10/26/2007

aaaa 4/1/2008 4/14/2008
aaaa 4/7/2008 5/8/2008
aaaa 4/17/2008 4/28/2008

Hope to clean the data as

MemberID admission_date discharge_date
aaaa 9/11/2007 9/26/2007
aaaa 10/11/2007 10/26/2007
aaaa 4/1/2008 5/8/2008

I couldn't figure out a way to program this. Does anybody know? Thanks a lot.
Super Contributor
Super Contributor
Posts: 3,174

Re: a question on cleaning longitudinal data

Though not scientific, consider using a DATA step and a DO / END loop to generate one observation for each date within the range, then sort the observations in sequence and use the LAG function to compare adjacent observations, and if there is a gap of more than some determined length of days (using the INTCK function), then increment a counter. Lastly, use PROC SUMMARY to generate a MIN/MAX variable for each hospitalization.

Scott Barry
SBBWorks, Inc.
Contributor
Posts: 20

Re: a question on cleaning longitudinal data

There was actually a paper at this year's Global Forum that covered a very similar topic.

http://support.sas.com/resources/papers/proceedings09/079-2009.pdf

I hope you can find something in here to help.
Ask a Question
Discussion stats
  • 2 replies
  • 120 views
  • 0 likes
  • 3 in conversation