Help using Base SAS procedures

Find date closest to another date for multiple identifiers

Reply
Occasional Contributor
Posts: 5

Find date closest to another date for multiple identifiers

This question is very similar to the "Find the closest event date prior to another date; sets linked by same ID" post.

 

I have a dataset with individual IDs and key dates. Each individual has at least one key date, but may have more. I have another dataset with the ID and multiple event dates (each is an observation) for each individual. I need to find the event date that is closest to (before) or the same as the key date for each individual. Individuals may have more than 20 event dates. I tried using the syntax in the post mentioned above, but it gets rid of the additional key dates and I need all key dates to match up with the closest event date.

 

ID & Key date:

ID Keydate

1  01/11/2005

1  05/13/2005

1  09/21/2006

2  04/21/2008

3  05/03/2007

3  05/09/2007

 

ID & Event date:

ID Eventdate

1 10/09/2004

1 12/15/2004

1 03/11/2005

1 04/18/2005

1 05/13/2006

1 08/05/2007

2 03/15/2008

2 03/17/2008

2 04/14/2008

2 04/27/2008

2 05/29/2009

3 02/10/2004

3 01/04/2007

3 03/21/2007

3 08/14/2007

Respected Advisor
Posts: 4,641

Re: Find date closest to another date for multiple identifiers

Straightforward in SQL:

 

data keys;
input ID keyDate :mmddyy10.;
format keyDate yymmdd10.;
datalines;
1  01/11/2005
1  05/13/2005
1  09/21/2006
2  04/21/2008
3  05/03/2007
3  05/09/2007
;

data events;
input ID eventDate :mmddyy10.;
format eventDate yymmdd10.;
datalines;
1 10/09/2004
1 12/15/2004
1 03/11/2005
1 04/18/2005
1 05/13/2006
1 08/05/2007
2 03/15/2008
2 03/17/2008
2 04/14/2008
2 04/27/2008
2 05/29/2009
3 02/10/2004
3 01/04/2007
3 03/21/2007
3 08/14/2007
;

proc sql;
create table want as
select 
    keys.ID,
    keyDate,
    max(eventDate) as lastEvent format=yymmdd10.
from
    keys inner join 
    events on 
        keys.ID=events.ID and 
        events.eventDate <= keys.keyDate
group by keys.ID, keys.keyDate;
select * from want;
quit;

PG
Occasional Contributor
Posts: 5

Re: Find date closest to another date for multiple identifiers

PGStats, the syntax you provided matches the key date to the maximum eventdate for each identifier, which is not what I'm looking for. I need each key event to be matched to an event date that occurs prior or on the same day as the key event. For many individuals, some of their event dates can happen well beyond each key event and so pulling the max(EventDate) per identifier isn't helping me find the event date closest to the key event. Not sure if that makes complete sense so if you need me to elaborate, I'd be more than happy to do so. Thanks!

Respected Advisor
Posts: 4,641

Re: Find date closest to another date for multiple identifiers

Please provide expected output.

PG
Occasional Contributor
Posts: 5

Re: Find date closest to another date for multiple identifiers

I am looking for the eventDate that is closest to and precedes keyDate

 

ID  keyDate      eventDate
1  01/11/2005 12/15/2004
1  05/13/2005 04/18/2005
1  09/21/2006 05/13/2006
2  04/21/2008 03/17/2008
3  05/03/2007 03/21/2007
3  05/09/2007 03/21/2007

Respected Advisor
Posts: 4,641

Re: Find date closest to another date for multiple identifiers

That's exactly what you get with my code.

PG
Occasional Contributor
Posts: 5

Re: Find date closest to another date for multiple identifiers

PGStats, do I need to sort each dataset by SSN and key/event date first and then use your syntax? Or would it not matter? For some reason I wasn't getting that same outcome. Thanks!

Respected Advisor
Posts: 4,641

Re: Find date closest to another date for multiple identifiers

proc SQL does its own sorting, when needed. No need to presort the datasets.

PG
Ask a Question
Discussion stats
  • 7 replies
  • 353 views
  • 0 likes
  • 2 in conversation