BookmarkSubscribeRSS Feed
lim_6
Fluorite | Level 6

This question is very similar to the "Find the closest event date prior to another date; sets linked by same ID" post.

 

I have a dataset with individual IDs and key dates. Each individual has at least one key date, but may have more. I have another dataset with the ID and multiple event dates (each is an observation) for each individual. I need to find the event date that is closest to (before) or the same as the key date for each individual. Individuals may have more than 20 event dates. I tried using the syntax in the post mentioned above, but it gets rid of the additional key dates and I need all key dates to match up with the closest event date.

 

ID & Key date:

ID Keydate

1  01/11/2005

1  05/13/2005

1  09/21/2006

2  04/21/2008

3  05/03/2007

3  05/09/2007

 

ID & Event date:

ID Eventdate

1 10/09/2004

1 12/15/2004

1 03/11/2005

1 04/18/2005

1 05/13/2006

1 08/05/2007

2 03/15/2008

2 03/17/2008

2 04/14/2008

2 04/27/2008

2 05/29/2009

3 02/10/2004

3 01/04/2007

3 03/21/2007

3 08/14/2007

7 REPLIES 7
PGStats
Opal | Level 21

Straightforward in SQL:

 

data keys;
input ID keyDate :mmddyy10.;
format keyDate yymmdd10.;
datalines;
1  01/11/2005
1  05/13/2005
1  09/21/2006
2  04/21/2008
3  05/03/2007
3  05/09/2007
;

data events;
input ID eventDate :mmddyy10.;
format eventDate yymmdd10.;
datalines;
1 10/09/2004
1 12/15/2004
1 03/11/2005
1 04/18/2005
1 05/13/2006
1 08/05/2007
2 03/15/2008
2 03/17/2008
2 04/14/2008
2 04/27/2008
2 05/29/2009
3 02/10/2004
3 01/04/2007
3 03/21/2007
3 08/14/2007
;

proc sql;
create table want as
select 
    keys.ID,
    keyDate,
    max(eventDate) as lastEvent format=yymmdd10.
from
    keys inner join 
    events on 
        keys.ID=events.ID and 
        events.eventDate <= keys.keyDate
group by keys.ID, keys.keyDate;
select * from want;
quit;

PG
lim_6
Fluorite | Level 6

PGStats, the syntax you provided matches the key date to the maximum eventdate for each identifier, which is not what I'm looking for. I need each key event to be matched to an event date that occurs prior or on the same day as the key event. For many individuals, some of their event dates can happen well beyond each key event and so pulling the max(EventDate) per identifier isn't helping me find the event date closest to the key event. Not sure if that makes complete sense so if you need me to elaborate, I'd be more than happy to do so. Thanks!

PGStats
Opal | Level 21

Please provide expected output.

PG
lim_6
Fluorite | Level 6

I am looking for the eventDate that is closest to and precedes keyDate

 

ID  keyDate      eventDate
1  01/11/2005 12/15/2004
1  05/13/2005 04/18/2005
1  09/21/2006 05/13/2006
2  04/21/2008 03/17/2008
3  05/03/2007 03/21/2007
3  05/09/2007 03/21/2007

PGStats
Opal | Level 21

That's exactly what you get with my code.

PG
lim_6
Fluorite | Level 6

PGStats, do I need to sort each dataset by SSN and key/event date first and then use your syntax? Or would it not matter? For some reason I wasn't getting that same outcome. Thanks!

PGStats
Opal | Level 21

proc SQL does its own sorting, when needed. No need to presort the datasets.

PG

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1974 views
  • 0 likes
  • 2 in conversation