DATA Step, Macro, Functions and more

select rows within one month of a time

Reply
Frequent Contributor
Posts: 76

select rows within one month of a time

Hi,

Here is a dataset, how can I select rows if the event_time closest to date measured within one month. For example, for id = 14022, the date is 6/22/2010, I want to select reows that event_time should be in 5/22/2010 to 7/22/2010. (if there are two measures within one month, choose the closest one. if there is no measure within one month, then NA.).   Any clue or idea?

 

data abc;
input @1 ID @8 Date mmddyy10. @20 time anydtdtm.;
format date mmddyy10. time MDYAMPM.;
datalines;
14022 6/22/2010 10/11/2004 9:02
14022 6/22/2010 3/25/2008 12:53
14022 6/22/2010 4/29/2008 9:04
14022 6/22/2010 6/17/2008 9:14
14022 6/22/2010 7/8/2008 10:41
14022 6/22/2010 8/12/2008 12:30
14022 6/22/2010 9/16/2008 14:24
14022 6/22/2010 10/21/2008 1:44
14022 6/22/2010 11/18/2008 2:11
14022 6/22/2010 12/30/2008 9:53
14022 6/22/2010 2/10/2009 13:07
14022 6/22/2010 4/7/2009 9:45
14022 6/22/2010 6/2/2009 7:35
14022 6/22/2010 7/28/2009 10:11
14022 6/22/2010 9/29/2009 10:46
14022 6/22/2010 12/1/2009 10:51
14022 6/22/2010 1/26/2010 8:49
14022 6/22/2010 4/13/2010 7:48
14022 6/22/2010 6/22/2010 7:46
14032 2/7/2012 12/24/2002 10:25
14032 2/7/2012 1/21/2003 12:00
14032 2/7/2012 1/28/2003 9:25
14032 2/7/2012 1/29/2008 9:00
14032 2/7/2012 3/20/2008 13:52
14032 2/7/2012 5/1/2008 8:53
14032 2/7/2012 6/12/2008 14:01
14032 2/7/2012 7/24/2008 15:00
14032 2/7/2012 9/11/2008 11:02
14032 2/7/2012 9/20/2011 9:25
14032 2/7/2012 11/1/2011 10:40
14032 2/7/2012 12/22/2011 12:58
14032 2/7/2012 2/7/2012 10:20
14032 2/7/2012 3/20/2012 9:18
14032 2/7/2012 5/15/2012 12:43
14032 2/7/2012 6/19/2012 9:19
14032 2/7/2012 7/31/2012 11:48
14032 2/7/2012 9/18/2012 12:15
14032 2/7/2012 10/16/2012 12:35
14032 2/7/2012 10/18/2012 11:12
14032 2/7/2012 10/19/2012 5:00
14032 2/7/2012 10/20/2012 4:33
14032 2/7/2012 10/21/2012 4:19
14032 2/7/2012 10/22/2012 4:30
14032 2/7/2012 10/23/2012 5:03
14032 2/7/2012 10/24/2012 3:14
14032 2/7/2012 10/30/2012 11:10
14032 2/7/2012 11/27/2012 12:30
14032 2/7/2012 1/8/2013 9:58
14032 2/7/2012 2/19/2013 10:07
14032 2/7/2012 4/2/2013 13:45
14032 2/7/2012 5/14/2013 10:18
14601 1/30/2014 4/26/2005 8:40
14601 1/30/2014 6/28/2005 9:05
14601 1/30/2014 7/26/2005 8:21
14601 1/30/2014 8/9/2005 8:24
14601 1/30/2014 8/23/2005 8:10
14601 1/30/2014 10/18/2005 9:20
14601 1/30/2014 12/13/2005 9:32
14601 1/30/2014 2/7/2006 9:04
14601 1/30/2014 2/16/2006 8:44
14601 1/30/2014 5/2/2006 10:31
14601 1/30/2014 7/12/2006 7:31
14601 1/30/2014 9/7/2006 8:55
14601 1/30/2014 10/31/2006 8:43
14601 1/30/2014 2/6/2007 9:15
14601 1/30/2014 5/3/2007 7:54
14601 1/30/2014 6/28/2007 7:52
14601 1/30/2014 8/23/2007 8:10
14601 1/30/2014 10/18/2007 8:17
14601 1/30/2014 11/19/2007 2:07
14601 1/30/2014 12/13/2007 8:16
14601 1/30/2014 1/24/2008 10:23
14601 1/30/2014 1/31/2008 10:07
14601 1/30/2014 2/12/2008 12:31
14601 1/30/2014 3/25/2008 9:52
;
run;

 

Thanks,

 

 

PROC Star
Posts: 7,360

Re: select rows within one month of a time

I modified your example data a bit so that there would be some records that allowed for a better test. Does the following do what you want?

data abc;
  informat date mmddyy10.;
  informat time anydtdtm16.;
  input ID Date time;
  format date mmddyy10. time MDYAMPM.;
  datalines;
14022 6/22/2010 10/11/2004 9:02
14022 6/22/2010 3/25/2008 12:53
14022 6/22/2010 4/29/2008 9:04
14022 6/22/2010 6/17/2008 9:14
14022 6/22/2010 7/8/2008 10:41
14022 6/22/2010 8/12/2008 12:30
14022 6/22/2010 9/16/2008 14:24
14022 6/22/2010 10/21/2008 1:44
14022 6/22/2010 11/18/2008 2:11
14022 6/22/2010 12/30/2008 9:53
14022 6/22/2010 2/10/2009 13:07
14022 6/22/2010 4/7/2009 9:45
14022 6/22/2010 6/2/2009 7:35
14022 6/22/2010 7/28/2009 10:11
14022 6/22/2010 9/29/2009 10:46
14022 6/22/2010 12/1/2009 10:51
14022 6/22/2010 1/26/2010 8:49
14022 6/22/2010 5/23/2010 7:48
14022 6/22/2010 6/22/2010 7:46
14032 2/7/2012 7/24/2002 10:25
14032 2/7/2012 1/21/2003 12:00
14032 2/7/2012 1/28/2003 9:25
14032 2/7/2012 1/29/2008 9:00
14032 2/7/2012 3/20/2008 13:52
14032 2/7/2012 5/1/2008 8:53
14032 2/7/2012 6/12/2008 14:01
14032 2/7/2012 7/24/2008 15:00
14032 2/7/2012 9/11/2008 11:02
14032 2/7/2012 9/20/2011 9:25
14032 2/7/2012 11/1/2011 10:40
14032 2/7/2012 1/8/2012 12:58
14032 2/7/2012 2/7/2012 10:20
14032 2/7/2012 3/7/2012 9:18
14032 2/7/2012 5/15/2012 12:43
14032 2/7/2012 6/19/2012 9:19
14032 2/7/2012 7/31/2012 11:48
14032 2/7/2012 9/18/2012 12:15
14032 2/7/2012 10/16/2012 12:35
14032 2/7/2012 10/18/2012 11:12
14032 2/7/2012 10/19/2012 5:00
14032 2/7/2012 10/20/2012 4:33
14032 2/7/2012 10/21/2012 4:19
14032 2/7/2012 10/22/2012 4:30
14032 2/7/2012 10/23/2012 5:03
14032 2/7/2012 10/24/2012 3:14
14032 2/7/2012 10/30/2012 11:10
14032 2/7/2012 11/27/2012 12:30
14032 2/7/2012 1/8/2013 9:58
14032 2/7/2012 2/19/2013 10:07
14032 2/7/2012 4/2/2013 13:45
14032 2/7/2012 5/14/2013 10:18
14601 1/30/2014 4/26/2005 8:40
14601 1/30/2014 6/28/2005 9:05
14601 1/30/2014 7/26/2005 8:21
14601 1/30/2014 8/9/2005 8:24
14601 1/30/2014 8/23/2005 8:10
14601 1/30/2014 10/18/2005 9:20
14601 1/30/2014 12/29/2013 9:32
14601 1/30/2014 12/30/2013 9:04
14601 1/30/2014 2/16/2014 8:44
14601 1/30/2014 5/2/2006 10:31
14601 1/30/2014 7/12/2006 7:31
14601 1/30/2014 9/7/2006 8:55
14601 1/30/2014 10/31/2006 8:43
14601 1/30/2014 2/6/2007 9:15
14601 1/30/2014 5/3/2007 7:54
14601 1/30/2014 6/28/2007 7:52
14601 1/30/2014 8/23/2007 8:10
14601 1/30/2014 10/18/2007 8:17
14601 1/30/2014 11/19/2007 2:07
14601 1/30/2014 12/13/2007 8:16
14601 1/30/2014 1/24/2008 10:23
14601 1/30/2014 1/31/2008 10:07
14601 1/30/2014 2/12/2008 12:31
14601 1/30/2014 3/25/2008 9:52
;
run;

proc sql noprint;
  create table want as
    select *
      from abc 
       where intnx('month',date,-1,'s')<=datepart(time)<=intnx('month',date,1,'s')
         group by id
           having abs(datepart(time)-date) eq min(abs(datepart(time)-date))
  ;
quit;

Art, CEO, AnalystFinder.com

Frequent Contributor
Posts: 76

Re: select rows within one month of a time

Hi art297,

 

I tried to use the slolution you helped me, it works!! but when I use my real data (I have lots of columns and rows) it doesn't work. 

 it shows " The query requires remerging summary statistics back with the original data." in the log. I doubt. Do you know where is the problem? Thanks,

L

 

1455  proc sql noprint;
1456      create table want as
1457          select mrn, event_end_UA, event_end_dt
1458          from combine_CBC
1459          where intnx('month', event_end_UA, -1, 's') <= datepart(event_end_dt) <=
1459! intnx('month', event_end_UA, 1, 's')
1460          group by mrn, event_end_UA
1461          having abs(datepart(event_end_dt) - event_end_UA) = min(abs(datepart(event_end_UA)-
1461! event_end_dt))
1462  ;
NOTE: The query requires remerging summary statistics back with the original data.
NOTE: Table WORK.WANT created, with 0 rows and 3 columns.

1463  quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.03 seconds
      cpu time            0.04 seconds

PROC Star
Posts: 7,360

Re: select rows within one month of a time

[ Edited ]

That is a normal message from proc sql. It doesn't mean that there was a problem.

 

However, since it didn't select any rows, it indicates that you don't have any records that meet the condition that you specified.

 

You'd have to post some of your actual data for us to understand.

 

Are you sure that event_end_UA is a date, rather than a datetime variable?

 

Art, CEO, AnalystFinder.com

 

Super User
Posts: 10,483

Re: select rows within one month of a time

Your example data does not have a variable named event_time only "time". So I suspose that's the one you want to use.

Is the time of day supposed to be considered when picking "closest"? if so, since there is no time component with your date variable which time of day would you want?

 

Note that we can't run your code as pasted as the window reformatted the columns so the time actually starts in column 16 or 17 instead of 20.

Please post code in a code box using the forum {i} menu icon if you are going to post anything where columns might be important.

 

This line of code added to the read will reduce the data set to only datetimes where the date part is within you specified range.

   if intnx('month',date,-1,'S') le datepart(time) le intnx('month',date,+1,'S');

And this will determine the number of days different.

 

   DaysDifferent = abs(date- datepart(time));

Sort the resulting data by ID and DaysDifferent and keep the first.

 

Assuming the above two lines were added to your code:

proc sort data=abc;
   by id daydifferent;
run;

data want;
   set abc;
   by id;
   if first.id;
run;

Would result in one row per ID.

 

If the there are two days that are exactly the same number of days from the date and also the smallest I think this will pick the early date of time.

Frequent Contributor
Posts: 76

Re: select rows within one month of a time

Thanks for your explanation, Ballardw! That's really my goal!!

Ask a Question
Discussion stats
  • 5 replies
  • 143 views
  • 0 likes
  • 3 in conversation