Hi,
Here is a dataset, how can I select rows if the event_time closest to date measured within one month. For example, for id = 14022, the date is 6/22/2010, I want to select reows that event_time should be in 5/22/2010 to 7/22/2010. (if there are two measures within one month, choose the closest one. if there is no measure within one month, then NA.). Any clue or idea?
data abc;
input @1 ID @8 Date mmddyy10. @20 time anydtdtm.;
format date mmddyy10. time MDYAMPM.;
datalines;
14022 6/22/2010 10/11/2004 9:02
14022 6/22/2010 3/25/2008 12:53
14022 6/22/2010 4/29/2008 9:04
14022 6/22/2010 6/17/2008 9:14
14022 6/22/2010 7/8/2008 10:41
14022 6/22/2010 8/12/2008 12:30
14022 6/22/2010 9/16/2008 14:24
14022 6/22/2010 10/21/2008 1:44
14022 6/22/2010 11/18/2008 2:11
14022 6/22/2010 12/30/2008 9:53
14022 6/22/2010 2/10/2009 13:07
14022 6/22/2010 4/7/2009 9:45
14022 6/22/2010 6/2/2009 7:35
14022 6/22/2010 7/28/2009 10:11
14022 6/22/2010 9/29/2009 10:46
14022 6/22/2010 12/1/2009 10:51
14022 6/22/2010 1/26/2010 8:49
14022 6/22/2010 4/13/2010 7:48
14022 6/22/2010 6/22/2010 7:46
14032 2/7/2012 12/24/2002 10:25
14032 2/7/2012 1/21/2003 12:00
14032 2/7/2012 1/28/2003 9:25
14032 2/7/2012 1/29/2008 9:00
14032 2/7/2012 3/20/2008 13:52
14032 2/7/2012 5/1/2008 8:53
14032 2/7/2012 6/12/2008 14:01
14032 2/7/2012 7/24/2008 15:00
14032 2/7/2012 9/11/2008 11:02
14032 2/7/2012 9/20/2011 9:25
14032 2/7/2012 11/1/2011 10:40
14032 2/7/2012 12/22/2011 12:58
14032 2/7/2012 2/7/2012 10:20
14032 2/7/2012 3/20/2012 9:18
14032 2/7/2012 5/15/2012 12:43
14032 2/7/2012 6/19/2012 9:19
14032 2/7/2012 7/31/2012 11:48
14032 2/7/2012 9/18/2012 12:15
14032 2/7/2012 10/16/2012 12:35
14032 2/7/2012 10/18/2012 11:12
14032 2/7/2012 10/19/2012 5:00
14032 2/7/2012 10/20/2012 4:33
14032 2/7/2012 10/21/2012 4:19
14032 2/7/2012 10/22/2012 4:30
14032 2/7/2012 10/23/2012 5:03
14032 2/7/2012 10/24/2012 3:14
14032 2/7/2012 10/30/2012 11:10
14032 2/7/2012 11/27/2012 12:30
14032 2/7/2012 1/8/2013 9:58
14032 2/7/2012 2/19/2013 10:07
14032 2/7/2012 4/2/2013 13:45
14032 2/7/2012 5/14/2013 10:18
14601 1/30/2014 4/26/2005 8:40
14601 1/30/2014 6/28/2005 9:05
14601 1/30/2014 7/26/2005 8:21
14601 1/30/2014 8/9/2005 8:24
14601 1/30/2014 8/23/2005 8:10
14601 1/30/2014 10/18/2005 9:20
14601 1/30/2014 12/13/2005 9:32
14601 1/30/2014 2/7/2006 9:04
14601 1/30/2014 2/16/2006 8:44
14601 1/30/2014 5/2/2006 10:31
14601 1/30/2014 7/12/2006 7:31
14601 1/30/2014 9/7/2006 8:55
14601 1/30/2014 10/31/2006 8:43
14601 1/30/2014 2/6/2007 9:15
14601 1/30/2014 5/3/2007 7:54
14601 1/30/2014 6/28/2007 7:52
14601 1/30/2014 8/23/2007 8:10
14601 1/30/2014 10/18/2007 8:17
14601 1/30/2014 11/19/2007 2:07
14601 1/30/2014 12/13/2007 8:16
14601 1/30/2014 1/24/2008 10:23
14601 1/30/2014 1/31/2008 10:07
14601 1/30/2014 2/12/2008 12:31
14601 1/30/2014 3/25/2008 9:52
;
run;
Thanks,
I modified your example data a bit so that there would be some records that allowed for a better test. Does the following do what you want?
data abc; informat date mmddyy10.; informat time anydtdtm16.; input ID Date time; format date mmddyy10. time MDYAMPM.; datalines; 14022 6/22/2010 10/11/2004 9:02 14022 6/22/2010 3/25/2008 12:53 14022 6/22/2010 4/29/2008 9:04 14022 6/22/2010 6/17/2008 9:14 14022 6/22/2010 7/8/2008 10:41 14022 6/22/2010 8/12/2008 12:30 14022 6/22/2010 9/16/2008 14:24 14022 6/22/2010 10/21/2008 1:44 14022 6/22/2010 11/18/2008 2:11 14022 6/22/2010 12/30/2008 9:53 14022 6/22/2010 2/10/2009 13:07 14022 6/22/2010 4/7/2009 9:45 14022 6/22/2010 6/2/2009 7:35 14022 6/22/2010 7/28/2009 10:11 14022 6/22/2010 9/29/2009 10:46 14022 6/22/2010 12/1/2009 10:51 14022 6/22/2010 1/26/2010 8:49 14022 6/22/2010 5/23/2010 7:48 14022 6/22/2010 6/22/2010 7:46 14032 2/7/2012 7/24/2002 10:25 14032 2/7/2012 1/21/2003 12:00 14032 2/7/2012 1/28/2003 9:25 14032 2/7/2012 1/29/2008 9:00 14032 2/7/2012 3/20/2008 13:52 14032 2/7/2012 5/1/2008 8:53 14032 2/7/2012 6/12/2008 14:01 14032 2/7/2012 7/24/2008 15:00 14032 2/7/2012 9/11/2008 11:02 14032 2/7/2012 9/20/2011 9:25 14032 2/7/2012 11/1/2011 10:40 14032 2/7/2012 1/8/2012 12:58 14032 2/7/2012 2/7/2012 10:20 14032 2/7/2012 3/7/2012 9:18 14032 2/7/2012 5/15/2012 12:43 14032 2/7/2012 6/19/2012 9:19 14032 2/7/2012 7/31/2012 11:48 14032 2/7/2012 9/18/2012 12:15 14032 2/7/2012 10/16/2012 12:35 14032 2/7/2012 10/18/2012 11:12 14032 2/7/2012 10/19/2012 5:00 14032 2/7/2012 10/20/2012 4:33 14032 2/7/2012 10/21/2012 4:19 14032 2/7/2012 10/22/2012 4:30 14032 2/7/2012 10/23/2012 5:03 14032 2/7/2012 10/24/2012 3:14 14032 2/7/2012 10/30/2012 11:10 14032 2/7/2012 11/27/2012 12:30 14032 2/7/2012 1/8/2013 9:58 14032 2/7/2012 2/19/2013 10:07 14032 2/7/2012 4/2/2013 13:45 14032 2/7/2012 5/14/2013 10:18 14601 1/30/2014 4/26/2005 8:40 14601 1/30/2014 6/28/2005 9:05 14601 1/30/2014 7/26/2005 8:21 14601 1/30/2014 8/9/2005 8:24 14601 1/30/2014 8/23/2005 8:10 14601 1/30/2014 10/18/2005 9:20 14601 1/30/2014 12/29/2013 9:32 14601 1/30/2014 12/30/2013 9:04 14601 1/30/2014 2/16/2014 8:44 14601 1/30/2014 5/2/2006 10:31 14601 1/30/2014 7/12/2006 7:31 14601 1/30/2014 9/7/2006 8:55 14601 1/30/2014 10/31/2006 8:43 14601 1/30/2014 2/6/2007 9:15 14601 1/30/2014 5/3/2007 7:54 14601 1/30/2014 6/28/2007 7:52 14601 1/30/2014 8/23/2007 8:10 14601 1/30/2014 10/18/2007 8:17 14601 1/30/2014 11/19/2007 2:07 14601 1/30/2014 12/13/2007 8:16 14601 1/30/2014 1/24/2008 10:23 14601 1/30/2014 1/31/2008 10:07 14601 1/30/2014 2/12/2008 12:31 14601 1/30/2014 3/25/2008 9:52 ; run; proc sql noprint; create table want as select * from abc where intnx('month',date,-1,'s')<=datepart(time)<=intnx('month',date,1,'s') group by id having abs(datepart(time)-date) eq min(abs(datepart(time)-date)) ; quit;
Art, CEO, AnalystFinder.com
Hi art297,
I tried to use the slolution you helped me, it works!! but when I use my real data (I have lots of columns and rows) it doesn't work.
it shows " The query requires remerging summary statistics back with the original data." in the log. I doubt. Do you know where is the problem? Thanks,
L
1455 proc sql noprint; 1456 create table want as 1457 select mrn, event_end_UA, event_end_dt 1458 from combine_CBC 1459 where intnx('month', event_end_UA, -1, 's') <= datepart(event_end_dt) <= 1459! intnx('month', event_end_UA, 1, 's') 1460 group by mrn, event_end_UA 1461 having abs(datepart(event_end_dt) - event_end_UA) = min(abs(datepart(event_end_UA)- 1461! event_end_dt)) 1462 ; NOTE: The query requires remerging summary statistics back with the original data. NOTE: Table WORK.WANT created, with 0 rows and 3 columns. 1463 quit; NOTE: PROCEDURE SQL used (Total process time): real time 0.03 seconds cpu time 0.04 seconds
That is a normal message from proc sql. It doesn't mean that there was a problem.
However, since it didn't select any rows, it indicates that you don't have any records that meet the condition that you specified.
You'd have to post some of your actual data for us to understand.
Are you sure that event_end_UA is a date, rather than a datetime variable?
Art, CEO, AnalystFinder.com
Your example data does not have a variable named event_time only "time". So I suspose that's the one you want to use.
Is the time of day supposed to be considered when picking "closest"? if so, since there is no time component with your date variable which time of day would you want?
Note that we can't run your code as pasted as the window reformatted the columns so the time actually starts in column 16 or 17 instead of 20.
Please post code in a code box using the forum {i} menu icon if you are going to post anything where columns might be important.
This line of code added to the read will reduce the data set to only datetimes where the date part is within you specified range.
if intnx('month',date,-1,'S') le datepart(time) le intnx('month',date,+1,'S');
And this will determine the number of days different.
DaysDifferent = abs(date- datepart(time));
Sort the resulting data by ID and DaysDifferent and keep the first.
Assuming the above two lines were added to your code:
proc sort data=abc; by id daydifferent; run; data want; set abc; by id; if first.id; run;
Would result in one row per ID.
If the there are two days that are exactly the same number of days from the date and also the smallest I think this will pick the early date of time.
Thanks for your explanation, Ballardw! That's really my goal!!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.