Re: Comparing two dates in different tables

bob2012 · Posted 10-11-2012 07:47 AM

Hi folks,

Can someone help me with this please?

I have two tables with dates in each and need to compare the two dates, when comparing I also need to count where the difference between the two dates is within 30 seconds of each another.

The two tables have tens of tousands of dates.

Any ideas how to do this please?

Thanks,

Zubair

Doc_Duke · Posted 10-11-2012 09:07 AM

Zubair,

First, I assume you mean date-time variables, not dates.

The direct way is with SQL. However, it requires multiple passes of the data.

PROC SQL; * untested code;

SELECT count(*)

FROM a, b

WHERE a.datetime BETWEEN b.datetime - 30 AND b.datetime+30;

QUIT;

RUN;

This would give a count of the number of a.datetime's that are within 30 seconds of any b.datetime.

Somehow, I don't really think that is what your question is. Perhaps you could post a sample of the data you have and the results you want.

bob2012 · Posted 10-11-2012 09:56 AM

Hi Doc,

Thanks for responding!

Yes, I'm comparing date time variables and not dates.

Please see attached a sample of the data I'm working with.

I'd like to compare Date Modified (DM) to Event Date (ED) and check if the difference in seconds between the two is greater than 30. At the same time I need to count by Country code.

So, for example, the result for DM obs1 (30/09/2012 09:45:10) would be UK = 1 and US = 1.

The step would then compare DM obs 2 with all the records in ED.

Obs Date_Modified
1      30/09/2012 09:45:10
2      30/09/2012 09:46:15
3      30/09/2012 08:58:20
4      30/09/2012 09:03:25
5      30/09/2012 09:15:30

Obs   Event_date                  Country
1       30/9/2012 09:45:32       UK
2       30/9/2012 09:51:00       US
3       30/9/2012 09:45:27       US
4       30/9/2012 09:56:00       UK
5       29/9/2012 09:45:15       US

Does this help?

Thanks

Zubair

acerishi · Posted 10-11-2012 02:49 PM

Well.. Following is one way.. .

add one rownum column to each dataset temporarily..

merge those by rownum..

keep only required columns

if (abs(date1-date2)>=30) then output;

tlt · Posted 10-11-2012 05:07 PM

Hi all,

Doc is right.SQL may require multiple steps.

And this is to be taken in account when you are dealing with ten of thousands of observations and want to compare all observations from one dataset with all observations from another dataset. This is what we call a cartesian product. "Cartesian Product, you must be out of your mind", some may say.

Still, I like the SQL approach.

Let's say A is your first dataset and B the second. I would adapt Doc's code like this

PROC SQL; * untested code;

CREATE TABLE step1 AS

SELECT A.Date_Modified, B.Country,

SUM(

CASE

WHEN A.Date_Modified BETWEEN B.Event_date-'0:0:30't AND B.Event_date+'0:0:30't then 1

ELSE 0

) AS Nb_Selected

FROM A, B

QUIT;

This would give a table like this

Date_Modified Country Nb_Selected

30/09/2012 09:45:10 UK 1

30/09/2012 09:45:10 US 1

You may want to test this code with a sample, lets'say 1000 obs from each table with the PROC SQL INOBS options :

PROC SQL INOBS=1000;

If the intermediary result is not necessary, you may want to CREATE VIEW step1 instead of CREATE TABLE step1

Next, you'll have to transpose the intermediary result

PROC TRANSPOSE DATA=step1 OUT=result NAME=Country;

VAR Nb_Selected;

RUN;

to get what you want

Date_Modified UK US

30/09/2012 09:45:10 1 1

HTH

Comparing two dates in different tables