BookmarkSubscribeRSS Feed
peterthiem
Calcite | Level 5

Greeting all,

I need some help with checking if a transaction date falls between a begin_date and end_date column and flagging the record

The transaction table for simplicity contain 2 columns an id and transaction date with potential duplicates while a second table (no duplicate rows) contain the following information, the begin_date and end_date but can be repeated for different groups as shown below (I only shown 2 groups here -approx 50 groups in total)

2016 PERIOD 1 G '10Jan2016'd '28Jan2016'd

2016 PERIOD 1 H '10Jan2016'd '28Jan2016'd

2016 PERIOD 2 G '04Jul2016'd '19Jul2016'd

2016 PERIOD 2 H '04Jul2016'd '19Jul2016'd

2016 PERIOD 3 G '26Sep2016'd '11Oct2016'd

2016 PERIOD 3 H '26Sep2016'd '11Oct2016'd

2016 PERIOD 4 G '18Dec2016'd '31Jan2017'd

2016 PERIOD 4 H '19Dec2016'd '28Jan2017'd

My final table would contain the 2 columns (id & transaction date) plus the indicator variable 1/0 if the transaction date is indeed between those 2 dates.

TIA

Peter

6 REPLIES 6
LinusH
Tourmaline | Level 20

Just do a left join using a between - and condition. Set the flag if you got a match (or not).

Data never sleeps
Steelers_In_DC
Barite | Level 11

Is there an ID in the second table?  What are you joining the tables on?  After you join the table you can do:

Data want;

set have;

if begin_date < transaction_date < end_date then flag = 1;

     else flag = 0;

run;

LinusH
Tourmaline | Level 20

No need for a second step.

Just test on the occurrence of the id variable in the right table.

Data never sleeps
Steelers_In_DC
Barite | Level 11

Linus,

Without an ID in the second table how would you be joining?

peterthiem
Calcite | Level 5

Hi Mark and Linus the problem is that the id in the second table.is different to that of the first i.e.

PERIOD_IDENTIFIER YEAR PERIOD GROUP

1111     2016 PERIOD 1 G '10Jan2016'd '28Jan2016'd

1112     2016 PERIOD 1 H '10Jan2016'd '28Jan2016'd

1113     2016 PERIOD 2 G '04Jul2016'd '19Jul2016'd

1114     2016 PERIOD 2 H '04Jul2016'd '19Jul2016'd

1115     2016 PERIOD 3 G '26Sep2016'd '11Oct2016'd

1116     2016 PERIOD 3 H '26Sep2016'd '11Oct2016'd

1117     2016 PERIOD 4 G '18Dec2016'd '31Jan2017'd

1118     2016 PERIOD 4 H '19Dec2016'd '28Jan2017'd

I mean we can try to append all the date that are between the date range with the PERIOD_IDENTIFIER appended to the first (transaction) table but we first need to join these 2 tables somehow.

The first tables will have duplicates but the second table the identifier is unique.

The WANT data set look something like this
 
id transaction_date group PERIOD_START PERIOD_END FLAG
1     19/03/2016         G   .                      .                     0
2     21/01/2016         H   '10Jan2016'd   '28Jan2016'd  1
3     05/07/2016         H   '04Jul2016'd    '19Jul2016'd   1
4     27/09/2016         G   '26Sep2016'd   '11Oct2016'd  1
5     25/03/2016         G    .                     .                      0
6     02/01/2017         H    '19Dec2016'd '28Jan2017'd   1

So far I have attempted the following codes which returns duplicates

proc sql;

create table test as

select

a.id

, a.transaction_date

, a.group

, b.period_start

, b.period_end

, flag

from transaction_table a

left outer join

period_table b

on a.transaction_date between b.period_start and b.period_end

and a.group = b.group

quit;

Unless I'm missing out something or proc sql might not be good for this task.

gergely_batho
SAS Employee

The join will produce duplicates, if date intervals in the second dataset overlap.

On the other hand, if you decide not to include detailed information from this second table, you can aggregate the result of join:


proc sql;

create table test as

select

a.id

, a.transaction_date

, a.group

, max(b.group is not missing) as flag

from transaction_table a

left outer join

period_table b

on a.transaction_date between b.period_start and b.period_end

and a.group = b.group

group by 1,2,3

quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 3783 views
  • 0 likes
  • 4 in conversation