BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
silversta
Calcite | Level 5

Hello all!,

 

I am attempting to merge two datasets together. Dataset 1 has two variables that indicate the beginning and end of a date period, which are roughly one month intervals (interval_begin and interval_end). Dataset 2 has dates (event_date) that I want to merge in but that line up wit the date interval (there are also other variables in the row that correspond to that event date that I will need in the row of data). I've tried to illustrate an example below of what I want the data to look like (id is in both datasets). In the example below, rows, 2 and 3 have the same intervals but different event dates. My ultimate goal is to be able to sum the number of unique event dates within that monthly interval (so 2 would be the sum for interval 2/3/2008-3/2/2008). The sum would be zero for interval 1/1/2008-2/2/2008 and would be 1 for the interval 3/3/2008-4/3/2008. 

 

id     interval_begin   interval_end       event_date

1       1/1/2008           2/2/2008             .

1       2/3/2008           3/2/2008             2/15/2008

1       2/3/2008           3/2/2008             2/17/2008

1       3/3/2008           4/3/2008             3/21/2008

1       4/4/2008           5/1/2008             .

2      11/4/2010          12/5/2010           .

2      12/6/2010          1/6/2011             .

2       1/7/2011           2/8/2011             2/4/2011

 

I have tried doing a simple merge by id but this is what I get. 

 

id     interval_begin   interval_end     event_date

1       1/1/2008           2/2/2008             2/15/2008

1       2/3/2008           3/2/2008             2/17/2008

1       3/3/2008           4/3/2008             3/21/2008

1       4/4/2008           5/1/2008             3/21/2008

2      11/4/2010          12/5/2010           2/4/2011

2      12/6/2010          1/6/2011             2/4/2011

2       1/7/2011           2/8/2011             2/4/2011

 

What is the best way to get my data into my desired structure so that I can produce the sums that I need? I'm going to have to merge with a 3rd dataset with another date variable but figured if I could get this to work with datasets 1 and 2, I can do this again with the third set. Any help would be most appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
smantha
Lapis Lazuli | Level 10

proc sql;

select a.*,b.*

from ds1 a

left join 

ds2 b

on b.event_date between a.interval_begin and b.interval_end

and a.id = b.id;

quit;

View solution in original post

4 REPLIES 4
smantha
Lapis Lazuli | Level 10

Many to many merge on a SAS data step does not give the desired results.

Can you add a samples of dataset 1 and a sample of dataset 2 and then the desired output.

silversta
Calcite | Level 5

Absolutely!

 

Dataset 1 looks like this:

 

id     interval_begin   interval_end   var1 var2 var3 var4....... and so forth    

1       1/1/2008           2/2/2008             

1       2/3/2008           3/2/2008                         

1       3/3/2008           4/3/2008            

1       4/4/2008           5/1/2008             

2      11/4/2010          12/5/2010           

2      12/6/2010          1/6/2011             

2       1/7/2011           2/8/2011             

 

Dataset 2 looks like this:

 

id           event_date  var_a var_b var_c...... and so forth

1             2/15/2008

1             2/17/2008

1             3/21/2008

2             2/4/2011

 

Below is the desired output:

 

id     interval_begin   interval_end       event_date   var1 var2 var3 var_a var_b var_c ...etc. 

1       1/1/2008           2/2/2008             .

1       2/3/2008           3/2/2008             2/15/2008

1       2/3/2008           3/2/2008             2/17/2008

1       3/3/2008           4/3/2008             3/21/2008

1       4/4/2008           5/1/2008             .

2      11/4/2010          12/5/2010           .

2      12/6/2010          1/6/2011             .

2       1/7/2011           2/8/2011             2/4/2011

smantha
Lapis Lazuli | Level 10

proc sql;

select a.*,b.*

from ds1 a

left join 

ds2 b

on b.event_date between a.interval_begin and b.interval_end

and a.id = b.id;

quit;

silversta
Calcite | Level 5

This worked beautifully, thank you!

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1663 views
  • 1 like
  • 2 in conversation