BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
silversta
Calcite | Level 5

Hello all!,

 

I am attempting to merge two datasets together. Dataset 1 has two variables that indicate the beginning and end of a date period, which are roughly one month intervals (interval_begin and interval_end). Dataset 2 has dates (event_date) that I want to merge in but that line up wit the date interval (there are also other variables in the row that correspond to that event date that I will need in the row of data). I've tried to illustrate an example below of what I want the data to look like (id is in both datasets). In the example below, rows, 2 and 3 have the same intervals but different event dates. My ultimate goal is to be able to sum the number of unique event dates within that monthly interval (so 2 would be the sum for interval 2/3/2008-3/2/2008). The sum would be zero for interval 1/1/2008-2/2/2008 and would be 1 for the interval 3/3/2008-4/3/2008. 

 

id     interval_begin   interval_end       event_date

1       1/1/2008           2/2/2008             .

1       2/3/2008           3/2/2008             2/15/2008

1       2/3/2008           3/2/2008             2/17/2008

1       3/3/2008           4/3/2008             3/21/2008

1       4/4/2008           5/1/2008             .

2      11/4/2010          12/5/2010           .

2      12/6/2010          1/6/2011             .

2       1/7/2011           2/8/2011             2/4/2011

 

I have tried doing a simple merge by id but this is what I get. 

 

id     interval_begin   interval_end     event_date

1       1/1/2008           2/2/2008             2/15/2008

1       2/3/2008           3/2/2008             2/17/2008

1       3/3/2008           4/3/2008             3/21/2008

1       4/4/2008           5/1/2008             3/21/2008

2      11/4/2010          12/5/2010           2/4/2011

2      12/6/2010          1/6/2011             2/4/2011

2       1/7/2011           2/8/2011             2/4/2011

 

What is the best way to get my data into my desired structure so that I can produce the sums that I need? I'm going to have to merge with a 3rd dataset with another date variable but figured if I could get this to work with datasets 1 and 2, I can do this again with the third set. Any help would be most appreciated!

1 ACCEPTED SOLUTION

Accepted Solutions
smantha
Lapis Lazuli | Level 10

proc sql;

select a.*,b.*

from ds1 a

left join 

ds2 b

on b.event_date between a.interval_begin and b.interval_end

and a.id = b.id;

quit;

View solution in original post

4 REPLIES 4
smantha
Lapis Lazuli | Level 10

Many to many merge on a SAS data step does not give the desired results.

Can you add a samples of dataset 1 and a sample of dataset 2 and then the desired output.

silversta
Calcite | Level 5

Absolutely!

 

Dataset 1 looks like this:

 

id     interval_begin   interval_end   var1 var2 var3 var4....... and so forth    

1       1/1/2008           2/2/2008             

1       2/3/2008           3/2/2008                         

1       3/3/2008           4/3/2008            

1       4/4/2008           5/1/2008             

2      11/4/2010          12/5/2010           

2      12/6/2010          1/6/2011             

2       1/7/2011           2/8/2011             

 

Dataset 2 looks like this:

 

id           event_date  var_a var_b var_c...... and so forth

1             2/15/2008

1             2/17/2008

1             3/21/2008

2             2/4/2011

 

Below is the desired output:

 

id     interval_begin   interval_end       event_date   var1 var2 var3 var_a var_b var_c ...etc. 

1       1/1/2008           2/2/2008             .

1       2/3/2008           3/2/2008             2/15/2008

1       2/3/2008           3/2/2008             2/17/2008

1       3/3/2008           4/3/2008             3/21/2008

1       4/4/2008           5/1/2008             .

2      11/4/2010          12/5/2010           .

2      12/6/2010          1/6/2011             .

2       1/7/2011           2/8/2011             2/4/2011

smantha
Lapis Lazuli | Level 10

proc sql;

select a.*,b.*

from ds1 a

left join 

ds2 b

on b.event_date between a.interval_begin and b.interval_end

and a.id = b.id;

quit;

silversta
Calcite | Level 5

This worked beautifully, thank you!

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1053 views
  • 1 like
  • 2 in conversation