BookmarkSubscribeRSS Feed
bethsmith
Fluorite | Level 6

Hi All,

 

I've got a bit of a strange data situation that I'm hoping someone can help with....

Basically I have 2 datasets (A and B) with different unique identifiers that I am trying to join together via a 'lookup table' (C).


The problem is that this joining this A->B->C requires a many->one->many merge that I am struggling to find my way around.

 

Hopefully the example below will show my problem.....

 

Table A    
obs_nosale_dtcomp_dt  
     
Table B    
obs_noprop_no   
     
Table C    
prop_nomove_dt   
     
     
     
Obs_noprop_nosale_dtcomp_dtmove_dt
11201/11/201701/12/201725/11/2017
11201/03/201801/04/201814/03/2018
11201/11/201701/12/201714/03/2018
11201/03/201801/04/201825/11/2017


This is a very simplified version of what I am dealing with, but in essence I want the maximum move date for each observation/property number combination where the move_dt is before the comp_dt 

 

Any advice will be much appreciated.

 

Thank you

7 REPLIES 7
ballardw
Super User

You should provide examples of the input data sets and then the result of using those input sets. Without the actual start condition it is hard to interpret the words you describe and the color highlighting doesn't make any sense.

 

You don't need to provide a lot of records in the input sets but there should be enough to demonstrates what happens with any of the types of matches you need.

 

 

PGStats
Opal | Level 21

It looks more like you are trying to join tables A and C via table B which has fields in common with both tables (A and C). As @ballardw said, a little example would help clarify things for us. 

PG
ballardw
Super User

Maybe some similar to this:

 

proc sql
   create table want as
   select a.obs_no,d.prop_no,a.sales_dt, a.comp_dt,d.move_dt
   from tablea as a 
        left join
        (select tableb.obs_no, tablec.prop_no, tablec.move_dt
         from tableb left join tablec
         on tableb.prop_no=tablec.propno) as d
         on a.obs_no=d.obsno
   ;
quit;
bethsmith
Fluorite | Level 6

Apologies for not making that clearing.

Table A contains Sales data and Table C contains application data.

Sales are done at account level but can have multiple dates for the subaccounts within. The unique identifier in this table is the sales number.

Applications are done at property level and can also have multiple dates for the subaccounts. The unique identifier in this table is the property number.

Table B contains information at account level and has the account number and property number.

 

Trying to left/inner join the tables via sales number and property number if there are 2 observations yields 4 results as it generates all possible combinations of the observations. I can't show the actual data but a worked through example is below. For each sales date, I want the move date that is between the sales and completion dates.

 

Table A    
     
Sales_nosale_dtcomp_dt  
101/11/201701/12/2017  
101/03/201801/04/2018  
     
Table B    
Prop_nomove_dt   
1225/11/2017   
1214/03/2018   
     
Result    
Sales_noProp_nosale_dtcomp_dtmove_dt
11201/11/201701/12/201725/11/2017
11201/03/201801/04/201814/03/2018
11201/11/201701/12/201714/03/2018
11201/03/201801/04/201825/11/2017
     
Wanted    
Sales_noProp_nosale_dtcomp_dtmove_dt
11201/11/201701/12/201725/11/2017
11201/03/201801/04/201814/03/2018

 

 

andreas_lds
Jade | Level 19

If both tables have the same number of observations, you can do:

 

data wanted;
  merge table_a table_b;
  /* No BY-statement!! */
run;
ballardw
Super User

@bethsmith wrote:

Apologies for not making that clearing.

Table A contains Sales data and Table C contains application data.

Sales are done at account level but can have multiple dates for the subaccounts within. The unique identifier in this table is the sales number.

Applications are done at property level and can also have multiple dates for the subaccounts. The unique identifier in this table is the property number.

Table B contains information at account level and has the account number and property number.

 

Trying to left/inner join the tables via sales number and property number if there are 2 observations yields 4 results as it generates all possible combinations of the observations. I can't show the actual data but a worked through example is below. For each sales date, I want the move date that is between the sales and completion dates.

 

Table A        
         
Sales_no sale_dt comp_dt    
1 01/11/2017 01/12/2017    
1 01/03/2018 01/04/2018    
         
Table B        
Prop_no move_dt      
12 25/11/2017      
12 14/03/2018      
         
Result        
Sales_no Prop_no sale_dt comp_dt move_dt
1 12 01/11/2017 01/12/2017 25/11/2017
1 12 01/03/2018 01/04/2018 14/03/2018
1 12 01/11/2017 01/12/2017 14/03/2018
1 12 01/03/2018 01/04/2018 25/11/2017
         
Wanted        
Sales_no Prop_no sale_dt comp_dt move_dt
1 12 01/11/2017 01/12/2017 25/11/2017
1 12 01/03/2018 01/04/2018 14/03/2018

 

 


So what is the rule that says this one goes in the output:

1 12 01/11/2017 01/12/2017 25/11/2017

but not this one:

1 12 01/03/2018 01/04/2018 25/11/2017

 

You need to be able to state why certain combinations are wanted in the output and others excluded.

bethsmith
Fluorite | Level 6

As the moving date cannot be before a sale has been made and it needs to be before the entire process is completed.

 

And picking out that specific combination is what I'm really struggling with.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 2002 views
  • 0 likes
  • 4 in conversation