I have two tables: TABLE_A and TABLE_B Both tables TABLE_A and TABLE_B are updated daily. I’d like to determine which observations in TABLE_A are “affected” by updates in TABLE_B. This is very specific business requirement. What makes it challenging is that there is no key that can be used to join the tables directly. The PROC SQL I have takes more than half hour to complete. See the code below. Wondering if anyone has a better solution. Note on the columns. Let’s say TABLE_B has a column called code_1, while TABLE_A has a corresponding column code_1_list. Values in code_1 column are alpha-numerical, while code_1_list values are quoted, and comma separated. For example: TABLE_B code_1 code1_A TABLE_A: code_1_list ‘code1_A’, ‘code1_B’, ‘code1_C’ I need to determine: if *any* observation in TABLE_B is in the list of corresponding column in TABLE_A, I need to include it (or keep/identify it). By the way, both tables TABLE_A and TABLE_B have 25 of these corresponding columns. Both tables TABLE_A and TABLE_B have ~15k observations on any given day. Here is a limited example (added notes for clarification): TABLE_B ID code_1 code_2 code_3 code_4 Note (corresponding to TABLE_A) 123 code1_A code2_B code3_W code4_D In ID: A1, code_1_list and A3, code_2_list 124 code1_E code2_B code3_P code4_R In ID: A3, code_2_list 125 code1_H code2_J code3_K code4_E 126 code1_A code2_P code3_A code4_K In ID: A1, code_1_list also ID: A3 code_2_list TABLE_A ID code_1_list code_2_list code_3_list code_4_list note A1 'code1_A', 'code1_C' 'code2_A', 'code2_K', 'code2_G' 'code3_F' 'code4_A', 'code4_L' keep: ID 123, code_1 in TABLE_B A2 'code1_B', 'code1_C', 'code1_J' 'code2_A', 'code2_G' 'code3_L' 'code4_I', 'code4_R' A3 'code1_A' 'code2_B', 'code2_P' 'code3_F' 'code4_A', 'code4_M' keep: ID 123, code_1 and ID 123 & 124, code_2 From the example above, only observations A1 and A3 are kept in TABLE_A. Your thoughts? It seems there might be a much simpler solution that is escaping me... SQL: proc sql; create table work.table_ab as select distinct A.* from work.TABLE_A A, work.TABLE_B B where ( A.code_1_list like (cats("'%",B.code_1,"%'")) or A.code_2_list like (cats("'%",B.code_2,"%'")) or A.code_3_list like (cats("'%",B.code_3,"%'")) or A.code_4_list like (cats("'%",B.code_4,"%'")) ) ; quit;
... View more