- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have two data sets and want to do a one to many merge to get the Material field below. The problem is that the field I want to join with is not a complete field - it is only part of the join in the other data set. Is there a way to join using a like or contains within the join? I read that an inner join would work but it does not. Thank you!
PLANNER_FILE
Prom_Cd Level3_A
C-00118868 ACCIDENT
C-00120397 HANDBOOK
MATERIAL_DATA
LEVEL3 MATERIAL
CANADIAN HANDBOOKS 892
CANADIAN HANDBOOKS 1015
CANADIAN HANDBOOKS 13546
CANADIAN HANDBOOKS 13555
CANADIAN HANDBOOKS 41968
FMCSR HANDBOOK 765
FMCSR HANDBOOK 7545
FMCSR HANDBOOK 43212
FMCSR HANDBOOK 4003162
FMCSR HANDBOOK 4011027
HANDBOOK KITS 37649
HANDBOOK KITS 37651
HANDBOOK KITS 37652
HANDBOOK KITS 37653
HANDBOOK KITS 37655
HANDBOOK KITS 37657
HANDBOOK KITS 37658
Current Code - no output
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_FIX_PLANNER_FILE_0002 AS
SELECT t1.PROMO_CODE,
t1.LEVEL3_A,
t2.LEVEL3,
t2.MATERIAL AS MATERIAL1
FROM WORK.PLANNER_FILE t1
INNER JOIN WORK.MATERIAL_DATA t2 ON (t1.LEVEL3_A = t2.LEVEL3);
QUIT;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
please try FROM WORK.PLANNER_FILE t1
INNER JOIN WORK.MATERIAL_DATA t2 ON find(t2.LEVEL3,trim(t1.LEVEL3_A) ) > 0
if you see result with this use distinct in your select
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
untested code
FROM WORK.PLANNER_FILE t1
INNER JOIN WORK.MATERIAL_DATA t2 ON find(t2.LEVEL3,t1.LEVEL3_A ) > 0
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I tried the find statement. It is not erroring out but I am getting the following message in the log and no output.
NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized.
NOTE: Table WORK.QUERY_FOR_FIX_PLANNER_FILE_0002 created, with 0 rows and 4 columns.
42 QUIT;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Any join on like condition will be always a Cartesian product. Currently I do not have SAS handy, so I cannot test any code.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
but no output.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
please try FROM WORK.PLANNER_FILE t1
INNER JOIN WORK.MATERIAL_DATA t2 ON find(t2.LEVEL3,trim(t1.LEVEL3_A) ) > 0
if you see result with this use distinct in your select
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yes that did it! Thank you so much.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
https://communities.sas.com/t5/SAS-Enterprise-Guide/Join-using-a-partial-match/td-p/172335
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
In your SQL join you can use =* to mean like, but your question is a bit unclear. What are you expecting back if you have the data shown in your post, what's the sample output?
There's a really good example (don't have time to look right now) on SQL fuzzy matching from I think either Tom or MKeintz.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Did you not try using the CONTAINS operator? Remember to trim the trailing spaces from the small string that you are looking for in the larger string.
data PLANNER_FILE;
length Promo_Code Level3_A $20 ;
input Promo_Code Level3_A ;
cards;
C-00118868 ACCIDENT
C-00120397 HANDBOOK
;
data MATERIAL_DATA;
infile cards dsd dlm='|';
length LEVEL3 $20 MATERIAL 8 ;
input level3 material ;
cards;
CANADIAN HANDBOOKS| 892
CANADIAN HANDBOOKS| 1015
CANADIAN HANDBOOKS| 13546
CANADIAN HANDBOOKS| 13555
CANADIAN HANDBOOKS| 41968
FMCSR HANDBOOK| 765
FMCSR HANDBOOK| 7545
FMCSR HANDBOOK| 43212
FMCSR HANDBOOK| 4003162
FMCSR HANDBOOK| 4011027
HANDBOOK KITS| 37649
HANDBOOK KITS| 37651
HANDBOOK KITS| 37652
HANDBOOK KITS| 37653
HANDBOOK KITS| 37655
HANDBOOK KITS| 37657
HANDBOOK KITS| 37658
;
PROC SQL;
CREATE TABLE WORK.QUERY_FOR_FIX_PLANNER_FILE_0002 AS
SELECT t1.PROMO_CODE
, t1.LEVEL3_A
, t2.LEVEL3
, t2.MATERIAL AS MATERIAL1
FROM WORK.PLANNER_FILE t1
INNER JOIN WORK.MATERIAL_DATA t2
ON (t2.LEVEL3 contains trim(t1.Level3_A))
;
QUIT;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Running that code I do get the message "The execution of this query involves performing one or more Cartesian product joins that can not be optimized." yet shows the 'correct' result. Should I be worried when I run a similar JOIN with more than 100,000 rows to join?