Hello,

I am using gedscore to join two datasets by name. (I am really trying to do a fuzzy join to look for duplicate names between the two datasets.) I keep getting the error:

"NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized"

Here is my code:

%let maxscore=201;
proc sql;
create table survey_trackingdups as
select a.* , b.*,
compged(a.name2,b.name1,&maxscore,'iL' ) as gedscore
from tracking as a, survey as b
where calculated gedscore < &maxscore
order by calculated gedscore;
quit;

I previously formatted both datasets so that name1 and name2 have the same format. Any idea what is going on with the join? This code has worked for me before when joining other datasets. I have looked into this error and can't find any helpful solutions in other community posts.

Thanks,
Clare