I need to run this merge in sql and I'm getting a cartesian product join. I took the step of eliminating duplicates from one of the joined datasets, but I still getting the message on the cartesian product. Any suggestions?
27
28
29 PROC SQL;
30 CREATE TABLE ICD10_2 AS
31 SELECT DISTINCT F1
32 FROM ICD10;
NOTE: Table WORK.ICD10_2 created, with 265 rows and 1 columns.
33 /* CREATE TABLE WORK.QUERY_FOR_V2_FCT_CLM AS */
34 SELECT /* COUNT_of_BENE_SK */
35 (COUNT(t1.BENE_SK)) AS COUNT_of_BENE_SK
36 FROM QUERY_FOR_V2_FCT_CLM t1, WORK.ICD10_2 t2
37 WHERE (t1.CLM_PRCDR_1_CD = t2.F1
38 OR t1.CLM_PRCDR_2_CD = t2.F1);
NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized.
39 QUIT;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.13 seconds
cpu time 0.10 seconds
I assume you just want the number of valid BENE_SK values from QUERY_FOR_V2_FCT_CLM - so it may help to rewrite the join a simple WHERE clause:
SELECT /* COUNT_of_BENE_SK */
(COUNT(t1.BENE_SK)) AS COUNT_of_BENE_SK
FROM QUERY_FOR_V2_FCT_CLM t1
WHERE CLM_PRCDR_1_CD in (select F1 from WORK.ICD10_2)
or CLM_PRCDR_2_CD in(select F1 from WORK.ICD10_2);
I assume you just want the number of valid BENE_SK values from QUERY_FOR_V2_FCT_CLM - so it may help to rewrite the join a simple WHERE clause:
SELECT /* COUNT_of_BENE_SK */
(COUNT(t1.BENE_SK)) AS COUNT_of_BENE_SK
FROM QUERY_FOR_V2_FCT_CLM t1
WHERE CLM_PRCDR_1_CD in (select F1 from WORK.ICD10_2)
or CLM_PRCDR_2_CD in(select F1 from WORK.ICD10_2);
Hi @Batman
Try this query, it uses Boolean expression
PROC SQL;
CREATE TABLE ICD10_2 AS
SELECT DISTINCT F1
FROM ICD10;
/* CREATE TABLE WORK.QUERY_FOR_V2_FCT_CLM AS */
SELECT /* COUNT_of_BENE_SK */
(COUNT(t1.BENE_SK)) AS COUNT_of_BENE_SK
FROM QUERY_FOR_V2_FCT_CLM t1
WHERE exists (select 'x' from WORK.ICD10_2 t2 where t1.CLM_PRCDR_1_CD = t2.F1
OR t1.CLM_PRCDR_2_CD = t2.F1);
quit;
Hope this helps
See this alternative:
data want;
set QUERY_FOR_V2_FCT_CLM end=done;
if _n_ = 1
then do;
length f1 /* define here, as it is in icd10 */;
declare hash icd10 (dataset:"icd10");
Idc10.definekey("f1");
icd10.definedone();
call missing(f1);
end;
if icd10.check(key:CLM_PRCDR_1_CD) = 0 or icd10.check(key:CLM_PRCDR_2_CD) = 0 then COUNT_of_BENE_SK + 1;
if done;
keep COUNT_of_BENE_SK;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.