I need to run this merge in sql and I'm getting a cartesian product join. I took the step of eliminating duplicates from one of the joined datasets, but I still getting the message on the cartesian product. Any suggestions?
27
28
29 PROC SQL;
30 CREATE TABLE ICD10_2 AS
31 SELECT DISTINCT F1
32 FROM ICD10;
NOTE: Table WORK.ICD10_2 created, with 265 rows and 1 columns.
33 /* CREATE TABLE WORK.QUERY_FOR_V2_FCT_CLM AS */
34 SELECT /* COUNT_of_BENE_SK */
35 (COUNT(t1.BENE_SK)) AS COUNT_of_BENE_SK
36 FROM QUERY_FOR_V2_FCT_CLM t1, WORK.ICD10_2 t2
37 WHERE (t1.CLM_PRCDR_1_CD = t2.F1
38 OR t1.CLM_PRCDR_2_CD = t2.F1);
NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized.
39 QUIT;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.13 seconds
cpu time 0.10 seconds
I assume you just want the number of valid BENE_SK values from QUERY_FOR_V2_FCT_CLM - so it may help to rewrite the join a simple WHERE clause:
SELECT /* COUNT_of_BENE_SK */
(COUNT(t1.BENE_SK)) AS COUNT_of_BENE_SK
FROM QUERY_FOR_V2_FCT_CLM t1
WHERE CLM_PRCDR_1_CD in (select F1 from WORK.ICD10_2)
or CLM_PRCDR_2_CD in(select F1 from WORK.ICD10_2);
I assume you just want the number of valid BENE_SK values from QUERY_FOR_V2_FCT_CLM - so it may help to rewrite the join a simple WHERE clause:
SELECT /* COUNT_of_BENE_SK */
(COUNT(t1.BENE_SK)) AS COUNT_of_BENE_SK
FROM QUERY_FOR_V2_FCT_CLM t1
WHERE CLM_PRCDR_1_CD in (select F1 from WORK.ICD10_2)
or CLM_PRCDR_2_CD in(select F1 from WORK.ICD10_2);
Hi @Batman
Try this query, it uses Boolean expression
PROC SQL;
CREATE TABLE ICD10_2 AS
SELECT DISTINCT F1
FROM ICD10;
/* CREATE TABLE WORK.QUERY_FOR_V2_FCT_CLM AS */
SELECT /* COUNT_of_BENE_SK */
(COUNT(t1.BENE_SK)) AS COUNT_of_BENE_SK
FROM QUERY_FOR_V2_FCT_CLM t1
WHERE exists (select 'x' from WORK.ICD10_2 t2 where t1.CLM_PRCDR_1_CD = t2.F1
OR t1.CLM_PRCDR_2_CD = t2.F1);
quit;
Hope this helps
See this alternative:
data want;
set QUERY_FOR_V2_FCT_CLM end=done;
if _n_ = 1
then do;
length f1 /* define here, as it is in icd10 */;
declare hash icd10 (dataset:"icd10");
Idc10.definekey("f1");
icd10.definedone();
call missing(f1);
end;
if icd10.check(key:CLM_PRCDR_1_CD) = 0 or icd10.check(key:CLM_PRCDR_2_CD) = 0 then COUNT_of_BENE_SK + 1;
if done;
keep COUNT_of_BENE_SK;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.