I am trying to merge two dataset with multiple conditions, in proc sql but when both conditions fail it is giving me a cartisian product of the first variables in dataset with all other variables in second dataset. if any of the below two conditions doesnt match then i dont want to have cartesian product for that. i am getting a cartesian product with all the non matching record in first dataset getting merged with all the records for that subject in second dataset. I dont want that to happen is there any better to merge.
proc sql;
create table pk_st_atr1 as
select st_atrx.*,pk_1x.adt,atm,pkstresc,pkstresu, case
when adt^= . then 'Y' end as pk_tr_fl
from st_atrx as X full join pk_1x as Y
on x.logdt ^= . and y.adt ^= . and x.logtm ^= . and y.atm ^= . and
(x.subject=y.subject and x.visitnum=y.visitnum and x.pktpt_std= y.pktpt_std and x.accession_number= y.accession_number and x.sample_id= y.sample_id)
or (x.subject=y.subject and x.visitnum=y.visitnum and x.accession_number= y.accession_number and x.sample_id= y.sample_id and x.logdt=y.adt and x.logtm=y.atm )
;
quit;
run;
Your code is a mess.
Write legible code and the issue with your OR clause will become obvious.
You might something like:
proc sql;
create table PK_ST_ATR1 as
select ST_ATRX.*
,PK_1X.ADT
,ATM
,PKSTRESC
,PKSTRESU
,case when ADT^= . then 'Y' end as PK_TR_FL
from ST_ATRX as X
full join
PK_1X as Y
on x.LOGDT ^= .
and y.ADT ^= .
and x.LOGTM ^= .
and y.ATM ^= .
and x.SUBJECT = y.SUBJECT
and x.VISITNUM = y.VISITNUM
and x.ACCESSION_NUMBER= y.ACCESSION_NUMBER
and x.SAMPLE_ID = y.SAMPLE_ID
and ( (x.PKTPT_STD = y.PKTPT_STD )
or(x.LOGDT=y.ADT and x.LOGTM=y.ATM ) ) ;
Code should be pretty. This will make your life much easier.
Also, verify if:
1. Do you really want a full join?
2. Should the test on missing values be in a WHERE clause?
Your code is a mess.
Write legible code and the issue with your OR clause will become obvious.
You might something like:
proc sql;
create table PK_ST_ATR1 as
select ST_ATRX.*
,PK_1X.ADT
,ATM
,PKSTRESC
,PKSTRESU
,case when ADT^= . then 'Y' end as PK_TR_FL
from ST_ATRX as X
full join
PK_1X as Y
on x.LOGDT ^= .
and y.ADT ^= .
and x.LOGTM ^= .
and y.ATM ^= .
and x.SUBJECT = y.SUBJECT
and x.VISITNUM = y.VISITNUM
and x.ACCESSION_NUMBER= y.ACCESSION_NUMBER
and x.SAMPLE_ID = y.SAMPLE_ID
and ( (x.PKTPT_STD = y.PKTPT_STD )
or(x.LOGDT=y.ADT and x.LOGTM=y.ATM ) ) ;
Code should be pretty. This will make your life much easier.
Also, verify if:
1. Do you really want a full join?
2. Should the test on missing values be in a WHERE clause?
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.