@pavank A data step MERGE differs from a PROC SQL in: (1) The sort order of the key variables When using a SQL join the observations in either data set do not have to be sorted, a DATA step MERGE requires the key variables (BY variables) to have been sorted in a step before (2) Cartesian products All joins are Cartesian products made on specified key variables. In left join , where there are multiple key values, a complete Cartesian product is created. With merge, the input observations are joined in sequence where they match. 3) Alias in proc sql you can use alias name for table to to identify variables non-key, in DATA Step MERGE can be used with a logical IN variable to identify which data set contributed the key values, but the variable names non key, must be unique to each data set since their source data set cannot be identified with an alias. (in your example if you try to generate a table with left join, without rename variables, you take a warning message for Variable Name and ID too) proc sql number;
create table leftjoin2 as
select * from exp1 a
left join exp2 b
on a.id=b.id;
quit; WARNING: Variable ID already exists on file WORK.LEFTJOIN2. WARNING: Variable Name already exists on file WORK.LEFTJOIN2. proc sql number;
create table leftjoin2 as
select a.*,b.Sex
from exp1 a
left join exp2 b
on a.id=b.id;
quit; 4) Runtime PROC SQL join is faster than a DATA Step MERGE; you can use indexex in both to improve performance I recommend you read this article! https://support.sas.com/resources/papers/proceedings/proceedings/sugi25/25/cc/25p109.pdf The paper explain the principles difference between proc sql and merge.
... View more