I have two data sets.
ID Name Height
1 A 1
3 B 2
5 C 2
7 D 2
9 E 2
ID Name Height
2 A 2
4 B 3
5 C 4
7 D 5
I am merging them using MERGE and SQL Left Join. MERGE returns correct matching value for same values in the common variable (Primary Key) , while SQL left join does not update values.
Data A;
Input ID Name$ Height;
cards;
1 A 1
3 B 2
5 C 2
7 D 2
9 E 2
;
run;
Data B;
Input ID Name$ Height;
cards;
2 A 2
4 B 3
5 C 4
7 D 5
;
run;
PROC SQL;
Create table dummy as
Select * from A as file1 left join B as file2
On file1.ID = file2.ID;
Quit;
PROC SQL returns :
ID Name Height
1 A 1
3 B 2
5 C 2
7 D 2
9 E 2
Data dummy;
merge A (IN=A) B (IN=B);
by ID;
If A;
run;
MERGE returns :
ID Name Height
1 A 1
3 B 2
5 C 4
7 D 5
9 E 2
Why PROC SQL LEFT JOIN does not update common values?
Hi Ujjawal,
This other similar question you asked on the community will help here: https://communities.sas.com/t5/SAS-Data-Management/PROC-SQL-Joins-on-3-Tables/m-p/196808#M4324
Here's the solution from SteveED on the above-mentioned thread:
@SASKiwi gives a good example of a 3 table join.
The example @Ujjawal gave though is for a match-merge which would generate (potentially) different results. HERE is a link to a SUGI 30 paper highlighting the differences (there are many other papers as well). There are reasons to use the different types of merge/joins depending on the need.
Steve
Anna
In your SQL you're joining 2 tables with common variable names (other than the one you join on). If you execute such code you will see below warnings in your code:
WARNING: Variable ID already exists on file WORK.DUMMY.
WARNING: Variable Name already exists on file WORK.DUMMY.
WARNING: Variable Height already exists on file WORK.DUMMY.
With the SAS SQL flavor a "select *" will take the variable from the first table and ignore the one from the second table (that's what the warning tells you) - other SQL flavors (eg. Oracle) would throw an error.
With a SAS Merge the last listed same named variable will overwrite the same named variable from a table listed earlier in the merge statement. That's why in this case you're getting the values from table 'B'.
There is difference in data step and sql processing. Data merging works sequentially checks each observation of each data set to match these by BY value to write it to combined dataset and same name variables in the first data set are overwritten with the liked named variable in subsequent data sets . SQL processing is based on cartesian product where processing moves horizontally and same name variables coming from different data sets can not be overwritten.
If you like SQL, try this one :
Data A; Input ID Name$ Height; cards; 1 A 1 3 B 2 5 C 2 7 D 2 9 E 2 ; run; Data B; Input ID Name$ Height; cards; 2 A 2 4 B 3 5 C 4 7 D 5 ; run; PROC SQL; Create table dummy as Select file1.ID,file1.Name,coalesce(file2.Height,file1.Height) as Height from A as file1 left join B as file2 On file1.ID = file2.ID and file1.name = file2.name; Quit;
Xia Keshan
Using PROC SQL, how do I overwrite the height value without creating a new variable and then dropping the existing and renaming the new to existing?
PROC SQL;
Create table dummy as
Select *
, case when NAME = 'A' then 5 else 6 end as height
from A as file1 left join B as file2 On file1.ID = file2.ID;
Quit;
Hi,
You should really create a new topic rather than replying to an old post. For your question, the reason you are getting a new variable at the end is because you are using the * notation. This selects all variables. If you write SQL correctly, then you would specify each variable in the select, where you want it to appear, its attributes, and what data to collect, e.g.
proc sql;
create table WANT as
select A.FIRST_VARIABLE as FIRST_VARIABLE,
B.NEW_VARIABLE as SECOND_VARIABLE,
case when A.TEST_VAR=1 then 'Found'
else 'Not Found' end as THIRD_VARIABLE length=20,
A.LAST_VAR as FOURTH_VARIABLE
from HAVE A
left join HAVE2 B
on A.VAR=B.VAR;
quit;
Yeah. Start a new session. and Post some sample data and the output you want , that would be better to explain your question.
Hi Ujjawal,
This other similar question you asked on the community will help here: https://communities.sas.com/t5/SAS-Data-Management/PROC-SQL-Joins-on-3-Tables/m-p/196808#M4324
Here's the solution from SteveED on the above-mentioned thread:
@SASKiwi gives a good example of a 3 table join.
The example @Ujjawal gave though is for a match-merge which would generate (potentially) different results. HERE is a link to a SUGI 30 paper highlighting the differences (there are many other papers as well). There are reasons to use the different types of merge/joins depending on the need.
Steve
Anna
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.