BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ujjawal
Quartz | Level 8

I have two data sets.

ID Name Height

1 A 1

3 B 2

5 C 2

7 D 2

9 E 2

ID Name Height

2 A 2

4 B 3

5 C 4

7 D 5

I am merging them using MERGE and SQL Left Join. MERGE returns correct matching value for same values in the common variable (Primary Key) , while SQL left join does not update values.

Data A;

Input ID Name$ Height;

cards;

1 A 1

3 B 2

5 C 2

7 D 2

9 E 2

;

run;

Data B;

Input ID Name$ Height;

cards;

2 A 2

4 B 3

5 C 4

7 D 5

;

run;

PROC SQL;

Create table dummy as

Select * from A as file1 left join B as file2

On file1.ID = file2.ID;

Quit;

PROC SQL returns :

ID Name Height

1 A 1

3 B 2

5 C 2

7 D 2

9 E 2

Data dummy;

merge A (IN=A) B (IN=B);

by ID;

If A;

run;

MERGE returns :

ID Name Height

1 A 1

3 B 2

5 C 4

7 D 5

9 E 2

Why PROC SQL LEFT JOIN does not update common values?

1 ACCEPTED SOLUTION

Accepted Solutions
AnnaBrown
Community Manager

Hi 

 

 

 

@SASKiwi gives a good example of a 3 table join. 

 

The example @Ujjawal gave though is for a match-merge which would generate (potentially) different results. HERE is a link to a SUGI 30 paper highlighting the differences (there are many other papers as well).  There are reasons to use the different types of merge/joins depending on the need. 

 

Steve

 


Join us for SAS Community Trivia
SAS Bowl XXIX, The SAS Hackathon
Wednesday, March 8, 2023, at 10 AM ET | #SASBowl

View solution in original post

7 REPLIES 7
Patrick
Opal | Level 21

In your SQL you're joining 2 tables with common variable names (other than the one you join on). If you execute such code you will see below warnings in your code:

WARNING: Variable ID already exists on file WORK.DUMMY.

WARNING: Variable Name already exists on file WORK.DUMMY.

WARNING: Variable Height already exists on file WORK.DUMMY.

With the SAS SQL flavor a "select *" will take the variable from the first table and ignore the one from the second table (that's what the warning tells you) - other SQL flavors (eg. Oracle) would throw an error.

With a SAS Merge the last listed same named variable will overwrite the same named variable from a table listed earlier in the merge statement. That's why in this case you're getting the values from table 'B'.

stat_sas
Ammonite | Level 13

There is difference in data step and sql processing. Data merging works sequentially checks each observation of each data set to match these by BY value to write it to combined dataset and same name variables in the first data set are overwritten with the liked named variable in subsequent data sets . SQL processing is based on cartesian product where processing moves horizontally and same name variables coming from different data sets can not be overwritten.

Ksharp
Super User

If you like SQL, try this one :

Data A;
Input ID Name$ Height;
cards;
1 A 1
3 B 2
5 C 2
7 D 2
9 E 2
;
run;
 
Data B;
Input ID Name$ Height;
cards;
2 A 2
4 B 3
5 C 4
7 D 5
;
run;

PROC SQL;
Create table dummy as
Select file1.ID,file1.Name,coalesce(file2.Height,file1.Height) as Height from A as file1 left join B as file2
On file1.ID = file2.ID and file1.name = file2.name;
Quit;

Xia Keshan

hellind
Quartz | Level 8

Using PROC SQL, how do I overwrite the height value without creating a new variable and then dropping the existing and renaming the new to existing?

PROC SQL;

Create table dummy as

Select *

     , case when NAME = 'A' then 5 else 6 end as height

from A as file1 left join B as file2 On file1.ID = file2.ID;

Quit;

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

You should really create a new topic rather than replying to an old post.  For your question, the reason you are getting a new variable at the end is because you are using the * notation.  This selects all variables.  If you write SQL correctly, then you would specify each variable in the select, where you want it to appear, its attributes, and what data to collect, e.g.

proc sql;

     create table WANT as

     select     A.FIRST_VARIABLE as FIRST_VARIABLE,

                     B.NEW_VARIABLE as SECOND_VARIABLE,

                     case  when A.TEST_VAR=1 then 'Found'

                                else   'Not Found' end as THIRD_VARIABLE length=20,

                     A.LAST_VAR as FOURTH_VARIABLE

     from        HAVE A

     left join   HAVE2 B

     on           A.VAR=B.VAR;

quit;

Ksharp
Super User

Yeah. Start a new session. and Post some sample data and the output you want , that would be better to explain your question.

AnnaBrown
Community Manager

Hi 

 

 

 

@SASKiwi gives a good example of a 3 table join. 

 

The example @Ujjawal gave though is for a match-merge which would generate (potentially) different results. HERE is a link to a SUGI 30 paper highlighting the differences (there are many other papers as well).  There are reasons to use the different types of merge/joins depending on the need. 

 

Steve

 


Join us for SAS Community Trivia
SAS Bowl XXIX, The SAS Hackathon
Wednesday, March 8, 2023, at 10 AM ET | #SASBowl

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 22442 views
  • 2 likes
  • 7 in conversation