BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
PrinceAde
Obsidian | Level 7

Hello there, 

Data three;
input x a$;
datalines;
1 a1
1 a2
2 b1
2 b2
4 d
;
run;


Data four;
input x b$;
datalines;
2 x1
2 x2
3 y
5 v
;
run;


proc sql;
title 'Table Merged';
select coalesce(three.x, four.x)
as x, a, b
from three
full join
four
on three.x = four.x;
quit;

table1
x     a      b

1     a2    

 

data merged;
merge three four;
by x;
run;

table2;

x    a    b

1    a1    

 

I don't understand While the proc sq join starts with table1 instead of table2.

Please clariification?

Thanks

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

With some SQL queries involving the same data set you can get different output order. SQL is designed for SET operations not row -by-row operations. So if you do not specify something to control the order it can be problematic.

 

And see if you get the order I do by changing the Select to a Select Distinct:

proc sql;
title 'Table Merged';
select distinct coalesce(three.x, four.x)
as x, a, b
from three
full join
four
on three.x = four.x;
quit;

Which gives me an order of:

x a b
1 a1  
1 a2  
2 b1 x1
2 b1 x2
2 b2 x1
2 b2 x2
3   y
4 d  
5   v

 

 

Merge however does process row-by-row.

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

I don't understand the question.

--
Paige Miller
ballardw
Super User

Please reread your posted code very carefully. Then rephrase you question in terms of the code shown.

Your code does not mention "table1" or "table2" anywhere. It shows something that would make you think that those "tables" were created but if you run that code you will see all sorts of errors because where Table1 and Table2 appear they are not valid as code.

 

My best guess is that you are looking at a miss-matched "code" with "expected results" that do not use the table names with "example" results because of a cut-and-paste error.

 


@PrinceAde wrote:

Hello there, 

Data three;
input x a$;
datalines;
1 a1
1 a2
2 b1
2 b2
4 d
;
run;


Data four;
input x b$;
datalines;
2 x1
2 x2
3 y
5 v
;
run;


proc sql;
title 'Table Merged';
select coalesce(three.x, four.x)
as x, a, b
from three
full join
four
on three.x = four.x;
quit;

table1
x     a      b

1     a2    

 

data merged;
merge three four;
by x;
run;

table2;

x    a    b

1    a1    

 

I don't understand While the proc sq join starts with table1 instead of table2.

Please clariification?

Thanks

 

 


 

PrinceAde
Obsidian | Level 7

Hi @ballardw  @PaigeMiller 

 

I'm sorry for the confusion. If you check the result of the match merging, you will notice that the first observation is 

x    a    b

1    a1

 

while for the proc sql full join it is

x    a    b

1    a2

I'm not sure why the sql join jumped the first observation 1  a1 in table three and decide to start from 1  a2

 

 

Data three;
input x a$;
datalines;
1 a1
1 a2
2 b1
2 b2
4 d
;
run;


Data four;
input x b$;
datalines;
2 x1
2 x2
3 y
5 v
;
run;

proc sql;
title 'Table Merged';
select coalesce(three.x, four.x)
as x, a, b
from three
full join
four
on three.x = four.x;
quit;

 

data merged;
merge three four;
by x;
run;

 

 

 

PaigeMiller
Diamond | Level 26

In SQL, the order of the output can be changed by using ORDER BY in your SQL statements.

 

Also please note that although SQL and MERGE in a DATA step do very similar things, they are not identical and in some cases will not produce the same results.

 

 

--
Paige Miller
PrinceAde
Obsidian | Level 7

@PaigeMiller 

Thank you, sir. I understand this; I just want to know if there is any rationale for the Proc SQL join behaving this way.

PaigeMiller
Diamond | Level 26

@PrinceAde wrote:

@PaigeMiller 

Thank you, sir. I understand this; I just want to know if there is any rationale for the SQL behaving this way.


The default ordering of records in an SQL output is determined by the internal SQL algorithm. I am not aware of any explanation of what that internal SQL algorithm does, and its kind of irrelevant anyway as you can change the ordering of the records to whatever you want.

--
Paige Miller
ballardw
Super User

With some SQL queries involving the same data set you can get different output order. SQL is designed for SET operations not row -by-row operations. So if you do not specify something to control the order it can be problematic.

 

And see if you get the order I do by changing the Select to a Select Distinct:

proc sql;
title 'Table Merged';
select distinct coalesce(three.x, four.x)
as x, a, b
from three
full join
four
on three.x = four.x;
quit;

Which gives me an order of:

x a b
1 a1  
1 a2  
2 b1 x1
2 b1 x2
2 b2 x1
2 b2 x2
3   y
4 d  
5   v

 

 

Merge however does process row-by-row.

Tom
Super User Tom
Super User

Nothing has been "skipped".  All observations are accounted for.

 

The SQL full join produces these records. 

Tom_0-1718127416536.png

So you can see both the observations with X=1 are there.

 

The data step merge does a completely different operation and produces these results:

Tom_1-1718127500248.png

Notice the big difference in the way that the multiple observations with X=2 are handled.  In SQL a full cartesian product is produced.  So 2 observations from TABLE3 and 2 observations from TABLE4 generate 2x2=4 observations output.  In general for N observations from one dataset and M observations from the other then the full join will produce NXM observations.

 

In the DATA step the observations a matched in the order they appear.  So the 2 observations from TABLE3 are matched with the 2 observations from TABLE4 and the result is 2 observations.  In general for N observations from one dataset and M observations from the other the merge will result in MAX(N,M) observations.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 730 views
  • 0 likes
  • 4 in conversation