Hi,
The book on SAS said that when only some of the values of the BY variable match, a PROC SQL full join will produce the same result as a DATA step match-merge. Here is the example from the book:
data THREE;
input X A $;
datalines;
X A
1 a1
1 a2
2 b1
2 b2
4 d
data FOUR:
input X B $;
datalines;
X B
2 x1
2 x2
3 y
5 v
data merged;
merge three four;
by x;
run;
RESULT from Data step Merge:
X A B
1 a1
1 a2
2 b1 x1
2 b2 x2
3 y
4 d
5 v
proc sql;
select coalesce(three.x, four.x) as X, a, b
from three full join four
on three.x = four.x;
quit;
RESULT from proc sql join:
X A B
1 a2
1 a1
2 b1 x1
2 b1 x2
2 b2 x1
2 b2 x2
3 y
4 d
5 v
Clearly, the proc sql has 2 more rows than the data step merge. Is there something wrong with the code in the proc sql or the book itself was misleading by saying that the proc sql would produce the same result?
Thanks
What book? What exactly did it say?
If two or more contributing tables have two or more observations for the same BY group values then PROC SQL FULL JOIN will produce more observations. The SQL join will produce N*M observations for such a by group. The data step merge will produce MAX(N,M) observations.
They are only similar when only one of the contributing datasets has more than one observation in a BY group
What book? What exactly did it say?
If two or more contributing tables have two or more observations for the same BY group values then PROC SQL FULL JOIN will produce more observations. The SQL join will produce N*M observations for such a by group. The data step merge will produce MAX(N,M) observations.
They are only similar when only one of the contributing datasets has more than one observation in a BY group
"The book on SAS"?
Which of the many (over a hundred?) books on SAS that cover both match-merge and PROC SQL are you referring to?
Citation please, with the actual contents.
PS: There are certain instances in which this claim is true: Namely if, no BY value requires a many-to-many join.
Editted note: although order may very well be different.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.