<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic DATA Step MERGE vs PROC SQL JOIN in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/DATA-Step-MERGE-vs-PROC-SQL-JOIN/m-p/907209#M40573</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The book on SAS said that when only some of the values of the BY variable match, a PROC SQL full join will produce the same result as a DATA step match-merge. Here is the example from the book:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data THREE;
  input X A $;
datalines;
X   A
1   a1
1   a2
2   b1
2   b2
4   d

data FOUR:
  input X B $;
datalines;
X   B
2   x1
2   x2
3   y
5   v

data merged;
  merge three four;
  by x;
run;

RESULT from Data step Merge:
X   A     B
1   a1    
1   a2
2   b1   x1
2   b2   x2
3        y
4   d
5        v

proc sql;
  select coalesce(three.x, four.x) as X, a, b
  from three full join four
  on three.x = four.x;
quit;

RESULT from proc sql join:
X   A   B
1   a2  
1   a1
2   b1   x1
2   b1   x2
2   b2   x1
2   b2   x2
3        y
4   d
5        v&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Clearly, the proc sql has 2 more rows than the data step merge. Is there something wrong with the code in the proc sql or the book itself was misleading by saying that the proc sql would produce the same result?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;</description>
    <pubDate>Sun, 10 Dec 2023 22:45:30 GMT</pubDate>
    <dc:creator>cosmid</dc:creator>
    <dc:date>2023-12-10T22:45:30Z</dc:date>
    <item>
      <title>DATA Step MERGE vs PROC SQL JOIN</title>
      <link>https://communities.sas.com/t5/New-SAS-User/DATA-Step-MERGE-vs-PROC-SQL-JOIN/m-p/907209#M40573</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The book on SAS said that when only some of the values of the BY variable match, a PROC SQL full join will produce the same result as a DATA step match-merge. Here is the example from the book:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data THREE;
  input X A $;
datalines;
X   A
1   a1
1   a2
2   b1
2   b2
4   d

data FOUR:
  input X B $;
datalines;
X   B
2   x1
2   x2
3   y
5   v

data merged;
  merge three four;
  by x;
run;

RESULT from Data step Merge:
X   A     B
1   a1    
1   a2
2   b1   x1
2   b2   x2
3        y
4   d
5        v

proc sql;
  select coalesce(three.x, four.x) as X, a, b
  from three full join four
  on three.x = four.x;
quit;

RESULT from proc sql join:
X   A   B
1   a2  
1   a1
2   b1   x1
2   b1   x2
2   b2   x1
2   b2   x2
3        y
4   d
5        v&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Clearly, the proc sql has 2 more rows than the data step merge. Is there something wrong with the code in the proc sql or the book itself was misleading by saying that the proc sql would produce the same result?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Sun, 10 Dec 2023 22:45:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/DATA-Step-MERGE-vs-PROC-SQL-JOIN/m-p/907209#M40573</guid>
      <dc:creator>cosmid</dc:creator>
      <dc:date>2023-12-10T22:45:30Z</dc:date>
    </item>
    <item>
      <title>Re: DATA Step MERGE vs PROC SQL JOIN</title>
      <link>https://communities.sas.com/t5/New-SAS-User/DATA-Step-MERGE-vs-PROC-SQL-JOIN/m-p/907211#M40574</link>
      <description>&lt;P&gt;What book? What exactly did it say?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If two or more contributing tables have two or more observations for the same BY group values then PROC SQL FULL JOIN will produce more observations. The SQL join will produce N*M observations for such a by group.&amp;nbsp; The data step merge will produce MAX(N,M) observations.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;They are only similar when only one of the contributing datasets has more than one observation in a BY group&lt;/P&gt;</description>
      <pubDate>Mon, 11 Dec 2023 00:36:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/DATA-Step-MERGE-vs-PROC-SQL-JOIN/m-p/907211#M40574</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2023-12-11T00:36:12Z</dc:date>
    </item>
    <item>
      <title>Re: DATA Step MERGE vs PROC SQL JOIN</title>
      <link>https://communities.sas.com/t5/New-SAS-User/DATA-Step-MERGE-vs-PROC-SQL-JOIN/m-p/907212#M40575</link>
      <description>&lt;P&gt;"The book on SAS"?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Which of the many (over a hundred?) books on SAS that cover both match-merge and PROC SQL are you referring to?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Citation please, with the actual contents.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;PS:&amp;nbsp; There are certain instances in which this claim is true:&amp;nbsp; Namely if, no BY value requires a many-to-many join.&lt;/P&gt;
&lt;P&gt;Editted note: although order may very well be different.&lt;/P&gt;</description>
      <pubDate>Mon, 11 Dec 2023 00:38:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/DATA-Step-MERGE-vs-PROC-SQL-JOIN/m-p/907212#M40575</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2023-12-11T00:38:16Z</dc:date>
    </item>
    <item>
      <title>Re: DATA Step MERGE vs PROC SQL JOIN</title>
      <link>https://communities.sas.com/t5/New-SAS-User/DATA-Step-MERGE-vs-PROC-SQL-JOIN/m-p/908522#M40637</link>
      <description>The book is SAS Certified Professional Prep Guide. &lt;BR /&gt;&lt;BR /&gt;I went back and read the section again and realized I made a mistake. It said:&lt;BR /&gt;When the COALESCE function is added to the preceding PROC SQL full outer join, the DATA step match-merge (with PROC PRINT step) and the PROC SQL full outer join combine rows in the same way. &lt;BR /&gt;&lt;BR /&gt;So, I mistakenly thought the output will be the same and now I realized they only COMBINE them in the same way.</description>
      <pubDate>Sun, 17 Dec 2023 21:20:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/DATA-Step-MERGE-vs-PROC-SQL-JOIN/m-p/908522#M40637</guid>
      <dc:creator>cosmid</dc:creator>
      <dc:date>2023-12-17T21:20:11Z</dc:date>
    </item>
  </channel>
</rss>

