<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: proc sql join multiple datasets in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/proc-sql-join-multiple-datasets/m-p/925449#M364203</link>
    <description>&lt;P&gt;A very important question to ask is are the values of ID duplicated in any of the sets? If so, how do you want the result to appear?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your data is sorted (as implied by the example data sets) by ID and none of the ID are duplicated within a single set then a Data step merge is going to be way easier:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data want;
   merge dat1 dat2 dat3;
   by id;
run;&lt;/PRE&gt;
&lt;P&gt;The question about duplicates is very important because SQL joins are likely to result in multiple output rows.&lt;/P&gt;
&lt;P&gt;An example&lt;/P&gt;
&lt;PRE&gt;data dat1;
input id x1;
cards;
1 1
2 0
2 1
;
run;

data dat2;
input id x2;
cards;
2 1 
2 3 
5 0 
;
run;

proc sql;
   create table example as
   select a.id,a.x1,b.x2
   from dat1 as a
        left join 
        dat2 as b
        on a.id=b.id
   ;
quit;&lt;/PRE&gt;
&lt;P&gt;Note that the two values from Id=2 are combined with both Id=2 in the other set resulting in 4 id=2 observations (and yes, this example does get the Id=5 from set 2 as that is another headache in SQL)&lt;/P&gt;</description>
    <pubDate>Tue, 23 Apr 2024 20:07:03 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2024-04-23T20:07:03Z</dc:date>
    <item>
      <title>proc sql join multiple datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sql-join-multiple-datasets/m-p/925439#M364197</link>
      <description>&lt;P&gt;I have several datasets that I want to use proc sql to join. but here to simplify, I use 3 datasets as example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data dat1;
input id x1;
cards;
1 1
2 0
3 0
;
run;

data dat2;
input id x2;
cards;
2 1 
3 1 
5 0 
;
run;

data dat3;
input id x3;
cards;
3 0 
4 1 
6 0 
;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I want to merge them based one id, with one id per row, the output should be something like this:&lt;/P&gt;
&lt;TABLE border="1" width="100%"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="25%" height="30px"&gt;id&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;x1&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;x2&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;x3&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="30px"&gt;1&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;1&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;.&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;.&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="30px"&gt;2&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;0&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;1&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;.&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="30px"&gt;3&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;0&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;1&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;0&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="30px"&gt;4&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;.&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;.&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;1&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="30px"&gt;5&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;.&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;0&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;.&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="30px"&gt;6&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;.&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;.&lt;/TD&gt;
&lt;TD width="25%" height="30px"&gt;0&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;</description>
      <pubDate>Tue, 23 Apr 2024 19:29:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sql-join-multiple-datasets/m-p/925439#M364197</guid>
      <dc:creator>zihdonv19</dc:creator>
      <dc:date>2024-04-23T19:29:47Z</dc:date>
    </item>
    <item>
      <title>Re: proc sql join multiple datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sql-join-multiple-datasets/m-p/925446#M364202</link>
      <description>&lt;P&gt;Your data is already sorted by ID, so this will do the trick:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
	merge dat1 dat2 dat3;
	by id;
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 23 Apr 2024 19:58:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sql-join-multiple-datasets/m-p/925446#M364202</guid>
      <dc:creator>SASJedi</dc:creator>
      <dc:date>2024-04-23T19:58:19Z</dc:date>
    </item>
    <item>
      <title>Re: proc sql join multiple datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sql-join-multiple-datasets/m-p/925449#M364203</link>
      <description>&lt;P&gt;A very important question to ask is are the values of ID duplicated in any of the sets? If so, how do you want the result to appear?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your data is sorted (as implied by the example data sets) by ID and none of the ID are duplicated within a single set then a Data step merge is going to be way easier:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data want;
   merge dat1 dat2 dat3;
   by id;
run;&lt;/PRE&gt;
&lt;P&gt;The question about duplicates is very important because SQL joins are likely to result in multiple output rows.&lt;/P&gt;
&lt;P&gt;An example&lt;/P&gt;
&lt;PRE&gt;data dat1;
input id x1;
cards;
1 1
2 0
2 1
;
run;

data dat2;
input id x2;
cards;
2 1 
2 3 
5 0 
;
run;

proc sql;
   create table example as
   select a.id,a.x1,b.x2
   from dat1 as a
        left join 
        dat2 as b
        on a.id=b.id
   ;
quit;&lt;/PRE&gt;
&lt;P&gt;Note that the two values from Id=2 are combined with both Id=2 in the other set resulting in 4 id=2 observations (and yes, this example does get the Id=5 from set 2 as that is another headache in SQL)&lt;/P&gt;</description>
      <pubDate>Tue, 23 Apr 2024 20:07:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sql-join-multiple-datasets/m-p/925449#M364203</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2024-04-23T20:07:03Z</dc:date>
    </item>
    <item>
      <title>Re: proc sql join multiple datasets</title>
      <link>https://communities.sas.com/t5/SAS-Programming/proc-sql-join-multiple-datasets/m-p/925450#M364204</link>
      <description>&lt;P&gt;Why would you use SQL for that?&amp;nbsp; It is trivial with normal SAS code.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  merge dat1-dat3;
  by id;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Result:&lt;/P&gt;
&lt;PRE&gt;Obs    id    x1    x2    x3

 1      1     1     .     .
 2      2     0     1     .
 3      3     0     1     0
 4      4     .     .     1
 5      5     .     0     .
 6      6     .     .     0

&lt;/PRE&gt;
&lt;P&gt;If you did need to do it in PROC SQL code then the easiest is to use NATURAL joins.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table want as
select * 
from dat2 
natural full join dat3
natural full join dat1 
order by id
;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;(I referenced the datasets in that strange order so that X1 to X3 get added to the dataset in a nicer order so I could use * as the variable list.)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you don't use NATURAL joins you will have to use the COALESCE() function to get the resulting ID variable and spell out the join criteria.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table want as
select coalesce(a.id,b.id,c.id) as id,x1,x2,x3 
from dat1 a
full join dat2 b on a.id=b.id
full join dat3 c on a.id=c.id or b.id=c.id 
order by 1
;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 23 Apr 2024 20:10:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/proc-sql-join-multiple-datasets/m-p/925450#M364204</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2024-04-23T20:10:37Z</dc:date>
    </item>
  </channel>
</rss>

