<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Left join doubling rows in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Left-join-doubling-rows/m-p/827116#M35326</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am trying to create a table using proc sql and I have a problem. Some but not all rows are getting doubled in the process. The code looks as follows:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;PROC SQL;&lt;BR /&gt;CREATE TABLE WORK.STEP AS&lt;BR /&gt;SELECT&lt;BR /&gt;t1.group,&lt;BR /&gt;t1.time,&lt;/P&gt;&lt;P&gt;t1.profit&lt;BR /&gt;FROM WORK.INPUT_FILE t1&lt;BR /&gt;LEFT JOIN WORK.GROUPS t2 on (t2.g0="*" OR (find(t1.group, strip(t2.g0) ) AND (Missing(t2.g1) OR find(t1.group, strip(t2.g1)))&lt;BR /&gt;AND (Missing(t2.g2) OR find(t1.group, strip(t2.g2)))));&lt;BR /&gt;QUIT;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Main thing here is that t1.group variable has a lot of different values and I wanna use the summary procedure later to aggregate some of the profit numbers. So, the groups I want to use are stored in three variables in table two (g0, g1, g2). But, for some groups it does this weird thing, that this sql procedure duplicates rows for specific groups. Not all of them, just a few, that have nothing particular in common. Does anyone know, why this might be happening?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;Bajtan&lt;/P&gt;</description>
    <pubDate>Thu, 04 Aug 2022 13:21:05 GMT</pubDate>
    <dc:creator>bajtan</dc:creator>
    <dc:date>2022-08-04T13:21:05Z</dc:date>
    <item>
      <title>Left join doubling rows</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Left-join-doubling-rows/m-p/827116#M35326</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am trying to create a table using proc sql and I have a problem. Some but not all rows are getting doubled in the process. The code looks as follows:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;PROC SQL;&lt;BR /&gt;CREATE TABLE WORK.STEP AS&lt;BR /&gt;SELECT&lt;BR /&gt;t1.group,&lt;BR /&gt;t1.time,&lt;/P&gt;&lt;P&gt;t1.profit&lt;BR /&gt;FROM WORK.INPUT_FILE t1&lt;BR /&gt;LEFT JOIN WORK.GROUPS t2 on (t2.g0="*" OR (find(t1.group, strip(t2.g0) ) AND (Missing(t2.g1) OR find(t1.group, strip(t2.g1)))&lt;BR /&gt;AND (Missing(t2.g2) OR find(t1.group, strip(t2.g2)))));&lt;BR /&gt;QUIT;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Main thing here is that t1.group variable has a lot of different values and I wanna use the summary procedure later to aggregate some of the profit numbers. So, the groups I want to use are stored in three variables in table two (g0, g1, g2). But, for some groups it does this weird thing, that this sql procedure duplicates rows for specific groups. Not all of them, just a few, that have nothing particular in common. Does anyone know, why this might be happening?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;Bajtan&lt;/P&gt;</description>
      <pubDate>Thu, 04 Aug 2022 13:21:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Left-join-doubling-rows/m-p/827116#M35326</guid>
      <dc:creator>bajtan</dc:creator>
      <dc:date>2022-08-04T13:21:05Z</dc:date>
    </item>
    <item>
      <title>Re: Left join doubling rows</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Left-join-doubling-rows/m-p/827120#M35327</link>
      <description>&lt;P&gt;An SQL join will create all combinations of matching records.&amp;nbsp; &amp;nbsp;So if one dataset contributes N observations and the other contributes M observations that match those N observations the result is N x M observations.&amp;nbsp; So assuming that there are no duplicates in the input "T1' dataset then more than one observation from "T2" must meet your ON condition for some of the observations from "T1".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For that query since you are only selecting variables from the T1 alias why not just add the DISTINCT keyword?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;CREATE TABLE WORK.STEP AS
SELECT  distinct
 t1.group
,t1.time
...&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 04 Aug 2022 13:44:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Left-join-doubling-rows/m-p/827120#M35327</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-08-04T13:44:43Z</dc:date>
    </item>
    <item>
      <title>Re: Left join doubling rows</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Left-join-doubling-rows/m-p/828825#M35462</link>
      <description>&lt;P&gt;I tried to use the&amp;nbsp;&lt;EM&gt;distinct&lt;/EM&gt; keyword and it looked like it helped. But once I got to a table that had more than one-part-group (t2.g1 wasn't missing), the problem was back. There is probably a problem with the condition behind &lt;EM&gt;on&lt;/EM&gt; keyword, but I can't figure it out. Any ideas?&lt;/P&gt;</description>
      <pubDate>Tue, 16 Aug 2022 12:17:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Left-join-doubling-rows/m-p/828825#M35462</guid>
      <dc:creator>bajtan</dc:creator>
      <dc:date>2022-08-16T12:17:35Z</dc:date>
    </item>
  </channel>
</rss>

