<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Data Step instead of Left Join? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452039#M114044</link>
    <description>&lt;P&gt;Hey guys,&amp;nbsp;&lt;/P&gt;&lt;P&gt;Im a student from Germany (so sorry for my bad english) and Im having a Code with, containing a SQL step with Inner Join, but the program&amp;nbsp;runs tooooo slowly. I guess there could be a faster version (e.g. with a data step) for the following code:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;create table work.Result as select distinct a.*, b.Value from&amp;nbsp;&lt;/P&gt;&lt;P&gt;work.Table1 As a left join work.Table2 As B on (a.id = b.id and a.date = b.date);&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The problem is that my program&amp;nbsp;contains a macro and the code runs very often, so every second counts &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I´d appreciate it so much if you could help me with that.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;Alex&lt;/P&gt;</description>
    <pubDate>Fri, 06 Apr 2018 18:22:16 GMT</pubDate>
    <dc:creator>mrzlatan91</dc:creator>
    <dc:date>2018-04-06T18:22:16Z</dc:date>
    <item>
      <title>Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452039#M114044</link>
      <description>&lt;P&gt;Hey guys,&amp;nbsp;&lt;/P&gt;&lt;P&gt;Im a student from Germany (so sorry for my bad english) and Im having a Code with, containing a SQL step with Inner Join, but the program&amp;nbsp;runs tooooo slowly. I guess there could be a faster version (e.g. with a data step) for the following code:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sql;&lt;/P&gt;&lt;P&gt;create table work.Result as select distinct a.*, b.Value from&amp;nbsp;&lt;/P&gt;&lt;P&gt;work.Table1 As a left join work.Table2 As B on (a.id = b.id and a.date = b.date);&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The problem is that my program&amp;nbsp;contains a macro and the code runs very often, so every second counts &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I´d appreciate it so much if you could help me with that.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance,&lt;/P&gt;&lt;P&gt;Alex&lt;/P&gt;</description>
      <pubDate>Fri, 06 Apr 2018 18:22:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452039#M114044</guid>
      <dc:creator>mrzlatan91</dc:creator>
      <dc:date>2018-04-06T18:22:16Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452052#M114045</link>
      <description>&lt;P&gt;1) The macro has no effect on run time.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;2) If the relation between the 2 tables is many to many,&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; then SQL is almost the only relevant to be used.&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; Alternative way is using hash method - keeping one of the tables in memeory.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp;if the relation is 1:N (or N:1)&amp;nbsp; it is better to sort both tables and MERGE them in a sas datastep.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;3) Long time may be the result of big tables with a lot of variables.&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; If you don't need all variables keep the relevant ones only.&lt;BR /&gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 06 Apr 2018 18:57:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452052#M114045</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2018-04-06T18:57:09Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452075#M114055</link>
      <description>&lt;P&gt;Many ways ... a hash table would be faster, a format would be faster, it's even possible that sort+merge would be faster.&amp;nbsp; Here's a key related question that influences whether these other approaches would be suitable:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Does Table2 contain just a single observation for each combination of ID and DATE?&lt;/P&gt;</description>
      <pubDate>Fri, 06 Apr 2018 19:29:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452075#M114055</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2018-04-06T19:29:44Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452110#M114066</link>
      <description>&lt;P&gt;Also decide if the distinct is really needed. It necessitates a sort by&amp;nbsp;&lt;EM&gt;all&lt;/EM&gt; variables contained in the output.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Apr 2018 21:10:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452110#M114066</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-04-06T21:10:05Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452122#M114073</link>
      <description>&lt;P&gt;Wow, that sounds good.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Table 1 contains a lot of observations, table 2 has always a part of it, lets say less than 1% (it depends on the iteration)&lt;/P&gt;&lt;P&gt;Unfortunately, I don´t know how to use the Hash Function to solve this problem.&lt;/P&gt;&lt;P&gt;Could anyone provide me a solution for my posted problem?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 06 Apr 2018 22:06:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452122#M114073</guid>
      <dc:creator>mrzlatan91</dc:creator>
      <dc:date>2018-04-06T22:06:19Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452135#M114078</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/186452"&gt;@mrzlatan91&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;Wow, that sounds good.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Table 1 contains a lot of observations, table 2 has always a part of it, lets say less than 1% (it depends on the iteration)&lt;/P&gt;
&lt;P&gt;Unfortunately, I don´t know how to use the Hash Function to solve this problem.&lt;/P&gt;
&lt;P&gt;Could anyone provide me a solution for my posted problem?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Provide some example data in the form of data steps if you want tested code.&lt;/P&gt;
&lt;P&gt;Instructions here: &lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-data-AKA-generate/ta-p/258712" target="_blank"&gt;https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-data-AKA-generate/ta-p/258712&lt;/A&gt; will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I realize that small data sets won't have the same time of execution issue but code suggestions have a much better chance of actually working if provided. Any sensitive data values should be masked to protect the guilty.&lt;/P&gt;</description>
      <pubDate>Fri, 06 Apr 2018 23:07:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452135#M114078</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2018-04-06T23:07:11Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452168#M114086</link>
      <description>&lt;P&gt;Sorry, I should have done this before. I´ve made a simplified version, because the size of my dataset is around 40GB.&lt;/P&gt;&lt;P&gt;Both Tables have the same structure, but Table 1 is much bigger than Table 2.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data table1;&lt;BR /&gt;input date $ value $ id $ fundid $;&lt;BR /&gt;datalines;&lt;BR /&gt;01OCT18 50 1234 10000&lt;BR /&gt;01OCT18 52 1234 10001&lt;BR /&gt;01OCT19 50 1234 10000&lt;BR /&gt;01OCT19 53 1234 10001&lt;BR /&gt;01OCT19 54 1234 10002&lt;BR /&gt;01NOV19 60 1236 10000&lt;BR /&gt;01NOV19 61 1236 10001&lt;BR /&gt;01NOV19 62 1236 10002&lt;BR /&gt;01OCT19 54 1234 10007&lt;BR /&gt;01NOV19 55 1236 10008&lt;BR /&gt;01NOV19 35 1236 10009&lt;BR /&gt;01NOV19 66 1236 10010&lt;BR /&gt;01DEC19 40 1236 10008&lt;BR /&gt;01DEC19 42 1236 10009&lt;BR /&gt;01DEC19 53 1236 10010&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;data table2;&lt;BR /&gt;input date $ value $ id $ fundid $;&lt;BR /&gt;datalines;&lt;BR /&gt;01OCT18 50 1234 10000&lt;BR /&gt;01OCT18 52 1234 10000&lt;BR /&gt;01OCT19 50 1234 10000&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;proc sql;&lt;BR /&gt;create table work.result as select distinct a.*, b.value as newValue from&lt;BR /&gt;work.table1 as a left join work.table2 as b on (a.id=b.id and a.date = b.date);&lt;BR /&gt;quit;&lt;/P&gt;</description>
      <pubDate>Sat, 07 Apr 2018 07:38:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452168#M114086</guid>
      <dc:creator>mrzlatan91</dc:creator>
      <dc:date>2018-04-07T07:38:43Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452170#M114087</link>
      <description>&lt;P&gt;From your example data, I can see no double entries for combinations of id and date. If this is true for the whole dataset, you can use a sort + data step merge. And the distinct in the SQL would not be needed.&lt;/P&gt;</description>
      <pubDate>Sat, 07 Apr 2018 08:58:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452170#M114087</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-04-07T08:58:01Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452171#M114088</link>
      <description>&lt;P&gt;The next question would be the size of table2. If sizeof(id + date + value) * nobs can fit into memory, creating a format or loading into a hash table would make sorting unnecessary.&lt;/P&gt;</description>
      <pubDate>Sat, 07 Apr 2018 09:07:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452171#M114088</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-04-07T09:07:35Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452184#M114092</link>
      <description>&lt;P&gt;It depends on the Iteration, because I analyze every fund in my dataset.&lt;/P&gt;&lt;P&gt;Some funds have 100,000 Observations, some funds only 100.&lt;/P&gt;&lt;P&gt;Could you provide me a Hash-Solution?&lt;/P&gt;</description>
      <pubDate>Sat, 07 Apr 2018 11:59:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452184#M114092</guid>
      <dc:creator>mrzlatan91</dc:creator>
      <dc:date>2018-04-07T11:59:34Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452185#M114093</link>
      <description>&lt;P&gt;Well I guess it´s not necessary.&lt;/P&gt;&lt;P&gt;I coded this part 2 months ago and I don´t know why I did this to be honest.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 07 Apr 2018 12:00:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452185#M114093</guid>
      <dc:creator>mrzlatan91</dc:creator>
      <dc:date>2018-04-07T12:00:51Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452187#M114094</link>
      <description>&lt;P&gt;Oh **bleep**, I think this was the reason why I´ve made the distinct clause.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;What would be the solution if the distinct part is needed? Hash?&lt;/P&gt;</description>
      <pubDate>Sat, 07 Apr 2018 12:06:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452187#M114094</guid>
      <dc:creator>mrzlatan91</dc:creator>
      <dc:date>2018-04-07T12:06:17Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452194#M114095</link>
      <description>&lt;P&gt;Take a look at the data you posted.&amp;nbsp; Both data sets contain 2 observations for this combination:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;ID=1234&amp;nbsp;&lt;/P&gt;
&lt;P&gt;date=01OCT2018&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Looking at the data, how do you know which observations to match?&amp;nbsp; How do you know which value to use from Table 2?&amp;nbsp; Do you need to match on not just ID and DATE, but also on FUNDID?&lt;/P&gt;</description>
      <pubDate>Sat, 07 Apr 2018 13:12:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452194#M114095</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2018-04-07T13:12:08Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452198#M114096</link>
      <description>&lt;P&gt;Oh sorry,&lt;/P&gt;&lt;P&gt;this was a mistake. Every Date/Value Combination of Table 2 is Unique.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I think I should explain it a little bit. Table 1 contains Valuations (Value) for Stocks/Bonds (IDs), that are made by some Funds (Fundid) in a specific month (date).&lt;/P&gt;&lt;P&gt;Table 2 contains Valuations for Stocks of ONE specific Fund, so every Date/Value Combination is unique.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 07 Apr 2018 13:31:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452198#M114096</guid>
      <dc:creator>mrzlatan91</dc:creator>
      <dc:date>2018-04-07T13:31:09Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452206#M114098</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/186452"&gt;@mrzlatan91&lt;/a&gt;, I'm quoting few of your statemwnts:&lt;/P&gt;
&lt;PRE&gt;Table 1 contains a lot of observations, table 2 has always a part of it, lets say less than 1% &lt;/PRE&gt;
&lt;PRE&gt; Every Date/Value Combination of Table 2 is Unique.&lt;/PRE&gt;
&lt;P&gt;Try next solution:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=table1 out=temp1 nodupkey;
    by date id;
run;
/* if table1 is already sorted then replace above sort with:
       data temp1;
         set table1;
              by date value;
                   if last.value;  /* output distinct values */
      run;
******************************/
proc sort data=table2; by date id; run; /* verify that table is sorted */

data want;
merge table1(in=in1)
           table2(keep=value rename=(value=newvalue) in=in2);
  by date id;
       if in1;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 07 Apr 2018 14:46:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452206#M114098</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2018-04-07T14:46:25Z</dc:date>
    </item>
    <item>
      <title>Re: Data Step instead of Left Join?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452219#M114103</link>
      <description>&lt;P&gt;Thank you very much!&lt;/P&gt;</description>
      <pubDate>Sat, 07 Apr 2018 18:45:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Data-Step-instead-of-Left-Join/m-p/452219#M114103</guid>
      <dc:creator>mrzlatan91</dc:creator>
      <dc:date>2018-04-07T18:45:10Z</dc:date>
    </item>
  </channel>
</rss>

