<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Need answers if you can in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Need-answers-if-you-can/m-p/142995#M38036</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;1)Large data set (20 mil records) and a second data set (10,000 records). Say we do an inner join with product ID. But this is too slow.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Any recommendation to improve performance?&lt;/P&gt;&lt;P&gt;2)I get the output data set from a current dataset A, product ID by the last occurrence, check if its excluded in a separate dataset B (say, discontinued products).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; How do you do it?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Thu, 14 Nov 2013 13:11:33 GMT</pubDate>
    <dc:creator>sahaji</dc:creator>
    <dc:date>2013-11-14T13:11:33Z</dc:date>
    <item>
      <title>Need answers if you can</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Need-answers-if-you-can/m-p/142995#M38036</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;1)Large data set (20 mil records) and a second data set (10,000 records). Say we do an inner join with product ID. But this is too slow.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Any recommendation to improve performance?&lt;/P&gt;&lt;P&gt;2)I get the output data set from a current dataset A, product ID by the last occurrence, check if its excluded in a separate dataset B (say, discontinued products).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; How do you do it?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 14 Nov 2013 13:11:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Need-answers-if-you-can/m-p/142995#M38036</guid>
      <dc:creator>sahaji</dc:creator>
      <dc:date>2013-11-14T13:11:33Z</dc:date>
    </item>
    <item>
      <title>Re: Need answers if you can</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Need-answers-if-you-can/m-p/142996#M38037</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hello Sahaji,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I will give it my best shot &lt;span class="lia-unicode-emoji" title=":winking_face:"&gt;😉&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;1)&lt;/P&gt;&lt;P&gt;In my opinion your questions depend on the infrastructure/hardware specs that you have.&lt;/P&gt;&lt;P&gt;There are several options for you to try:&lt;/P&gt;&lt;P&gt;- you could use the smaller data set as a hash file: &lt;A href="http://support.sas.com/rnd/base/datastep/dot/better-hashing-sas92.pdf" title="http://support.sas.com/rnd/base/datastep/dot/better-hashing-sas92.pdf"&gt;http://support.sas.com/rnd/base/datastep/dot/better-hashing-sas92.pdf&lt;/A&gt;&lt;/P&gt;&lt;P&gt;- you could create a format from the smaller data set and use it on the 20mio data set. Then when your format variable is not filled you would drop that row.&lt;/P&gt;&lt;P&gt;- if you use SPDS for your data sets you can safe a lot of time by using PROC SQL instead of DATA-Steps and make use of parallel processing (&lt;A href="http://support.sas.com/documentation/cdl/en/spdsug/64018/HTML/default/viewer.htm#p1f2t0e3f9mx45n1tj3nutnrt0hs.htm" title="http://support.sas.com/documentation/cdl/en/spdsug/64018/HTML/default/viewer.htm#p1f2t0e3f9mx45n1tj3nutnrt0hs.htm"&gt;SAS(R) Scalable Performance Data Server(R) 4.53: User's Guide&lt;/A&gt;). We used PROC SQL instead of DATA-Steps to save a lot of time in our programs.&lt;/P&gt;&lt;P&gt;- you could also try to create indexes on your join variables&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;2)&lt;/P&gt;&lt;P&gt;Perhaps you could specify some details. Why would you want to check if your product ID is not within a table B?&lt;/P&gt;&lt;P&gt;Do you want to make sure that it is not in table B? Then I would suggest a select where the product Id is not in select product ID from table A.&lt;/P&gt;&lt;P&gt;But with more details we might find a better fitting solution.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Cheers,&lt;/P&gt;&lt;P&gt;Michael&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 14 Nov 2013 14:41:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Need-answers-if-you-can/m-p/142996#M38037</guid>
      <dc:creator>mfab</dc:creator>
      <dc:date>2013-11-14T14:41:34Z</dc:date>
    </item>
    <item>
      <title>Re: Need answers if you can</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Need-answers-if-you-can/m-p/142997#M38038</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank You so much Michael. Thanks for spending time to answer my question. Greatly appreciated.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 14 Nov 2013 15:14:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Need-answers-if-you-can/m-p/142997#M38038</guid>
      <dc:creator>sahaji</dc:creator>
      <dc:date>2013-11-14T15:14:22Z</dc:date>
    </item>
    <item>
      <title>Re: Need answers if you can</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Need-answers-if-you-can/m-p/142998#M38039</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Is the large dataset indexed or partitioned?&amp;nbsp;&amp;nbsp;&amp;nbsp; If it is, then you may be able to limit the data very early in the process.&lt;BR /&gt;Add a where clause that uses a variable that is indexed or partitioned.&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; You can add the where clause right on the&lt;/P&gt;&lt;P&gt;join, as in the example below.&lt;/P&gt;&lt;P&gt;Good luck.&lt;/P&gt;&lt;P&gt;Sheri&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 10pt; font-family: Courier New;"&gt;&amp;nbsp; &lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;PROC SQL;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;CREATE TABLE&amp;nbsp; Lab_Results as&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;select LabGroup, Prov_ID,&amp;nbsp; A.Enc_ID, PATIENT_ID, A.Order_ID, Order_Date, Order_End_Date,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; L.Lab_Status_C,&amp;nbsp;&amp;nbsp; ORDER_Value, &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; L.Result_Date, L.Result_Time, Result_In_Range_YN, L.Result_Status_C &lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;FROM ProdFil.EntityLabs A&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;JOIN&amp;nbsp; ProdFil.Order_Results &lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;(where=(Result_Date &amp;gt;= '01JAN2013'd)&lt;/STRONG&gt;&lt;/SPAN&gt;&amp;nbsp; L&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;on A.Enc_ID = L.Enc_ID&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;where A.Order_Proc_ID = '132856'&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;quit;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 14 Nov 2013 15:16:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Need-answers-if-you-can/m-p/142998#M38039</guid>
      <dc:creator>Sheri</dc:creator>
      <dc:date>2013-11-14T15:16:59Z</dc:date>
    </item>
  </channel>
</rss>

