<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic sorting large dataset in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897277#M354552</link>
    <description>&lt;DIV&gt;Hello - I'm trying to run the below code but its taking hours.&amp;nbsp; What can I do to improve the efficiency of this program so it doesn't take so long to run?&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Thanks!&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;%let user = xxxxx;&lt;/DIV&gt;
&lt;DIV&gt;%let pswd = "%STR({SAS002}EE9F3A145162A68C09E176074A3382401380A5B759B196C62EF8B5D5)";&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;libname here '/workspace/cucr/risk/userid/xxxxx';&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;proc sql;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; options nomprint;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; connect to oracle as adm (USER=&amp;amp;user. PASSWORD=&amp;amp;pswd. PATH="prod");&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; options mprint;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; create table here.v_chd_mini_month_end (compress=yes) as select * from connection to adm&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; (select chd_open_date, tape_date, chd_credit_line, chd_account_number_ref, chd_cr_line_date&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; from adm.v_chd_mini_month_end&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; order by chd_account_number_ref, tape_date);&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; disconnect from adm;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;quit;&lt;/DIV&gt;</description>
    <pubDate>Wed, 04 Oct 2023 21:02:57 GMT</pubDate>
    <dc:creator>aamoen</dc:creator>
    <dc:date>2023-10-04T21:02:57Z</dc:date>
    <item>
      <title>sorting large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897277#M354552</link>
      <description>&lt;DIV&gt;Hello - I'm trying to run the below code but its taking hours.&amp;nbsp; What can I do to improve the efficiency of this program so it doesn't take so long to run?&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Thanks!&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;%let user = xxxxx;&lt;/DIV&gt;
&lt;DIV&gt;%let pswd = "%STR({SAS002}EE9F3A145162A68C09E176074A3382401380A5B759B196C62EF8B5D5)";&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;libname here '/workspace/cucr/risk/userid/xxxxx';&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;proc sql;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; options nomprint;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; connect to oracle as adm (USER=&amp;amp;user. PASSWORD=&amp;amp;pswd. PATH="prod");&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; options mprint;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; create table here.v_chd_mini_month_end (compress=yes) as select * from connection to adm&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; (select chd_open_date, tape_date, chd_credit_line, chd_account_number_ref, chd_cr_line_date&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; from adm.v_chd_mini_month_end&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; order by chd_account_number_ref, tape_date);&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt; disconnect from adm;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;quit;&lt;/DIV&gt;</description>
      <pubDate>Wed, 04 Oct 2023 21:02:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897277#M354552</guid>
      <dc:creator>aamoen</dc:creator>
      <dc:date>2023-10-04T21:02:57Z</dc:date>
    </item>
    <item>
      <title>Re: sorting large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897278#M354553</link>
      <description>&lt;P&gt;I think your main issue is not the sort within the database, but the connection between the SAS and Oracle servers.&lt;/P&gt;
&lt;P&gt;Hints for tuning SAS/ACCESS to Oracle:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/acreldb/p0fmgbpjwgbnvkn1sysefnr22fow.htm" target="_blank" rel="noopener"&gt;https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/acreldb/p0fmgbpjwgbnvkn1sysefnr22fow.htm&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 04 Oct 2023 21:09:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897278#M354553</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2023-10-04T21:09:25Z</dc:date>
    </item>
    <item>
      <title>Re: sorting large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897279#M354554</link>
      <description>&lt;P&gt;To see if the problem is a slow network between the Oracle and SAS servers try just getting a row count:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
options nomprint;
connect to oracle as adm (USER=&amp;amp;user. PASSWORD=&amp;amp;pswd. PATH="prod");
options mprint;
 
create table here.v_chd_mini_month_end (compress=yes) as select * from connection to adm
(count(*) as row_count
from adm.v_chd_mini_month_end
);
 
disconnect from adm;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If this is slow, then talk to your Oracle DBA as it is Oracle that is slow, not SAS.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Oct 2023 21:11:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897279#M354554</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2023-10-04T21:11:54Z</dc:date>
    </item>
    <item>
      <title>Re: sorting large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897280#M354555</link>
      <description>&lt;P&gt;When I tried that I recieved the following err:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;ERROR: ORACLE prepare error: ORA-24333: zero iteration count. SQL statement: count(*) as row_count from adm.v_chd_mini_month_end.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Oct 2023 21:23:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897280#M354555</guid>
      <dc:creator>aamoen</dc:creator>
      <dc:date>2023-10-04T21:23:03Z</dc:date>
    </item>
    <item>
      <title>Re: sorting large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897282#M354557</link>
      <description>&lt;P&gt;Sorry, I forgot the SELECT in front:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;select count(*) as row_count&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 04 Oct 2023 21:34:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897282#M354557</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2023-10-04T21:34:51Z</dc:date>
    </item>
    <item>
      <title>Re: sorting large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897283#M354558</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/331774"&gt;@aamoen&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Have a look at these two papers&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;&lt;A title="SAS-Oracle Options and Efficiency: What You Don’t Know Can Hurt You" href="https://support.sas.com/resources/papers/proceedings13/072-2013.pdf" target="_blank" rel="noopener"&gt;SAS-Oracle Options and Efficiency: What You Don’t Know Can Hurt You&lt;/A&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;A title="Tips for Pulling Data from Oracle® Using PROC SQL® Pass-Through" href="https://www.lexjansen.com/sesug/2018/SESUG2018_Paper-261_Final_PDF.pdf" target="_blank" rel="noopener"&gt;Tips for Pulling Data from Oracle® Using PROC SQL® Pass-Through&lt;/A&gt;&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;Here is what you need to pay attention to&lt;/P&gt;
&lt;P&gt;- &lt;STRONG&gt;ReadBuff/BuffSize&lt;/STRONG&gt;: How many Oracle Rows/Records are read for each fetch operation. Default:250 (Too-Little! --- Increase this if you can)&lt;/P&gt;
&lt;P&gt;- &lt;STRONG&gt;SASTrace option&lt;/STRONG&gt;: Tells you what's going on and where the query is running&lt;/P&gt;
&lt;P&gt;- &lt;STRONG&gt;Oracle Hints&lt;/STRONG&gt;: They could optimize Oracle execution, and speeds up your response times&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Hope this helps,&lt;/P&gt;
&lt;P&gt;Ahmed&lt;/P&gt;</description>
      <pubDate>Wed, 04 Oct 2023 21:35:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897283#M354558</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2023-10-04T21:35:53Z</dc:date>
    </item>
    <item>
      <title>Re: sorting large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897296#M354563</link>
      <description>&lt;P&gt;I concur with others that the bottleneck is normally the data transfer from the database to SAS. In my experience increasing the value for readbuff is what often decreases elapsed time dramatically. The default value is almost always way too low.&lt;/P&gt;</description>
      <pubDate>Thu, 05 Oct 2023 04:12:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897296#M354563</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2023-10-05T04:12:50Z</dc:date>
    </item>
    <item>
      <title>Re: sorting large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897323#M354575</link>
      <description>&lt;P&gt;Your headline reads "Sorting large dataset" - have you tried not having Oracle sort the data (drop the ORDER BY clause), and sort in SAS afterwards. If it is the actual Oracle server (and not the connection) which is slow, that may help on the performance. But I would probably try looking at the connection options first.&lt;/P&gt;</description>
      <pubDate>Thu, 05 Oct 2023 08:29:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897323#M354575</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2023-10-05T08:29:29Z</dc:date>
    </item>
    <item>
      <title>Re: sorting large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897353#M354592</link>
      <description>&lt;P&gt;My question is how you intend to use the result table?&lt;/P&gt;
&lt;P&gt;The general idea is to keep large tables in the source until you need them for your final step. In this case you might continue with filtering, joining and what not.&lt;/P&gt;
&lt;P&gt;Be aware of that SQL implict passthrough automatically sorts your data if the database libname engine encounters a BY statement (PROC, data step etc).&lt;/P&gt;</description>
      <pubDate>Thu, 05 Oct 2023 10:57:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/sorting-large-dataset/m-p/897353#M354592</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2023-10-05T10:57:41Z</dc:date>
    </item>
  </channel>
</rss>

