<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Sort a huge dataset in Mainframe in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179613#M302828</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you, Peter.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This is a nice answer, but my dataset has close to 60 columns. Of course, I will check your option too and will let you know.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Thu, 20 Feb 2014 10:08:09 GMT</pubDate>
    <dc:creator>Subbarao</dc:creator>
    <dc:date>2014-02-20T10:08:09Z</dc:date>
    <item>
      <title>Sort a huge dataset in Mainframe</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179608#M302823</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a SAS dataset on the mainframe. Sorting it with PROC SORT takes 8 hours, as it has 40 million records. I have tried the TAGSORT and THREADS options, but they did not help.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Can anybody tell me an efficient way to sort the dataset?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 19 Feb 2014 17:49:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179608#M302823</guid>
      <dc:creator>Subbarao</dc:creator>
      <dc:date>2014-02-19T17:49:42Z</dc:date>
    </item>
    <item>
      <title>Re: Sort a huge dataset in Mainframe</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179609#M302824</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Are you trying to run this in interactive SAS or in a batch job? My guess is that you are contending for operating system resources / scheduling, especially if you are running interactively. This is a more appropriate activity for a batch job (and then you can look at things like region size, etc.).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also, look at the following SAS options:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;SORTPGM=&lt;/P&gt;&lt;P&gt;SORTSIZE=&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My guess is that you have a sort package on your mainframe, and thus SORTPGM=HOST is the appropriate option setting. (I'm dredging this up from memory.)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Carl&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 19 Feb 2014 18:57:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179609#M302824</guid>
      <dc:creator>carl_sommer</dc:creator>
      <dc:date>2014-02-19T18:57:31Z</dc:date>
    </item>
    <item>
      <title>Re: Sort a huge dataset in Mainframe</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179610#M302825</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Apart from Carl's suggestions, you could split the data in two and sort each half in a separate program, with both programs running in parallel. That should nearly halve the processing time.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Sort program 1:&lt;/P&gt;&lt;P&gt;%let half_total_obs = 20000000; /* assumed: half of the ~40 million observations */&lt;/P&gt;&lt;P&gt;proc sort data = large (obs = &amp;amp;half_total_obs)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; out = lib.half1&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; by by_vars;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Sort program 2:&lt;/P&gt;&lt;P&gt;%let half_total_obs = 20000000; /* same assumed value as in program 1 */&lt;/P&gt;&lt;P&gt;proc sort data = large (firstobs = %eval(&amp;amp;half_total_obs + 1))&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; out = lib.half2&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; by by_vars;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Program to combine the two sorted halves:&lt;/P&gt;&lt;P&gt;/* SET with BY interleaves the two sorted datasets */&lt;/P&gt;&lt;P&gt;data whole;&lt;/P&gt;&lt;P&gt;&amp;nbsp; set&amp;nbsp; lib.half1 lib.half2;&lt;/P&gt;&lt;P&gt;&amp;nbsp; by by_vars;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 19 Feb 2014 19:19:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179610#M302825</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2014-02-19T19:19:53Z</dc:date>
    </item>
    <item>
      <title>Re: Sort a huge dataset in Mainframe</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179611#M302826</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Sometimes data are partially sorted, but you still need that final sort.&lt;/P&gt;&lt;P&gt;If that is the case, look for subsets that are already ordered. Removing these subsets from the whole might reduce the demand for sort work space. (Thinking about that: what sort work area sizes have you defined?)&lt;/P&gt;&lt;P&gt;Check out the SAS companion for your mainframe.&lt;/P&gt;&lt;P&gt;Good luck&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;peterC&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 19 Feb 2014 19:22:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179611#M302826</guid>
      <dc:creator>Peter_C</dc:creator>
      <dc:date>2014-02-19T19:22:16Z</dc:date>
    </item>
    <item>
      <title>Re: Sort a huge dataset in Mainframe</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179612#M302827</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you, Carl.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I will try the above options today and will let you know the execution time.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 20 Feb 2014 10:06:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179612#M302827</guid>
      <dc:creator>Subbarao</dc:creator>
      <dc:date>2014-02-20T10:06:31Z</dc:date>
    </item>
    <item>
      <title>Re: Sort a huge dataset in Mainframe</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179613#M302828</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you, Peter.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This is a nice answer, but my dataset has close to 60 columns. Of course, I will check your option too and will let you know.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 20 Feb 2014 10:08:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179613#M302828</guid>
      <dc:creator>Subbarao</dc:creator>
      <dc:date>2014-02-20T10:08:09Z</dc:date>
    </item>
    <item>
      <title>Re: Sort a huge dataset in Mainframe</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179614#M302829</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Thank you, SASKiwi.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have tried this option on another dataset, but it did not help. I will try it on my dataset and let you know.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 20 Feb 2014 10:09:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179614#M302829</guid>
      <dc:creator>Subbarao</dc:creator>
      <dc:date>2014-02-20T10:09:33Z</dc:date>
    </item>
    <item>
      <title>Re: Sort a huge dataset in Mainframe</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179615#M302830</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;@&lt;SPAN class="j-post-author "&gt;&lt;STRONG&gt;&lt;A _jive_internal="true" class="jiveTT-hover-user jive-username-link active_link" data-avatarid="-1" data-externalid="" data-presence="null" data-userid="3472" data-username="carl.sommer%40sas.com" href="https://communities.sas.com/people/carl.sommer@sas.com" id="jive-34728940741815391711"&gt;carl.sommer@sas.com&lt;/A&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="j-post-author "&gt;&lt;STRONG&gt;&lt;BR /&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="j-post-author "&gt;&lt;STRONG&gt;I tried coding OPTIONS SORTPGM=HOST, but it did not help; the sort took longer to execute.&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="j-post-author "&gt;&lt;STRONG&gt;&lt;BR /&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="j-post-author "&gt;&lt;STRONG&gt;Before I changed it, SORTPGM was set to BEST. With BEST it took 2 hours to sort 10 million records; with HOST it took about 3 hours.&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="j-post-author "&gt;&lt;STRONG&gt;&lt;BR /&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="j-post-author "&gt;&lt;STRONG&gt;@SASKiwi:&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="j-post-author "&gt;&lt;STRONG&gt;&lt;BR /&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN class="j-post-author "&gt;&lt;STRONG&gt;I tried this approach, but it did not help. It takes the same time to sort 10 million records as when I use a plain PROC SORT.&lt;BR /&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 28 Feb 2014 04:18:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179615#M302830</guid>
      <dc:creator>Subbarao</dc:creator>
      <dc:date>2014-02-28T04:18:28Z</dc:date>
    </item>
    <item>
      <title>Re: Sort a huge dataset in Mainframe</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179616#M302831</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Determine how large your data set is physically (sum of the variable lengths * number of records).&lt;/P&gt;&lt;P&gt;Depending on that, you might consider exporting the data to a flat file and sorting it externally (Linux).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In my own experience, SAS performance on z/OS is surprisingly bad; after migrating to a 2-CPU pSeries (with AIX), we noticed that it ran circles around the mainframe.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also keep in mind that SAS generates a utility file while sorting, and then writes the sorted data back.&lt;/P&gt;&lt;P&gt;You should make sure that the source and target of the sort are not located where your WORK library is (or in the place the UTILLOC system option points to).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;From your description, I think that you are heavily I/O bound; that is why the trick of dividing the data set did not make a difference.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Fri, 28 Feb 2014 06:18:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Sort-a-huge-dataset-in-Mainframe/m-p/179616#M302831</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2014-02-28T06:18:29Z</dc:date>
    </item>
  </channel>
</rss>

