<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Issue in sorting a large dataset in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Issue-in-sorting-a-large-dataset/m-p/148943#M39351</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am trying to sort a large dataset with 367538640 rows. The sort fails because it runs out of space. I tried OPTIONS COMPRESS=YES and the TAGSORT option, but TAGSORT takes a very long time. Please suggest any alternatives.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 07 May 2014 14:55:50 GMT</pubDate>
    <dc:creator>archana</dc:creator>
    <dc:date>2014-05-07T14:55:50Z</dc:date>
    <item>
      <title>Issue in sorting a large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Issue-in-sorting-a-large-dataset/m-p/148943#M39351</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am trying to sort a large dataset with 367538640 rows. The sort fails because it runs out of space. I tried OPTIONS COMPRESS=YES and the TAGSORT option, but TAGSORT takes a very long time. Please suggest any alternatives.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 07 May 2014 14:55:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Issue-in-sorting-a-large-dataset/m-p/148943#M39351</guid>
      <dc:creator>archana</dc:creator>
      <dc:date>2014-05-07T14:55:50Z</dc:date>
    </item>
    <item>
      <title>Re: Issue in sorting a large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Issue-in-sorting-a-large-dataset/m-p/148944#M39352</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;The problem is likely that SAS creates a copy of the dataset, which overwrites the original data when the sort procedure finishes. Therefore, if your system has more space on the drive where permanent data is stored, you can tell SAS to use that drive as the work directory. Remember to change it back afterwards.&lt;/P&gt;&lt;P&gt;If you use Windows, add&lt;/P&gt;&lt;P&gt;-work "d:\path_to_temporary_workfolder"&lt;/P&gt;&lt;P&gt;to the command line you use to start SAS.&lt;/P&gt;&lt;P&gt;See the documentation here: &lt;A href="http://support.sas.com/documentation/cdl/en/lesysoptsref/66899/HTML/default/viewer.htm#p1er6tm8fay8u2n1fhktmeoy2be4.htm"&gt;http://support.sas.com/documentation/cdl/en/lesysoptsref/66899/HTML/default/viewer.htm#p1er6tm8fay8u2n1fhktmeoy2be4.htm&lt;/A&gt;&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 07 May 2014 16:54:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Issue-in-sorting-a-large-dataset/m-p/148944#M39352</guid>
      <dc:creator>JacobSimonsen</dc:creator>
      <dc:date>2014-05-07T16:54:48Z</dc:date>
    </item>
    <item>
      <title>Re: Issue in sorting a large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Issue-in-sorting-a-large-dataset/m-p/148945#M39353</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;See also: &lt;A _jive_internal="true" class="active_link" href="https://communities.sas.com/message/209847#209847"&gt;https://communities.sas.com/message/209847#209847&lt;/A&gt;&lt;/P&gt;&lt;P&gt;TAGSORT and compressing the final dataset will not help you much.&lt;/P&gt;&lt;P&gt;Sorting requires approximately 3 times the size of the original dataset as intermediate work space.&lt;/P&gt;&lt;P&gt;Overwriting the original dataset adds the need for one more copy. You can redirect the intermediate work files to another location using the UTILLOC system option.&lt;/P&gt;&lt;P&gt;I assume you are using a server of some kind with a limited setup. 365M records is a big number; what is the size of the dataset? If the record size is 100 bytes, it should be roughly 36 GB.&lt;/P&gt;&lt;P&gt;Unless your logic absolutely requires the sort, there may be better solutions to your original problem.&lt;/P&gt;&lt;P&gt;If you really need this sort, you could split the big dataset into multiple smaller ones, sort each, and merge the sorted smaller ones in a dedicated step.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 07 May 2014 20:12:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Issue-in-sorting-a-large-dataset/m-p/148945#M39353</guid>
      <dc:creator>jakarman</dc:creator>
      <dc:date>2014-05-07T20:12:56Z</dc:date>
    </item>
    <item>
      <title>Re: Issue in sorting a large dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Issue-in-sorting-a-large-dataset/m-p/148946#M39354</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;UTILLOC in the configuration file lets you specify a location other than WORK for the temporary sort file. This reduces the space requirement for the file being sorted to 2x.&lt;/P&gt;&lt;P&gt;If you do&lt;/P&gt;&lt;P&gt;proc sort data=x1.xxx out=x2.xxx;&lt;/P&gt;&lt;P&gt;where x1 and x2 are libraries on different file systems, this may also help prevent an out-of-space condition, because you "only" need free space equal to the size of xxx once in the UTILLOC location and once in the x2 location.&lt;/P&gt;&lt;P&gt;Beyond that, I recommend what Jaap suggested: split the file, sort every partial file on its own, and then do:&lt;/P&gt;&lt;P&gt;data want;&lt;/P&gt;&lt;P&gt;set&lt;/P&gt;&lt;P&gt;&amp;nbsp; have1&lt;/P&gt;&lt;P&gt;&amp;nbsp; have2&lt;/P&gt;&lt;P&gt;&amp;nbsp; ...&lt;/P&gt;&lt;P&gt;&amp;nbsp; haven&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;by sortcrit;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;This is called interleaving; the sort order is preserved.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Thu, 08 May 2014 05:33:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Issue-in-sorting-a-large-dataset/m-p/148946#M39354</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2014-05-08T05:33:10Z</dc:date>
    </item>
  </channel>
</rss>

