<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Very large data files in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676664#M204052</link>
    <description>&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN title=""&gt;Write whatever comes to mind on this topic, or point me to resources you consider valuable that I may have missed.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN title=""&gt;-I have used the hash table&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN class="" title=""&gt;-I looked into SASFILE, but in my opinion it has big limitations&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 14 Aug 2020 09:17:10 GMT</pubDate>
    <dc:creator>makset</dc:creator>
    <dc:date>2020-08-14T09:17:10Z</dc:date>
    <item>
      <title>Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676545#M204016</link>
      <description>&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN title=""&gt;I have several data files, about 1 TB in total (16 files).&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN title=""&gt;I want to access the data in them as quickly as possible.&lt;/SPAN&gt; &lt;SPAN title=""&gt;I am asking for advice and thoughts.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN class="" title=""&gt;My workstation:&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class="" title=""&gt;Windows 8.1 Pro&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class="" title=""&gt;SAS 9.2&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class="" title=""&gt;2x Xeon E5-2630 v3&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class="" title=""&gt;RAM 64GB&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class="" title=""&gt;nvme samsung mzvpv256&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class="" title=""&gt;nvme wds100t3xoc-00sj 1TB&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class="" title=""&gt;Best regards and thank you in advance for your help&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 13 Aug 2020 16:46:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676545#M204016</guid>
      <dc:creator>makset</dc:creator>
      <dc:date>2020-08-13T16:46:38Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676556#M204018</link>
      <description>SAS processes data row by row so some things work quite well on a desktop regardless of size, but take time. &lt;BR /&gt;&lt;BR /&gt;Are these SAS files or text files? And most importantly - what's your question? What do you need help with?</description>
      <pubDate>Thu, 13 Aug 2020 17:22:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676556#M204018</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-08-13T17:22:31Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676565#M204021</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Are these SAS files or text files?&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;SAS files.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;what's your question? What do you need help with?&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN title=""&gt;The question is not about a specific problem.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN title=""&gt;I have, for example, 16 cores and a fairly fast NVMe drive, so maybe I should divide each file into 16 equal parts and process them in parallel?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN title=""&gt;Or maybe there is another way?&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN title=""&gt;Maybe a hash table?&lt;/SPAN&gt; &lt;SPAN title=""&gt;I'm learning about it now.&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN title=""&gt;Best regards&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 13 Aug 2020 17:45:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676565#M204021</guid>
      <dc:creator>makset</dc:creator>
      <dc:date>2020-08-13T17:45:59Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676577#M204023</link>
      <description>&lt;P&gt;It all depends on what you are trying to do.&amp;nbsp; In general it might be best to summarize the data into much smaller size and then do your analysis from the summary.&amp;nbsp; But whether that is possible or how to do it depends on what analysis you are doing.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Aug 2020 18:41:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676577#M204023</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-08-13T18:41:29Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676582#M204025</link>
      <description>A vague, general question gets a vague general answer. &lt;BR /&gt;</description>
      <pubDate>Thu, 13 Aug 2020 19:14:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676582#M204025</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-08-13T19:14:31Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676609#M204033</link>
      <description>&lt;P&gt;I doubt there is hash functionality in SAS 9.2.&lt;/P&gt;
&lt;P&gt;I recommend you get your SAS software up-to-date. I would expect SAS 9.4 to be more efficient and offer more capabilities than 9.2.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Aug 2020 22:34:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676609#M204033</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2020-08-13T22:34:48Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676636#M204040</link>
      <description>&lt;P&gt;I would start by upgrading to the latest SAS version. Both NVMe drives you mention are way too small to hold all the datasets you want to process, so where is the data stored?&lt;/P&gt;</description>
      <pubDate>Fri, 14 Aug 2020 05:13:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676636#M204040</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2020-08-14T05:13:33Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676662#M204050</link>
      <description>&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN title=""&gt;Migrating to SAS 9.4 (TS1M6) is not possible yet; I don't have time, maybe in October.&lt;/SPAN&gt; &lt;SPAN class="" title=""&gt;Anyway, I was thinking of rewriting everything in C++, but so far I don't have much experience with C++.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN class="" title=""&gt;The system is on the NVMe Samsung mzvpv256.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN class="" title=""&gt;All the data is stored on the NVMe wds100t3xoc-00sj 1TB, and I have one more.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 14 Aug 2020 09:11:24 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676662#M204050</guid>
      <dc:creator>makset</dc:creator>
      <dc:date>2020-08-14T09:11:24Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676663#M204051</link>
      <description>&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class="" title=""&gt;What in SAS 9.4 (TS1M6) is so good (better than in 9.2)?&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 14 Aug 2020 09:13:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676663#M204051</guid>
      <dc:creator>makset</dc:creator>
      <dc:date>2020-08-14T09:13:02Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676664#M204052</link>
      <description>&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN title=""&gt;Write whatever comes to mind on this topic, or point me to resources you consider valuable that I may have missed.&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN title=""&gt;-I have used the hash table&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN class="" title=""&gt;-I looked into SASFILE, but in my opinion it has big limitations&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 14 Aug 2020 09:17:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676664#M204052</guid>
      <dc:creator>makset</dc:creator>
      <dc:date>2020-08-14T09:17:10Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676909#M204123</link>
      <description>&lt;P&gt;I suggest you refer to the SAS documentation for a complete list of improvements (documentation.sas.com). There are hundreds if not thousands of them.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I'd expect SAS 9.4 to be significantly faster than 9.2 as well, although that may depend on the type of processing you are doing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;BTW I just checked and you are in luck.&amp;nbsp;&lt;SPAN&gt;DATA Step HASH was implemented in SAS 9.1 so you will have it in 9.2:&amp;nbsp;&lt;A href="https://support.sas.com/kb/11/391.html" target="_blank"&gt;https://support.sas.com/kb/11/391.html&lt;/A&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 15 Aug 2020 00:37:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676909#M204123</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2020-08-15T00:37:31Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676910#M204124</link>
      <description>&lt;P&gt;IMO, it sounds like you don't have enough storage regardless of what language you use to process your data files. If your 16 files completely fill your disk drive, you have no room to do anything else.&lt;/P&gt;</description>
      <pubDate>Sat, 15 Aug 2020 00:28:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676910#M204124</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2020-08-15T00:28:29Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676932#M204138</link>
      <description>&lt;P&gt;If I understand your specs correctly, you have 1.25TB of available storage in which you want to store and use some number of SAS datasets that you say add up to 1TB.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Depending on what you want to do with the datasets, you might run out of disk space.&amp;nbsp; Say one of the datasets is .2TB, and you want to sort it.&amp;nbsp; That process, at some point, would need about .2TB for intermediate sort utility files, and .2TB for the new sorted dataset prior to deleting the intermediates.&amp;nbsp;&amp;nbsp; I.e. you could easily have a point at which you require 1.4TB.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
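&lt;P&gt;As a rough illustration of the sort example above, here is a minimal Python sketch of the arithmetic. Python is used only for convenience; the function name is hypothetical, and sizing the utility files and the output copy at one dataset each is an assumption taken from the figures in this post, not something SAS reports.&lt;/P&gt;
&lt;PRE&gt;
```python
# Rough peak-disk estimate for sorting one dataset, following the figures
# in the post: the source data stays on disk while the sort needs utility
# files and a new output copy, each assumed to be about the dataset's size.

def peak_disk_tb(total_data_tb, sorted_dataset_tb):
    """Return the approximate peak disk usage in TB during the sort."""
    utility_files_tb = sorted_dataset_tb   # intermediate sort utility files
    output_copy_tb = sorted_dataset_tb     # new sorted dataset
    return total_data_tb + utility_files_tb + output_copy_tb

available_tb = 1.25                        # total storage as read from the specs
peak_tb = peak_disk_tb(1.0, 0.2)           # 1TB of data, one 0.2TB dataset sorted
shortfall_tb = max(0.0, peak_tb - available_tb)
print(f"peak {peak_tb:.2f} TB, shortfall {shortfall_tb:.2f} TB")
```
&lt;/PRE&gt;
&lt;P&gt;Changing the two arguments shows how quickly the headroom disappears as the dataset being sorted gets larger.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;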
&lt;P&gt;Aside from sorting issues, many SAS programs generate lots of temporary SAS datasets (in the WORK library, if nowhere else) over the course of a multi-step program.&amp;nbsp; I can easily see you running into disk space limitations unless you do more than casual housekeeping.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 15 Aug 2020 05:27:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676932#M204138</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-08-15T05:27:33Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676933#M204139</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;hash functionality has been there longer than you might think.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;According to &lt;A href="https://www.lexjansen.com/nesug/nesug09/hw/HW04.pdf" target="_self"&gt;The SAS Hash Object in Action:&amp;nbsp;&lt;/A&gt;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;In SAS® Version 9.1, the hash table - the very first object introduced via the DATA Step Component Interface in Version 9.0 - has finally become robust and syntactically stable&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Sat, 15 Aug 2020 05:32:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676933#M204139</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-08-15T05:32:01Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676941#M204144</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt; - Yes, see my follow up post.&lt;/P&gt;</description>
      <pubDate>Sat, 15 Aug 2020 08:08:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676941#M204144</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2020-08-15T08:08:29Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676952#M204147</link>
      <description>&lt;P&gt;&lt;SPAN class="tlid-translation translation"&gt;&lt;SPAN class="" title=""&gt;I'm a bit sorry that you're focusing on the least important part (though I understand it matters).&lt;/SPAN&gt; &lt;SPAN title=""&gt;I don't have 1TB of data yet, but I will soon.&lt;/SPAN&gt; &lt;SPAN class="" title=""&gt;I will just buy another NVMe drive.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 15 Aug 2020 11:42:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/676952#M204147</guid>
      <dc:creator>makset</dc:creator>
      <dc:date>2020-08-15T11:42:29Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/677010#M204166</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/17744"&gt;@makset&lt;/a&gt;&amp;nbsp; - If we are giving you the wrong answers it is because you haven't fully explained what your problem is. You've listed your PC hardware and explained the size of data you have but you haven't said anything about what you want to do with it. Until you explain what data processing you want to do then we can't provide better advice. BTW hashing isn't a magic wand to improve all types of processing performance.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Processing 1TB of data on a PC with SAS is entirely possible and many PC SAS users have bigger data volumes. As already explained you will probably need at least 2 or 3TB of free space for further processing as well as 1TB just for your source files. How much extra depends on what type of processing you want to do. &amp;nbsp; &lt;/P&gt;</description>
      <pubDate>Sun, 16 Aug 2020 00:10:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/677010#M204166</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2020-08-16T00:10:07Z</dc:date>
    </item>
    <item>
      <title>Re: Very large data files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/677021#M204174</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/17744"&gt;@makset&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your question is way too vague and generic to propose anything other than reading material. If it's about dealing with large data sets, you might want to Google for SAS topics dealing with performance; there's a lot about this out there.&lt;/P&gt;
&lt;P&gt;Also spending time reading&amp;nbsp;&lt;A href="https://go.documentation.sas.com/?docsetId=lrcon&amp;amp;docsetTarget=titlepage.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" target="_self"&gt;SAS® 9.4 Language Reference: Concepts, Sixth Edition&lt;/A&gt;&amp;nbsp;is likely valuable for you as it explains how SAS actually works and how it processes data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;About SAS hashes: they get loaded into memory with fully expanded column lengths, so they are likely not suitable for your huge tables unless you actually have the required memory available.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Aug 2020 21:16:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Very-large-data-files/m-p/677021#M204174</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2020-08-20T21:16:03Z</dc:date>
    </item>
  </channel>
</rss>

