<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Very Large Dataset - Out of Resources in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736410#M28718</link>
    <description>&lt;P&gt;I have a dataset that is 948,036 obs by 221 var. I need each obs repeated once for every day from 1/1/2008 to 12/31/2017. The output dataset would be roughly 3.46 billion obs by 221 var.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (compress=yes);
    set have;
    do day='01JAN2008'd to '31DEC2017'd;
        output;
    end;
    format day mmddyy10.;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;When I run this code I get the error below.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="sas_out_of_resources.PNG" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/58611iB90BB2A170A4705E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="sas_out_of_resources.PNG" alt="sas_out_of_resources.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have 16 GB of RAM and about 1 TB of storage. Is the limitation my device or SAS, and are there any ways to work with datasets this large in SAS?&lt;/P&gt;</description>
    <pubDate>Thu, 22 Apr 2021 14:09:05 GMT</pubDate>
    <dc:creator>JJ_83</dc:creator>
    <dc:date>2021-04-22T14:09:05Z</dc:date>
    <item>
      <title>Very Large Dataset - Out of Resources</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736410#M28718</link>
      <description>&lt;P&gt;I have a dataset that is 948,036 obs by 221 var. I need each obs repeated once for every day from 1/1/2008 to 12/31/2017. The output dataset would be roughly 3.46 billion obs by 221 var.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (compress=yes);
    set have;
    do day='01JAN2008'd to '31DEC2017'd;
        output;
    end;
    format day mmddyy10.;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;When I run this code I get the error below.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="sas_out_of_resources.PNG" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/58611iB90BB2A170A4705E/image-size/medium?v=v2&amp;amp;px=400" role="button" title="sas_out_of_resources.PNG" alt="sas_out_of_resources.PNG" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have 16 GB of RAM and about 1 TB of storage. Is the limitation my device or SAS, and are there any ways to work with datasets this large in SAS?&lt;/P&gt;</description>
      <pubDate>Thu, 22 Apr 2021 14:09:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736410#M28718</guid>
      <dc:creator>JJ_83</dc:creator>
      <dc:date>2021-04-22T14:09:05Z</dc:date>
    </item>
    <item>
      <title>Re: Very Large Dataset - Out of Resources</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736414#M28719</link>
      <description>&lt;P&gt;What SAS environment are you using?&lt;/P&gt;
&lt;P&gt;If you are connecting to a SAS server, your SAS admin has likely set a limit on how much space you are allowed to use.&lt;/P&gt;
&lt;P&gt;If you are using SAS OnDemand, I believe there are limits on work space.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Why are you duplicating every value in your existing data roughly 3,650 times, with only the date changing? This sounds like a problem that has not been fully thought through.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Apr 2021 14:15:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736414#M28719</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-04-22T14:15:27Z</dc:date>
    </item>
    <item>
      <title>Re: Very Large Dataset - Out of Resources</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736417#M28721</link>
      <description>&lt;P&gt;Take the sizes of the variables (a numeric variable needs 8 bytes by default), add them up, and multiply by the expected number of observations, so you have a rough idea of the disk space needed.&lt;/P&gt;
&lt;P&gt;A quick calculation, assuming all 221 variables are numeric, is this:&lt;/P&gt;
&lt;P&gt;221 * 8 * 948036 * 3650&lt;/P&gt;
&lt;P&gt;and results in a size of roughly 6.1 TB (about 5.6 TiB). Add a little overhead, and you'll need well over 6 TB to store that. But to WORK with it, you'll need at least three times the size, and a really hefty computer to process this much in tolerable time.&lt;/P&gt;
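&lt;P&gt;The arithmetic above can be sketched as a short DATA step. This is only a rough illustration: the variable names (n_obs, n_vars, n_days, bytes, tb) are made up for the example, it assumes every variable is an 8-byte numeric, and it ignores compression and page overhead. It uses the inclusive day count of the date range rather than a rounded 3650.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
    n_obs  = 948036;                              /* observations in the input data set */
    n_vars = 221;                                 /* variables per observation */
    n_days = '31DEC2017'd - '01JAN2008'd + 1;     /* inclusive number of days in the range */
    bytes  = n_obs * n_vars * 8 * n_days;         /* assumes 8 bytes per variable */
    tb     = bytes / 1e12;                        /* decimal terabytes */
    put n_days= comma12. bytes= comma24. tb= 8.2; /* write the estimate to the log */
run;
&lt;/CODE&gt;&lt;/PRE&gt;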
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What good is this mass of redundancy, anyway?&lt;/P&gt;</description>
      <pubDate>Thu, 22 Apr 2021 14:23:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736417#M28721</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2021-04-22T14:23:40Z</dc:date>
    </item>
    <item>
      <title>Re: Very Large Dataset - Out of Resources</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736422#M28722</link>
      <description>&lt;P&gt;1) "&lt;SPAN&gt;I need each obs to have a day value from 1/1/2008 to 12/31/2017&lt;/SPAN&gt;" - Why? Is it necessary?&lt;/P&gt;
&lt;P&gt;2) Even if each variable were no bigger than 2 bytes, 1 TB still might not be enough...&lt;/P&gt;</description>
      <pubDate>Thu, 22 Apr 2021 14:35:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736422#M28722</guid>
      <dc:creator>yabwon</dc:creator>
      <dc:date>2021-04-22T14:35:45Z</dc:date>
    </item>
    <item>
      <title>Re: Very Large Dataset - Out of Resources</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736436#M28723</link>
      <description>This may be a case where you need to loop and deal with each observation individually. More details are needed. The usual recommendation is to avoid looping, but if you are under-resourced for your problem, there is not much choice.</description>
      <pubDate>Thu, 22 Apr 2021 15:22:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736436#M28723</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2021-04-22T15:22:00Z</dc:date>
    </item>
    <item>
      <title>Re: Very Large Dataset - Out of Resources</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736441#M28724</link>
      <description>&lt;P&gt;Here's an idea:&lt;/P&gt;
&lt;P&gt;Create a very small data set, or use one of the SAS-supplied data sets like SASHELP.CLASS, CARS, or STOCKS, and add maybe 30 dates to the data.&lt;/P&gt;
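&lt;P&gt;A minimal sketch of such a test data set, assuming SASHELP.CLASS and an arbitrary 30-day window starting 1/1/2008 (the data set name small and the window are illustrative choices, not taken from the original post):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data small;
    set sashelp.class;                        /* small SAS-supplied sample data */
    do day='01JAN2008'd to '30JAN2008'd;      /* 30 consecutive dates */
        output;                               /* one copy of the row per date */
    end;
    format day mmddyy10.;
run;
&lt;/CODE&gt;&lt;/PRE&gt;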
&lt;P&gt;Then show us what you intended to do with your original data, using this much-reduced data set. We might have some other suggestions.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Apr 2021 15:35:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736441#M28724</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-04-22T15:35:19Z</dc:date>
    </item>
    <item>
      <title>Re: Very Large Dataset - Out of Resources</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736468#M28728</link>
      <description>&lt;P&gt;I agree with&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/35763"&gt;@yabwon&lt;/a&gt;.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your problem may very well be more about how you want to organize data to accomplish your task, than about disk space.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your stated goal of 3,653 days for each incoming obs, which multiplies the size of your data set by 3,653, has no obvious justification to me. After all, you are repeating the other 220 variables as constants, and that's a lot of needless duplication.&lt;/P&gt;
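&lt;P&gt;One possible way to avoid that duplication, offered only as a sketch since the intended use is not yet known: keep the dates in their own small data set and combine it with have only for the dates an analysis actually needs. The names calendar and want_week and the one-week window are made up for illustration, and the join assumes have does not already contain a variable named day.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* One row per day instead of one row per obs per day */
data calendar;
    do day='01JAN2008'd to '31DEC2017'd;
        output;
    end;
    format day mmddyy10.;
run;

/* Pair every obs with only the dates needed right now (here, one week) */
proc sql;
    create table want_week as
    select h.*, c.day
    from have as h, calendar as c
    where c.day between '01JAN2008'd and '07JAN2008'd;
quit;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The size of the result still grows with the number of dates selected, so the window should stay as narrow as the analysis allows.&lt;/P&gt;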
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What is the intended use of this gigantic data set?&lt;/P&gt;</description>
      <pubDate>Thu, 22 Apr 2021 17:11:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Very-Large-Dataset-Out-of-Resources/m-p/736468#M28728</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2021-04-22T17:11:57Z</dc:date>
    </item>
  </channel>
</rss>

