<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Importing large data set issue ? in SAS Data Management</title>
    <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324555#M9386</link>
    <description>&lt;P&gt;1000 GB / 30M rows = ~33000 bytes per row&lt;/P&gt;
&lt;P&gt;33000 / 100 = 330 average variable length, so you have mostly quite long character variables.&lt;/P&gt;
&lt;P&gt;It absolutely needs to be stored with the compress=yes option.&lt;/P&gt;
&lt;P&gt;Inspect the structure in the SQL database, and inspect the data to check whether the defined lengths are actually needed.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 13 Jan 2017 14:00:05 GMT</pubDate>
    <dc:creator>Kurt_Bremser</dc:creator>
    <dc:date>2017-01-13T14:00:05Z</dc:date>
    <item>
      <title>Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324546#M9383</link>
      <description>&lt;P&gt;I have a table in SQL with 30 million rows and 100 columns. The problem is that the required size to store the entire table on the SAS server is ~1 TB, which is quite large.&lt;BR /&gt;&lt;BR /&gt;Can anybody explain why it takes such a huge amount of space to store the data? This data is not compressed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The process I used to bring the data in was to write a program in EG and set the storage location on the server.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 13 Jan 2017 13:34:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324546#M9383</guid>
      <dc:creator>ankit___gupta</dc:creator>
      <dc:date>2017-01-13T13:34:34Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324550#M9385</link>
      <description>&lt;P&gt;30 million rows is quite small; to get to 999 GB, the width of each row would have to be 33,000 bytes. I think you should normalize, or look at splitting this into a set of tables. Note that unless you need all 33,000 bytes, separate tables would be much faster than processing 33,000-byte records.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Even the census full SF1 detail is not this wide.&lt;/P&gt;</description>
      <pubDate>Fri, 13 Jan 2017 13:48:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324550#M9385</guid>
      <dc:creator>rogerjdeangelis</dc:creator>
      <dc:date>2017-01-13T13:48:09Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324555#M9386</link>
      <description>&lt;P&gt;1000 GB / 30M rows = ~33000 bytes per row&lt;/P&gt;
&lt;P&gt;33000 / 100 = 330 average variable length, so you have mostly quite long character variables.&lt;/P&gt;
&lt;P&gt;It absolutely needs to be stored with the compress=yes option.&lt;/P&gt;
&lt;P&gt;Inspect the structure in the SQL database, and inspect the data to check whether the defined lengths are actually needed.&lt;/P&gt;
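&lt;P&gt;A minimal sketch of the option in use (WORK.BIGTABLE is a placeholder name):&lt;/P&gt;
&lt;PRE&gt;/* rewrite the table with compression turned on */
data work.bigtable_c (compress=yes);
    set work.bigtable;
run;&lt;/PRE&gt;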
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 13 Jan 2017 14:00:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324555#M9386</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2017-01-13T14:00:05Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324559#M9387</link>
      <description>&lt;P&gt;Hi Ankit,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Have a look at this macro &lt;A href="http://support.sas.com/kb/24/804.html" target="_self"&gt;http://support.sas.com/kb/24/804.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;It should help you with optimizing the storage requirements for your table.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Good luck,&lt;/P&gt;
&lt;P&gt;Ahmed&lt;/P&gt;</description>
      <pubDate>Fri, 13 Jan 2017 14:05:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324559#M9387</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2017-01-13T14:05:15Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324773#M9390</link>
      <description>&lt;P&gt;In addition to what others have already written: if you look directly at the database definition of a character variable and then look at the definition of the corresponding SAS variable, do you see any multiplication of lengths?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Example: Do you see something like SQL_Var with Varchar(100) and then SAS_Var with a length of $400?&lt;/P&gt;
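&lt;P&gt;One way to check the SAS side (MYLIB and MYTABLE are placeholder names):&lt;/P&gt;
&lt;PRE&gt;/* list the defined lengths of the character variables */
proc sql;
    select name, length
    from dictionary.columns
    where libname='MYLIB' and memname='MYTABLE' and type='char'
    order by length desc;
quit;&lt;/PRE&gt;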
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can investigate this with any single character variable, because if this happens, it's going to apply to all of them.&lt;/P&gt;</description>
      <pubDate>Sat, 14 Jan 2017 01:30:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324773#M9390</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2017-01-14T01:30:56Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324784#M9391</link>
      <description>&lt;P&gt;As well as trying&amp;nbsp;&lt;EM&gt;compress=yes&lt;/EM&gt;, try&amp;nbsp;&lt;EM&gt;compress=binary&lt;/EM&gt; - this can work well with very long rows (lots of columns).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Additionally, if you've got date or time fields (not datetime), they can&amp;nbsp;&lt;EM&gt;always&lt;/EM&gt; be stored in four bytes instead of the default of eight. If you're using the wizard to import the data, you may not have the control to change these lengths, and you might have to write a specific data step (easier to control than SQL, quite possibly) to modify the metadata.&lt;/P&gt;
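&lt;P&gt;A sketch of the idea (HAVE, WANT and the variable names are placeholders):&lt;/P&gt;
&lt;PRE&gt;/* store date variables in 4 bytes instead of the default 8 */
data want (compress=binary);
    length birth_date admit_date 4;  /* 4 bytes is safe for SAS date values */
    set have;
run;&lt;/PRE&gt;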
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your import can't change metadata but can create a view, then you can do what you like in the following step (dropping variables, turning compression on, modifying variable lengths and so on).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;(Sorry, I don't have EG, nor do I know what DBMS your source data is in, so I can't help further - yet!)&lt;/P&gt;</description>
      <pubDate>Sat, 14 Jan 2017 02:59:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324784#M9391</guid>
      <dc:creator>LaurieF</dc:creator>
      <dc:date>2017-01-14T02:59:38Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324812#M9392</link>
      <description>&lt;P&gt;All good answers to the problem. But it boggles my mind that the entire data on my SQL server only takes up 700 GB of hard disk space, and then this one table from the same group bloats up to 1 TB in SAS.&lt;BR /&gt;&lt;BR /&gt;I will definitely investigate the variables in my dataset.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But to boil the question down further: why does it need to be so humongous?&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;If it is for performance, I don't see it working, as operations on datasets containing 2 million rows are taking ~2-3 minutes, which would take far less time in SQL.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 14 Jan 2017 10:11:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324812#M9392</guid>
      <dc:creator>ankit___gupta</dc:creator>
      <dc:date>2017-01-14T10:11:02Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324822#M9393</link>
      <description>So, there's an architectural difference between SAS and SQL Server; I can agree that VARCHAR is a bit more flexible and has a lower cost than the SAS counterpart, COMPRESS. You have already been given a few good how-tos on managing data storage efficiently in SAS.&lt;BR /&gt;&lt;BR /&gt;But what is your actual problem? A storage difference of around 30% isn't much to be concerned about; IMO it's definitely not humongous. If you have an adequate performance comparison, feel free to share it. To this day I have never seen a SQL Server query outrun SAS on similar/same hardware. If performance is an issue, consider storing the data in a SAS SPDE library.</description>
      <pubDate>Sat, 14 Jan 2017 12:51:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324822#M9393</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2017-01-14T12:51:09Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324829#M9394</link>
      <description>&lt;P&gt;Varchar is used in databases to conserve unused space when strings don't occupy the whole defined length. Since SAS does not have variable-length strings, other mechanisms need to be used (compressing the dataset). Also keep in mind that date and time values that only need 4 bytes in a DBMS are stored with 8 bytes by default in SAS, so you should use a LENGTH statement to reduce them in size.&lt;/P&gt;
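&lt;P&gt;To see how much of a defined length is actually used, something like this (HAVE and LONG_VAR are placeholder names):&lt;/P&gt;
&lt;PRE&gt;/* longest actual value, to compare against the defined length */
proc sql;
    select max(lengthn(long_var)) as max_used
    from have;
quit;&lt;/PRE&gt;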
&lt;P&gt;Is the 1 TB that you get the compressed size?&lt;/P&gt;</description>
      <pubDate>Sat, 14 Jan 2017 14:52:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324829#M9394</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2017-01-14T14:52:44Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324874#M9395</link>
      <description>Thanks LinusH, I will look into the SAS SPDE library. If you have any other suggestions on how to judge SAS performance, please let me know.&lt;BR /&gt;&lt;BR /&gt;Thanks for your reply.&lt;BR /&gt;</description>
      <pubDate>Sun, 15 Jan 2017 08:55:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324874#M9395</guid>
      <dc:creator>ankit___gupta</dc:creator>
      <dc:date>2017-01-15T08:55:28Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324974#M9400</link>
      <description>Judge performance:&lt;BR /&gt;Choose a couple of typical/critical queries and run them with the same set of data and indexing in both environments, preferably with comparable hardware.</description>
      <pubDate>Mon, 16 Jan 2017 07:08:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324974#M9400</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2017-01-16T07:08:10Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324975#M9401</link>
      <description>And don't forget to measure loading, if that is a regular process.</description>
      <pubDate>Mon, 16 Jan 2017 07:09:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/324975#M9401</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2017-01-16T07:09:05Z</dc:date>
    </item>
    <item>
      <title>Re: Importing large data set issue ?</title>
      <link>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/325261#M9412</link>
      <description>It sounds as if you have some character variables that are set to a very long length. Even one or two of these at a length of 32,000 with no compression can produce a humongous data set.</description>
      <pubDate>Tue, 17 Jan 2017 12:56:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Data-Management/Importing-large-data-set-issue/m-p/325261#M9412</guid>
      <dc:creator>PaalNavestad</dc:creator>
      <dc:date>2017-01-17T12:56:35Z</dc:date>
    </item>
  </channel>
</rss>

