<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using NLEVELS with very large datasets: insufficient memory in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Using-NLEVELS-with-very-large-datasets-insufficient-memory/m-p/518289#M140259</link>
    <description>&lt;P&gt;1. You could increase the memory allocated to SAS.&lt;/P&gt;
&lt;P&gt;2. You could reduce the number of variables analysed in one go, for example:&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;tables _NUMERIC_&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;then&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&lt;SPAN&gt;tables _CHAR_&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;instead of&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&lt;SPAN&gt;tables _ALL_&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;3. You could exclude obviously high-cardinality variables. For example, BANK_ACCT_BALANCE will have a very high NLEVELS, and calculating it adds no value.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;4. PROC SQL lets you calculate the &lt;SPAN&gt;number of levels, missing levels, non-missing levels, number of observations, and the ratio of levels to observations in one step. You need to code the variable names explicitly, though, and the memory issue remains. But at least you need only one pass per variable.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 04 Dec 2018 20:54:46 GMT</pubDate>
    <dc:creator>ChrisNZ</dc:creator>
    <dc:date>2018-12-04T20:54:46Z</dc:date>
    <item>
      <title>Using NLEVELS with very large datasets: insufficient memory</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-NLEVELS-with-very-large-datasets-insufficient-memory/m-p/518284#M140256</link>
      <description>&lt;P&gt;I am working with many large datasets. Each dataset has millions of observations, and they range in size from 1GB to more than 100GB. For each variable, I need to generate the number of levels, missing levels, non-missing levels, number of observations, and the ratio of levels to observations. I am using the following code, but SAS frequently reports insufficient memory and stops processing. I would like to learn a more efficient approach that produces these variable descriptions without running out of memory. Any suggestions would be appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc&lt;/STRONG&gt; &lt;STRONG&gt;freq&lt;/STRONG&gt; nlevels data= mydata &amp;nbsp;;&lt;/P&gt;&lt;P&gt;&amp;nbsp; ods output nlevels=nlevels;&lt;/P&gt;&lt;P&gt;&amp;nbsp; tables _all_ / noprint ;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;run&lt;/STRONG&gt;;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;data&lt;/STRONG&gt; want ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; if &lt;STRONG&gt;0&lt;/STRONG&gt; then set mydata (drop=_all_) nobs=nobs ;&lt;/P&gt;&lt;P&gt;&amp;nbsp; set nlevels;&lt;/P&gt;&lt;P&gt;&amp;nbsp; total=nobs;&lt;/P&gt;&lt;P&gt;&amp;nbsp; unique_ratio = nlevels/total ;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;run&lt;/STRONG&gt;;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;proc&lt;/STRONG&gt; &lt;STRONG&gt;print&lt;/STRONG&gt;;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;run&lt;/STRONG&gt;;&lt;/P&gt;</description>
      <pubDate>Tue, 04 Dec 2018 01:17:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-NLEVELS-with-very-large-datasets-insufficient-memory/m-p/518284#M140256</guid>
      <dc:creator>Marpole</dc:creator>
      <dc:date>2018-12-04T01:17:26Z</dc:date>
    </item>
    <item>
      <title>Re: Using NLEVELS with very large datasets: insufficient memory</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-NLEVELS-with-very-large-datasets-insufficient-memory/m-p/518289#M140259</link>
      <description>&lt;P&gt;1. You could increase the memory allocated to SAS.&lt;/P&gt;
&lt;P&gt;2. You could reduce the number of variables analysed in one go, for example:&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;tables _NUMERIC_&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;then&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&lt;SPAN&gt;tables _CHAR_&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;instead of&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&lt;SPAN&gt;tables _ALL_&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;
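&lt;P&gt;A minimal sketch of this split, assuming the dataset is called mydata as in the question. The output names nlev_num, nlev_char and nlevels_all are made up for illustration, and each step assumes the dataset has at least one variable of that type:&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;/* first pass: numeric variables only */&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;proc freq data=mydata nlevels;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; ods output nlevels=nlev_num;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; tables _numeric_ / noprint;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;run;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;/* second pass: character variables only */&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;proc freq data=mydata nlevels;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; ods output nlevels=nlev_char;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; tables _char_ / noprint;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;run;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;/* stack both results; the length statement avoids truncating variable names */&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;data nlevels_all;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; length TableVar $32;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;&amp;nbsp; set nlev_num nlev_char;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT face="courier new,courier"&gt;run;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;If memory is still short, the same pattern should work with smaller explicit variable lists in each tables statement.&lt;/P&gt;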
&lt;P&gt;&lt;SPAN&gt;3. You could exclude obviously high-cardinality variables. For example, BANK_ACCT_BALANCE will have a very high NLEVELS, and calculating it adds no value.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;4. PROC SQL lets you calculate the &lt;SPAN&gt;number of levels, missing levels, non-missing levels, number of observations, and the ratio of levels to observations in one step. You need to code the variable names explicitly, though, and the memory issue remains. But at least you need only one pass per variable.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 04 Dec 2018 20:54:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-NLEVELS-with-very-large-datasets-insufficient-memory/m-p/518289#M140259</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2018-12-04T20:54:46Z</dc:date>
    </item>
    <item>
      <title>Re: Using NLEVELS with very large datasets: insufficient memory</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-NLEVELS-with-very-large-datasets-insufficient-memory/m-p/518299#M140264</link>
      <description>And make sure to suppress any displayed output.</description>
      <pubDate>Tue, 04 Dec 2018 04:16:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-NLEVELS-with-very-large-datasets-insufficient-memory/m-p/518299#M140264</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-12-04T04:16:40Z</dc:date>
    </item>
  </channel>
</rss>

