<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Using Hash object to compare fields and create new variables runs insufficient memory in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918247#M361707</link>
    <description>&lt;P&gt;Hi all,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I used the hash object, as suggested in &lt;A href="https://communities.sas.com/t5/SAS-Programming/How-to-construct-a-new-variable-by-comparing-fields-in-two-data/m-p/844447#M333840" target="_self"&gt;my previous query&lt;/A&gt;, to create over 1000 &lt;A href="https://hcup-us.ahrq.gov/toolssoftware/ccsr/dxccsr.jsp" target="_self"&gt;clinical classifications software diagnosis&lt;/A&gt; and &lt;A href="https://hcup-us.ahrq.gov/toolssoftware/ccsr/prccsr.jsp#download" target="_self"&gt;procedure&lt;/A&gt; groups for the inpatient stay data. There are over 5 million observations (stays) in the stay file (stayfile) and over 30 million observations (claims) in the claim file (clmfile).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I tried the "set ... POINT=" statement as suggested &lt;A href="https://communities.sas.com/t5/SAS-Programming/Hash-object-with-Insufficient-memory-to-execute-data-step/td-p/440515" target="_self"&gt;here&lt;/A&gt;, but still ran into insufficient memory. My code is below.&lt;/P&gt;
&lt;P&gt;data stayfile_id;&lt;BR /&gt;set stayfile;&lt;BR /&gt;stay_id=cat(Person_ID,"_",put(stay_from_dt,date9.));&lt;BR /&gt;keep stay_id;&lt;BR /&gt;run;&lt;BR /&gt;data rid /view=rid; &lt;BR /&gt;set stayfile_id (keep=stay_id );&lt;BR /&gt;rid=_n_;&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;data want(drop = clm_beg_dt clm_end_dt &amp;amp;ccsr_dx_proc_var.);&lt;/P&gt;
&lt;P&gt;if _N_ = 1 then do;&lt;BR /&gt;dcl hash h(dataset : 'clmfile', multidata : 'Y');&lt;BR /&gt;h.definekey('Person_ID');&lt;BR /&gt;h.definedata(all : 'Y');&lt;BR /&gt;h.definedone();&lt;BR /&gt;end;&lt;/P&gt;
&lt;P&gt;set stayfile point=rid;&lt;/P&gt;
&lt;P&gt;if 0 then set clmfile ;&lt;BR /&gt;call missing(clm_beg_dt, clm_end_dt, &amp;amp;ccsr_dx_proc_var_comma.);&lt;BR /&gt;new_DXCCSR_BLD001=0;&lt;BR /&gt;new_DXCCSR_BLD002=0;&lt;BR /&gt;/*over 1000 similar equations here*/&lt;/P&gt;
&lt;P&gt;do while (h.do_over() = 0);&lt;BR /&gt;if clm_beg_dt &amp;gt;= Stay_from_dt and clm_end_dt &amp;lt;= Stay_Thru_dt then do;&lt;BR /&gt;new_DXCCSR_BLD001=DXCCSR_BLD001;&lt;BR /&gt;new_DXCCSR_BLD002=DXCCSR_BLD002;&lt;BR /&gt;/*over 1000 similar equations here*/&lt;BR /&gt;end;&lt;BR /&gt;end;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can anyone advise how to revise the code to deal with the memory problem?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you so much!&lt;/P&gt;
&lt;P&gt;L.&lt;/P&gt;</description>
    <pubDate>Wed, 28 Feb 2024 16:00:44 GMT</pubDate>
    <dc:creator>lichee</dc:creator>
    <dc:date>2024-02-28T16:00:44Z</dc:date>
    <item>
      <title>Using Hash object to compare fields and create new variables runs insufficient memory</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918247#M361707</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I used the hash object, as suggested in &lt;A href="https://communities.sas.com/t5/SAS-Programming/How-to-construct-a-new-variable-by-comparing-fields-in-two-data/m-p/844447#M333840" target="_self"&gt;my previous query&lt;/A&gt;, to create over 1000 &lt;A href="https://hcup-us.ahrq.gov/toolssoftware/ccsr/dxccsr.jsp" target="_self"&gt;clinical classifications software diagnosis&lt;/A&gt; and &lt;A href="https://hcup-us.ahrq.gov/toolssoftware/ccsr/prccsr.jsp#download" target="_self"&gt;procedure&lt;/A&gt; groups for the inpatient stay data. There are over 5 million observations (stays) in the stay file (stayfile) and over 30 million observations (claims) in the claim file (clmfile).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I tried the "set ... POINT=" statement as suggested &lt;A href="https://communities.sas.com/t5/SAS-Programming/Hash-object-with-Insufficient-memory-to-execute-data-step/td-p/440515" target="_self"&gt;here&lt;/A&gt;, but still ran into insufficient memory. My code is below.&lt;/P&gt;
&lt;P&gt;data stayfile_id;&lt;BR /&gt;set stayfile;&lt;BR /&gt;stay_id=cat(Person_ID,"_",put(stay_from_dt,date9.));&lt;BR /&gt;keep stay_id;&lt;BR /&gt;run;&lt;BR /&gt;data rid /view=rid; &lt;BR /&gt;set stayfile_id (keep=stay_id );&lt;BR /&gt;rid=_n_;&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;data want(drop = clm_beg_dt clm_end_dt &amp;amp;ccsr_dx_proc_var.);&lt;/P&gt;
&lt;P&gt;if _N_ = 1 then do;&lt;BR /&gt;dcl hash h(dataset : 'clmfile', multidata : 'Y');&lt;BR /&gt;h.definekey('Person_ID');&lt;BR /&gt;h.definedata(all : 'Y');&lt;BR /&gt;h.definedone();&lt;BR /&gt;end;&lt;/P&gt;
&lt;P&gt;set stayfile point=rid;&lt;/P&gt;
&lt;P&gt;if 0 then set clmfile ;&lt;BR /&gt;call missing(clm_beg_dt, clm_end_dt, &amp;amp;ccsr_dx_proc_var_comma.);&lt;BR /&gt;new_DXCCSR_BLD001=0;&lt;BR /&gt;new_DXCCSR_BLD002=0;&lt;BR /&gt;/*over 1000 similar equations here*/&lt;/P&gt;
&lt;P&gt;do while (h.do_over() = 0);&lt;BR /&gt;if clm_beg_dt &amp;gt;= Stay_from_dt and clm_end_dt &amp;lt;= Stay_Thru_dt then do;&lt;BR /&gt;new_DXCCSR_BLD001=DXCCSR_BLD001;&lt;BR /&gt;new_DXCCSR_BLD002=DXCCSR_BLD002;&lt;BR /&gt;/*over 1000 similar equations here*/&lt;BR /&gt;end;&lt;BR /&gt;end;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Can anyone advise how to revise the code to deal with the memory problem?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you so much!&lt;/P&gt;
&lt;P&gt;L.&lt;/P&gt;</description>
      <pubDate>Wed, 28 Feb 2024 16:00:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918247#M361707</guid>
      <dc:creator>lichee</dc:creator>
      <dc:date>2024-02-28T16:00:44Z</dc:date>
    </item>
    <item>
      <title>Re: Using Hash object to compare fields and create new variables runs insufficient memory</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918250#M361709</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/430334"&gt;@lichee&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A couple of changes that could help with the memory issue:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Use an explicit &lt;STRONG&gt;-memsize xG&lt;/STRONG&gt; SAS invocation option (where x is a number of gigabytes) to set how much memory the SAS process can use. On Linux/Windows the default is 2G. You can check your session's current setting by running the following:
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;Proc options option=memsize; run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;/LI&gt;
&lt;LI&gt;Explicitly specify the &lt;A title="HashExp" href="https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/ds2ref/n03n4ipuouac35n136fdcrccdao9.htm" target="_blank" rel="noopener"&gt;HashExp&lt;/A&gt; value in your hash object declaration (the hash table gets 2^n buckets). Default: 8, Max: 20&lt;/LI&gt;
&lt;/OL&gt;
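&lt;P&gt;For point 2, a minimal sketch of where HASHEXP goes in the declaration. The tiny clmfile below is hypothetical stand-in data so the step runs on its own, and hashexp: 16 is only an illustrative value. Note that HASHEXP sets the number of hash buckets, which mainly affects lookup speed; it is unlikely to shrink the memory needed to hold the items themselves, so limiting what is listed in definedata probably matters more.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical stand-in for clmfile so the sketch runs on its own */
data clmfile;
  input Person_ID clm_beg_dt :date9. clm_end_dt :date9.;
  format clm_beg_dt clm_end_dt date9.;
  datalines;
1 01JAN2024 03JAN2024
1 05JAN2024 06JAN2024
2 02FEB2024 04FEB2024
;
run;

data _null_;
  /* Never executed; only defines the host variables in the PDV */
  if 0 then set clmfile;
  /* hashexp: 16 requests 2**16 buckets (illustrative value, an assumption) */
  dcl hash h(dataset: 'clmfile', multidata: 'Y', hashexp: 16);
  h.definekey('Person_ID');
  /* Load only the variables the comparison needs, not all: 'Y' */
  h.definedata('clm_beg_dt', 'clm_end_dt');
  h.definedone();
  /* Report how many items ended up in the hash */
  n = h.num_items;
  put 'Items loaded: ' n;
  stop;
run;&lt;/CODE&gt;&lt;/PRE&gt;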
&lt;P&gt;Hope that helps,&lt;/P&gt;
&lt;P&gt;Ahmed&lt;/P&gt;</description>
      <pubDate>Wed, 28 Feb 2024 16:24:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918250#M361709</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2024-02-28T16:24:16Z</dc:date>
    </item>
    <item>
      <title>Re: Using Hash object to compare fields and create new variables runs insufficient memory</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918255#M361711</link>
      <description>&lt;P&gt;What is MEMSIZE currently?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Did you try increasing it, e.g. -MEMSIZE 8G?&lt;/P&gt;</description>
      <pubDate>Wed, 28 Feb 2024 16:29:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918255#M361711</guid>
      <dc:creator>data_null__</dc:creator>
      <dc:date>2024-02-28T16:29:01Z</dc:date>
    </item>
    <item>
      <title>Re: Using Hash object to compare fields and create new variables runs insufficient memory</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918259#M361712</link>
      <description>Thank you both!&lt;BR /&gt;&lt;BR /&gt;I just ran Proc options option=memsize; run; as Ahmed suggested.&lt;BR /&gt;&lt;BR /&gt; MEMSIZE=8589934592</description>
      <pubDate>Wed, 28 Feb 2024 16:37:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918259#M361712</guid>
      <dc:creator>lichee</dc:creator>
      <dc:date>2024-02-28T16:37:03Z</dc:date>
    </item>
    <item>
      <title>Re: Using Hash object to compare fields and create new variables runs insufficient memory</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918266#M361713</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/430334"&gt;@lichee&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So your SAS session can use at most 8 GB, as your -memsize value indicates.&lt;/P&gt;
&lt;P&gt;Why are you loading all 30 million records into the hash along with every variable in the data set?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;dcl hash h(dataset : 'clmfile', multidata : 'Y');
h.definekey('Person_ID');
&lt;STRONG&gt;h.definedata(all : 'Y');&lt;/STRONG&gt;
h.definedone();&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Try loading the smaller data set (stayfile) into the hash and looping through the records of your large data set (clmfile) instead.&lt;/P&gt;
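&lt;P&gt;One way this could look, as a sketch only: the stays go into the hash (key plus the two date variables), and clmfile is read sequentially, keeping the claims that fall inside a stay. The small stayfile and clmfile below are hypothetical stand-ins, clm_in_stay is an assumed output name, and find()/find_next() are used here in place of do_over(). The result has one row per matching claim, so it would still need collapsing to one row per stay afterwards.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical stand-in data so the sketch runs on its own */
data stayfile;
  input Person_ID Stay_from_dt :date9. Stay_Thru_dt :date9.;
  format Stay_from_dt Stay_Thru_dt date9.;
  datalines;
1 01JAN2024 10JAN2024
2 01FEB2024 05FEB2024
;
run;

data clmfile;
  input Person_ID clm_beg_dt :date9. clm_end_dt :date9. DXCCSR_BLD001;
  format clm_beg_dt clm_end_dt date9.;
  datalines;
1 02JAN2024 03JAN2024 1
1 15JAN2024 16JAN2024 1
2 02FEB2024 04FEB2024 0
;
run;

data clm_in_stay(drop=rc);
  if _n_ = 1 then do;
    /* Never executed; only defines the stay variables in the PDV */
    if 0 then set stayfile(keep=Person_ID Stay_from_dt Stay_Thru_dt);
    /* Hash holds the smaller file: one item per stay, dates only */
    dcl hash s(dataset: 'stayfile(keep=Person_ID Stay_from_dt Stay_Thru_dt)', multidata: 'Y');
    s.definekey('Person_ID');
    s.definedata('Stay_from_dt', 'Stay_Thru_dt');
    s.definedone();
  end;
  /* The large file is read sequentially and never held in memory */
  set clmfile;
  /* Walk every stay of this person; output the claim for each stay that contains it */
  rc = s.find();
  do while (rc = 0);
    if clm_beg_dt &amp;gt;= Stay_from_dt and clm_end_dt &amp;lt;= Stay_Thru_dt then output;
    rc = s.find_next();
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;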
&lt;P&gt;If you want to load the large data set into the hash instead, use the technique described on &lt;STRONG&gt;page 4&lt;/STRONG&gt; of this paper:&amp;nbsp;&lt;A href="https://www.lexjansen.com/nesug/nesug11/ld/ld01.pdf" target="_blank"&gt;https://www.lexjansen.com/nesug/nesug11/ld/ld01.pdf&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"&lt;EM&gt;Now imagine that a real-world file LOOKUP is so large that memory shortage would prevent the hash table from being loaded with the SAT variables alongside KEY, yet we still want to use the hash object for KEY look-up! The workaround, as noted above, is to leave the SAT variables in their original place on disk and instead, load a file record identifier variable RID into the data portion of the hash table H:&lt;/EM&gt; "&lt;/P&gt;
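&lt;P&gt;Applied to this thread, the quoted idea might be sketched as follows. This is an adaptation under assumptions, not the paper's exact code: the two small data sets are hypothetical stand-ins, only one of the 1000+ CCSR variables is shown, and find()/find_next() are used in place of do_over(). The hash stores just Person_ID and the claim's record number (rid); the claim variables stay on disk and are fetched with POINT= only when a key matches, which trades memory for random reads.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical stand-in data so the sketch runs on its own */
data stayfile;
  input Person_ID Stay_from_dt :date9. Stay_Thru_dt :date9.;
  format Stay_from_dt Stay_Thru_dt date9.;
  datalines;
1 01JAN2024 10JAN2024
2 01FEB2024 05FEB2024
;
run;

data clmfile;
  input Person_ID clm_beg_dt :date9. clm_end_dt :date9. DXCCSR_BLD001;
  format clm_beg_dt clm_end_dt date9.;
  datalines;
1 02JAN2024 03JAN2024 1
1 15JAN2024 16JAN2024 1
2 02FEB2024 04FEB2024 0
;
run;

data want(drop=rc clm_beg_dt clm_end_dt DXCCSR_BLD001);
  if _n_ = 1 then do;
    dcl hash h(multidata: 'Y');
    h.definekey('Person_ID');
    /* Data portion is only the record number, not the claim variables */
    h.definedata('rid');
    h.definedone();
    /* rid counts observations as clmfile is read once, key only */
    do rid = 1 by 1 until (eof);
      set clmfile(keep=Person_ID) end=eof;
      rc = h.add();
    end;
  end;
  set stayfile;
  new_DXCCSR_BLD001 = 0;
  /* For each claim of this person, fetch the full record from disk by rid */
  rc = h.find();
  do while (rc = 0);
    set clmfile point=rid;
    if clm_beg_dt &amp;gt;= Stay_from_dt and clm_end_dt &amp;lt;= Stay_Thru_dt then
      new_DXCCSR_BLD001 = DXCCSR_BLD001;
    rc = h.find_next();
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;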
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 28 Feb 2024 16:53:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Using-Hash-object-to-compare-fields-and-create-new-variables/m-p/918266#M361713</guid>
      <dc:creator>AhmedAl_Attar</dc:creator>
      <dc:date>2024-02-28T16:53:06Z</dc:date>
    </item>
  </channel>
</rss>

