<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic HASH Issue in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354025#M82738</link>
    <description>&lt;P&gt;Dear SAS Users:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt; I have one dataset, x, with n=4,684,804 records and 1&amp;nbsp;variable, Receipt, with length $13.&lt;/P&gt;
&lt;P&gt;I have another dataset, y, with n=1.5404E8 records and 37 variables. Receipt has minimum length 3 and maximum length 13.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using the HASH technique to merge x and y on the Receipt column.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;DATA WORK.FILE_CDIM00;
  IF 0 THEN SET X;
  if _N_ = 1 then do;
    declare hash HASH_NAME(dataset: "Y", multidata: 'y');
    HASH_NAME.defineKEY("RECEIPT");
    HASH_NAME.defineData (ALL:'YES');
    HASH_NAME.defineDone();
  END;
  set Y(where= (13&amp;lt;=length(RECEIPT)&amp;lt;=13) ) ;
  IF HASH_NAME.FIND(KEY:RECEIPT) = 0 THEN OUTPUT;
RUN;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am getting the error shown below: &lt;BR /&gt; &lt;FONT color="#FF6600"&gt;WARNING: Multiple lengths were specified for the variable RECEIPT by input data set(s). This can cause truncation of data.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt; ERROR: Hash object added 2293744 items when memory failure occurred.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt; FATAL: Insufficient memory to execute DATA step program. Aborted during the EXECUTION phase.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#FF6600"&gt;Any suggestions, ideas, or clues?&lt;/FONT&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 27 Apr 2017 09:45:51 GMT</pubDate>
    <dc:creator>GPatel</dc:creator>
    <dc:date>2017-04-27T09:45:51Z</dc:date>
    <item>
      <title>HASH Issue</title>
      <link>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354025#M82738</link>
      <description>&lt;P&gt;Dear SAS Users:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt; I have one dataset, x, with n=4,684,804 records and 1&amp;nbsp;variable, Receipt, with length $13.&lt;/P&gt;
&lt;P&gt;I have another dataset, y, with n=1.5404E8 records and 37 variables. Receipt has minimum length 3 and maximum length 13.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am using the HASH technique to merge x and y on the Receipt column.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;DATA WORK.FILE_CDIM00;
  IF 0 THEN SET X;
  if _N_ = 1 then do;
    declare hash HASH_NAME(dataset: "Y", multidata: 'y');
    HASH_NAME.defineKEY("RECEIPT");
    HASH_NAME.defineData (ALL:'YES');
    HASH_NAME.defineDone();
  END;
  set Y(where= (13&amp;lt;=length(RECEIPT)&amp;lt;=13) ) ;
  IF HASH_NAME.FIND(KEY:RECEIPT) = 0 THEN OUTPUT;
RUN;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am getting the error shown below: &lt;BR /&gt; &lt;FONT color="#FF6600"&gt;WARNING: Multiple lengths were specified for the variable RECEIPT by input data set(s). This can cause truncation of data.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt; ERROR: Hash object added 2293744 items when memory failure occurred.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt; FATAL: Insufficient memory to execute DATA step program. Aborted during the EXECUTION phase.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#FF6600"&gt;Any suggestions, ideas, or clues?&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Apr 2017 09:45:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354025#M82738</guid>
      <dc:creator>GPatel</dc:creator>
      <dc:date>2017-04-27T09:45:51Z</dc:date>
    </item>
    <item>
      <title>Re: HASH Issue</title>
      <link>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354033#M82742</link>
      <description>&lt;P&gt;What does the data structure look like? (This is why posting test data is always a good idea.) &amp;nbsp;From your text, you seem to be saying that Receipt is length 13 in both datasets; however, the warning is telling you this is not the case. &amp;nbsp;Make sure you fix the length in your datasets. A structure that keeps changing is no good, as it can cause you all kinds of problems.&lt;/P&gt;
&lt;P&gt;As for your hash error, that's a large amount of data to put into memory, and you seem to have run out. &amp;nbsp;Is there a reason you need to use a hash for what seems like a simple data merge task?&lt;/P&gt;
&lt;PRE&gt;data file_cdim00;
  merge x (in=a) y (in=b);
  by receipt;
  if b then output;
run;&lt;/PRE&gt;
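&lt;P&gt;Note that a MERGE with a BY statement expects both inputs to be sorted by the BY variable. A minimal sketch of that preparatory step, assuming neither dataset is already sorted by receipt (the output names x_sorted and y_sorted are made up for illustration), might look like this:&lt;/P&gt;
&lt;PRE&gt;/* Sort copies of both datasets by the join key.       */
/* x_sorted and y_sorted are hypothetical output names. */
proc sort data=x out=x_sorted;
  by receipt;
run;

proc sort data=y out=y_sorted;
  by receipt;
run;&lt;/PRE&gt;
&lt;P&gt;The merge step would then read x_sorted and y_sorted instead of x and y. Sorting the larger dataset needs time and temporary disk space, but not the memory that a hash of the same data would.&lt;/P&gt;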
&lt;P&gt;Or you might be able to let SQL do it by:&lt;/P&gt;
&lt;PRE&gt;proc sql;
  create table FILE_CDIM00 as
  select  *
  from    X
  where  RECEIPT in (select RECEIPT from Y);
quit;&lt;/PRE&gt;
&lt;P&gt;It will of course take a long time to run; with just under 5 million rows you can expect that. &amp;nbsp;By working differently, however, you may be able to avoid this entirely. For example, you could use ranges of receipts rather than lists (I can't see your data here), something like:&lt;BR /&gt;receipt&lt;/P&gt;
&lt;P&gt;000000001&lt;/P&gt;
&lt;P&gt;000000010&lt;/P&gt;
&lt;P&gt;...&lt;/P&gt;
&lt;P&gt;range=1 to 10, so if receipt between input(min(receipt),best.) and input(max(receipt),best.) then output.&lt;/P&gt;</description>
      <pubDate>Thu, 27 Apr 2017 10:22:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354033#M82742</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2017-04-27T10:22:00Z</dc:date>
    </item>
    <item>
      <title>Re: HASH Issue</title>
      <link>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354034#M82743</link>
      <description>&lt;P&gt;Don't use a hash for that dataset, it is simply too large.&lt;/P&gt;
&lt;P&gt;Use the common technique of sorting and merging.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The way I read your code, you are reading Y (with all variables) into the hash and then merging it with itself. Didn't you want to read X into the hash for lookup?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
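&lt;P&gt;If a hash is still wanted, a rough sketch of the lookup direction described above could look like the following. It assumes that X holds only the receipt key, that receipt has the same type and length in both datasets, and that the key list from X fits in memory; the object name h is arbitrary.&lt;/P&gt;
&lt;PRE&gt;data work.file_cdim00;
  if _n_ = 1 then do;
    /* Load only the small key list X into memory */
    declare hash h(dataset: "x");
    h.defineKey("receipt");
    h.defineDone();
  end;
  /* Stream the large dataset Y through the step */
  set y;
  /* check() returns 0 when the current receipt is a key in the hash */
  if h.check() = 0 then output;
run;&lt;/PRE&gt;
&lt;P&gt;Because only the key is stored, the hash needs memory for about 4.7 million short strings rather than for every row and variable of Y.&lt;/P&gt;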
&lt;P&gt;Before the operation, you should make sure that your datasets have identical attributes for variable receipt (type character, length 13). That will prevent the WARNING.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
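&lt;P&gt;One possible way to check and align the attributes, as a sketch only: PROC CONTENTS lists the declared type and length in each dataset, and a LENGTH statement placed before SET fixes the length in a copy. The name y_fixed is hypothetical, and $13 assumes no receipt value in Y is longer than 13 characters; SAS may still warn if the declared length in Y is longer.&lt;/P&gt;
&lt;PRE&gt;/* Inspect the declared type and length of receipt in each dataset */
proc contents data=x;
run;

proc contents data=y;
run;

/* Copy Y with receipt declared as $13 (y_fixed is a hypothetical name) */
data y_fixed;
  length receipt $ 13;
  set y;
run;&lt;/PRE&gt;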
&lt;P&gt;And your condition&lt;/P&gt;
&lt;PRE&gt;where= (13&amp;lt;=length(RECEIPT)&amp;lt;=13)&lt;/PRE&gt;
&lt;P&gt;is equivalent to&lt;/P&gt;
&lt;PRE&gt;where= (length(RECEIPT)=13)&lt;/PRE&gt;</description>
      <pubDate>Thu, 27 Apr 2017 10:24:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354034#M82743</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2017-04-27T10:24:16Z</dc:date>
    </item>
    <item>
      <title>Re: HASH Issue</title>
      <link>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354099#M82775</link>
      <description>&lt;PRE&gt;
Try replacing

IF 0 THEN SET X; 

with

IF 0 THEN SET Y; 
IF 0 THEN SET X; 

&lt;/PRE&gt;</description>
      <pubDate>Thu, 27 Apr 2017 13:11:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354099#M82775</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2017-04-27T13:11:23Z</dc:date>
    </item>
    <item>
      <title>Re: HASH Issue</title>
      <link>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354105#M82778</link>
      <description>Not clear, Sharp. &lt;BR /&gt; Can you please assist.</description>
      <pubDate>Thu, 27 Apr 2017 13:27:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354105#M82778</guid>
      <dc:creator>GPatel</dc:creator>
      <dc:date>2017-04-27T13:27:38Z</dc:date>
    </item>
    <item>
      <title>Re: HASH Issue</title>
      <link>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354110#M82780</link>
      <description>&lt;P&gt;If I am right, the length of&amp;nbsp;&lt;SPAN&gt;RECEIPT in Y is longer than in X.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;So either you define it by hand&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;length&amp;nbsp;RECEIPT $ 200;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;or make sure&amp;nbsp;RECEIPT from Y is initialized before the one from X:&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;if 0 then set Y;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;if 0 then set X;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Also, judging from your log, your memory is too small to hold the data.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 27 Apr 2017 13:45:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/HASH-Issue/m-p/354110#M82780</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2017-04-27T13:45:53Z</dc:date>
    </item>
  </channel>
</rss>

