<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why my Temp Array unduplication utterly fails vs HASH in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601283#M173891</link>
    <description>&lt;P&gt;&lt;EM&gt;&amp;gt; am i to believe and learn ARRAY IN operator search is useless?&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562"&gt;@Kurt_Bremser&lt;/a&gt; said, "the search in the array is sequential, while the hash find uses a b-tree".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For each new account (there are 200,000), you are sequentially searching through 10 million array values.&lt;/P&gt;
&lt;P&gt;That's 2e5 * 1e7 = 2e12 searches. Two thousand billion searches. 2,000,000,000,000 sequential searches.&lt;/P&gt;
&lt;P&gt;Plus other searches for the other 33,000,000 records where a key is found earlier without scanning the whole array.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How do you expect this to be fast?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And even if magnitude wasn't an issue (it is), how could a sequential search be compared to an indexed hash search?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So useless? Absolutely not. The IN operator for an array has its place. But tools are only good when used within their specifications. An F1 car is useless off-road and a buggy is useless on the tarmac. Likewise, expecting to search an array the way you are trying can only fail.&lt;/P&gt;</description>
    <pubDate>Mon, 04 Nov 2019 02:10:09 GMT</pubDate>
    <dc:creator>ChrisNZ</dc:creator>
    <dc:date>2019-11-04T02:10:09Z</dc:date>
    <item>
      <title>Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601264#M173878</link>
      <description>&lt;P&gt;Why my Temp Array unduplication utterly fails vs HASH?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Objective : To get unique account_numbers&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Source data: 33 Million records&lt;/STRONG&gt;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;

/*Runs forever, not completing at all :( */
data w;
 do until(z);
  set aqr.rp_portfolio_&amp;amp;min-aqr.rp_portfolio_&amp;amp;max 
  (keep=account_number portfolio cra_flag period id_investor)
  end=z;
  where period&amp;gt;= 1439 and id_investor not in ('014', '015', '091', '092') or
  period&amp;lt;1439;
  array t(9999999) _temporary_;
  length port_name $30;
  if account_number in t then continue;
  select;
   when (portfolio='Mortgage') port_name='Treasury';
   when (portfolio='Resi' and cra_flag='Y') port_name='CRA';
   when (portfolio in('Resi','HELOAN Low/No') and cra_flag='N') port_name='Resi (No CRA)';
   when (portfolio='Specialty') port_name=portfolio;
   otherwise port_name=' ';
  end;
  if port_name&amp;gt;' ' then do;
   n+1;
   t(n)=account_number;
   output;
  end;
 end;
 stop;
 keep account_number port_name;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What's wrong with me or my eyes that isn't spotting something very obvious in the previous, whereas the &lt;STRONG&gt;HASH equivalent below&lt;/STRONG&gt; &lt;STRONG&gt;runs in&amp;nbsp; real time 6:49.53&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;user cpu time 1:26.36&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;system cpu time 24.94 seconds&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;memory 33221.57k&lt;/STRONG&gt;&lt;BR /&gt;&lt;STRONG&gt;OS Memory 58464.00k&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data w;
 dcl hash H () ;
 h.definekey  ("account_number") ;
 h.definedata ("account_number") ;
 h.definedone () ;
 do until(z);
  set aqr.rp_portfolio_&amp;amp;min-aqr.rp_portfolio_&amp;amp;max 
  (keep=account_number portfolio cra_flag period id_investor)
  end=z;
  where period&amp;gt;= 1439 and id_investor not in ('014', '015', '091', '092') or
  period&amp;lt;1439;
  length port_name $30;
  if h.check()=0 then continue;
  select;
   when (portfolio='Mortgage') port_name='Treasury';
   when (portfolio='Resi' and cra_flag='Y') port_name='CRA';
   when (portfolio in('Resi','HELOAN Low/No') and cra_flag='N') port_name='Resi (No CRA)';
   when (portfolio='Specialty') port_name=portfolio;
   otherwise port_name=' ';
  end;
  if port_name&amp;gt;' ' then do;
   h.add();
   output;
  end;
 end;
 stop;
 keep account_number port_name;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Any thoughts?&lt;/P&gt;</description>
      <pubDate>Sun, 03 Nov 2019 22:51:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601264#M173878</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-11-03T22:51:17Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601266#M173880</link>
      <description>&lt;P&gt;You could have more than 10 million unique accounts.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also, the search in the array is sequential, while the hash find uses a b-tree or something similarly performant.&lt;/P&gt;</description>
      <pubDate>Sun, 03 Nov 2019 22:55:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601266#M173880</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2019-11-03T22:55:50Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601268#M173882</link>
      <description>&lt;P&gt;Sir &lt;EM&gt;"&lt;/EM&gt;&lt;SPAN&gt;&lt;EM&gt;You could have more than 10 million unique accounts"&lt;/EM&gt;&amp;nbsp;- This is funny though, we wish we were big enough to have such a huge portfolio. However,&amp;nbsp; for a regional bank, our portfolio size isn't too bad either. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; Nonetheless, that did make me think more. I am taking 82 months of data (the min-max range), as you would have noticed in the SET statement. The successful HASH&amp;nbsp;output has given me 190,000, which is right.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Hmm, so am i to believe and&amp;nbsp; learn ARRAY IN operator search is useless?? Oh gosh!&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 03 Nov 2019 23:18:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601268#M173882</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-11-03T23:18:10Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601271#M173884</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Are your account numbers integers in the range&amp;nbsp; {-constant('exactint'):constant('exactint')}?&amp;nbsp; Probably not, since I think you would likely use direct array lookup in such a case.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;But are the account numbers mappable in a 1 to 1 correspondence to such a range?&amp;nbsp; If so, then instead of searching an array you could do direct lookup of the transformed account number.&lt;/P&gt;
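&lt;P&gt;As a minimal sketch of what direct lookup means, assuming (hypothetically) that every key is an integer between 1 and 10, the key itself can serve as the array index, so no search is needed. The data set names HAVE and WANT and the array name SEEN are illustrative only:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical small example: keys are integers in 1..10 */
data have;
  input acctno @@;
  cards;
8 3 8 3 10 3 6 5 3
;

data want;
  array seen(10) _temporary_;      /* one slot per possible key value */
  set have;
  if seen(acctno) then delete;     /* slot already flagged: duplicate */
  seen(acctno) = 1;                /* flag this key as seen */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Each record costs a single array access no matter how many keys are already stored; the price is one array slot per possible key value, so this only works when the key range is small enough to fit in memory.&lt;/P&gt;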
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Could you tell us the structure of your account numbers?&lt;/P&gt;</description>
      <pubDate>Sun, 03 Nov 2019 23:27:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601271#M173884</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2019-11-03T23:27:32Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601274#M173885</link>
      <description>&lt;P&gt;Looks something like this&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;1829647&lt;BR /&gt;7329386&lt;BR /&gt;7339930&lt;BR /&gt;9515503&lt;BR /&gt;10005882&lt;BR /&gt;10039832&lt;BR /&gt;10053965&lt;BR /&gt;10065076&lt;BR /&gt;10068369&lt;BR /&gt;10073393&lt;/P&gt;
&lt;P&gt;5777&lt;/P&gt;
&lt;P&gt;70396902245&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And all are stored as numeric.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 03 Nov 2019 23:33:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601274#M173885</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-11-03T23:33:53Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601276#M173886</link>
      <description>One item to note:&lt;BR /&gt;&lt;BR /&gt;The (keep= ) list applies to the last data set only.  For all the other source data sets, the program reads in all variables.  At least that's what it looks like.</description>
      <pubDate>Mon, 04 Nov 2019 00:45:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601276#M173886</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2019-11-04T00:45:32Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601277#M173887</link>
      <description>&lt;P&gt;Sir&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/4954"&gt;@Astounding&lt;/a&gt;&amp;nbsp; You nearly gave me a heart attack. Luckily, I checked the doc again before jumping out of the building, which the ARRAY IN is making me contemplate. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&amp;nbsp;&lt;/P&gt;
&lt;H4 class="xis-argument"&gt;(&lt;SPAN class="xis-userSuppliedValue"&gt;data-set-options&lt;/SPAN&gt;)&lt;/H4&gt;
&lt;DIV class="xis-argumentDescription"&gt;
&lt;P class="xis-paraSimpleFirst"&gt;specifies actions SAS is to take when it reads variables or observations into the program data vector for processing.&lt;/P&gt;
&lt;P class="xis-paraSimpleFirst"&gt;&lt;A href="https://documentation.sas.com/?docsetId=lestmtsref&amp;amp;docsetTarget=p00hxg3x8lwivcn1f0e9axziw57y.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en"&gt;https://documentation.sas.com/?docsetId=lestmtsref&amp;amp;docsetTarget=p00hxg3x8lwivcn1f0e9axziw57y.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en&lt;/A&gt;&lt;/P&gt;
&lt;TABLE class="xis-summary"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="xis-summaryTip"&gt;Tip&lt;/TD&gt;
&lt;TD class="xis-summaryText"&gt;&lt;FONT color="#339966"&gt;&lt;STRONG&gt;Data set options that apply to a data set list apply to all of the data sets in the list.&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="xis-summarySee"&gt;&lt;FONT color="#339966"&gt;&lt;STRONG&gt;See&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/TD&gt;
&lt;TD class="xis-summaryText"&gt;&lt;SPAN class="xis-xrefSee"&gt;&lt;SPAN class="xis-xrefText"&gt;For more information, see&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A class="ng-scope" tabindex="0" title="" href="https://documentation.sas.com/?docsetId=ledsoptsref&amp;amp;docsetTarget=n1tgswl0rz314fn1iac4vkbw86g0.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" data-docset-id="ledsoptsref" data-docset-version="9.4" data-original-href="n1tgswl0rz314fn1iac4vkbw86g0.htm"&gt;Definition of Data Set Options&lt;/A&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;in&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;I&gt;SAS Data Set Options: Reference&lt;/I&gt;&lt;SPAN class="xis-xrefText"&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;for a list of the data set options to use with input data sets.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;</description>
      <pubDate>Mon, 04 Nov 2019 00:56:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601277#M173887</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-11-04T00:56:34Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601278#M173888</link>
      <description>OK, sorry about that.  The last time I saw that part of the documentation, data set lists didn't exist.  I promise to take a second look and see what I can find.</description>
      <pubDate>Mon, 04 Nov 2019 01:08:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601278#M173888</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2019-11-04T01:08:33Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601280#M173889</link>
      <description>&lt;P&gt;I guess your tables are not sorted or indexed by account number?&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2019 01:44:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601280#M173889</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2019-11-04T01:44:54Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601282#M173890</link>
      <description>&lt;P&gt;Sir&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/462"&gt;@PGStats&lt;/a&gt;&amp;nbsp; &amp;nbsp;True. This is for that very purpose indeed. The simple HAVE and WANT demo below works exactly as intended, but when reading company data, it doesn't. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For example:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data have;
 input acctno;
 cards;
 8
 3
 8
 3
 10
 3
 6
 5
 3
 ;

data want;
 do until(z);
  set have end=z;
  array t(9) _temporary_;
  if acctno in t then continue;
  n+1;
  t(n)=acctno;
  output;
 end;
 drop n;
run;

proc print noobs;run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;DIV class="branch"&gt;
&lt;DIV&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="table" summary="Procedure Print: Data Set WORK.WANT" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;&lt;COLGROUP&gt; &lt;COL /&gt;&lt;/COLGROUP&gt;
&lt;THEAD&gt;
&lt;TR&gt;
&lt;TH class="r header" scope="col"&gt;acctno&lt;/TH&gt;
&lt;/TR&gt;
&lt;/THEAD&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;8&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;3&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;10&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;6&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="r data"&gt;5&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Mon, 04 Nov 2019 01:58:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601282#M173890</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-11-04T01:58:34Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601283#M173891</link>
      <description>&lt;P&gt;&lt;EM&gt;&amp;gt; am i to believe and learn ARRAY IN operator search is useless?&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562"&gt;@Kurt_Bremser&lt;/a&gt; said, "the search in the array is sequential, while the hash find uses a b-tree".&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For each new account (there are 200,000), you are sequentially searching through 10 million array values.&lt;/P&gt;
&lt;P&gt;That's 2e5 * 1e7 = 2e12 searches. Two thousand billion searches. 2,000,000,000,000 sequential searches.&lt;/P&gt;
&lt;P&gt;Plus other searches for the other 33,000,000 records where a key is found earlier without scanning the whole array.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How do you expect this to be fast?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And even if magnitude wasn't an issue (it is), how could a sequential search be compared to an indexed hash search?&lt;/P&gt;
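&lt;P&gt;For a sense of the difference, here is a minimal sketch of the keyed lookup on a small, hypothetical HAVE data set with a numeric ACCTNO variable (names are illustrative only). Each CHECK call goes to the key directly instead of walking the stored values one by one:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  input acctno @@;
  cards;
8 3 8 3 10 3 6 5 3
;

data want;
  if _n_ = 1 then do;
    declare hash h();          /* in-memory keyed table */
    h.defineKey('acctno');
    h.defineDone();
  end;
  set have;
  if h.check() ne 0;           /* nonzero return code: key not stored yet */
  h.add();                     /* remember the key; the record is output */
run;&lt;/CODE&gt;&lt;/PRE&gt;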
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So useless? Absolutely not. The IN operator for an array has its place. But tools are only good when used within their specifications. An F1 car is useless off-road and a buggy is useless on the tarmac. Likewise, expecting to search an array the way you are trying can only fail.&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2019 02:10:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601283#M173891</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2019-11-04T02:10:09Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601284#M173892</link>
      <description>&lt;P&gt;That's very disappointing to know it's sequential.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Does that mean WHICHC and WHICHN will also be &lt;STRIKE&gt;useless&lt;/STRIKE&gt; too? Do these also follow suit (i.e., sequential search)?&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2019 02:18:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601284#M173892</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-11-04T02:18:45Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601287#M173893</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/138205"&gt;@novinosrin&lt;/a&gt;&amp;nbsp;&amp;nbsp;Don't blame the tool. All tools have specifications and limitations. The key is to know these and use the tools in an optimal manner.&lt;/P&gt;
&lt;P&gt;You can also &lt;A href="https://communities.sas.com/t5/SASware-Ballot-Ideas/idb-p/sas_ideas" target="_self"&gt;suggest improvements&lt;/A&gt;, but that's another matter (and doesn't preclude knowing the specifications and limitations).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2019 02:41:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601287#M173893</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2019-11-04T02:41:12Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601288#M173894</link>
      <description>&lt;P&gt;Which begs the question: Why create a 10-million-values array to store 200,000 values?&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2019 22:51:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601288#M173894</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2019-11-04T22:51:48Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601290#M173895</link>
      <description>&lt;P&gt;Agree.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2019 02:49:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601290#M173895</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2019-11-04T02:49:01Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601295#M173900</link>
      <description>&lt;P&gt;OK, here's an approach that might be feasible, might not.&amp;nbsp; The issues .....&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It treats account number as numeric.&amp;nbsp; While you can easily work around that, the cost for that is not clear.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It utilizes the direct array lookup that&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/31461"&gt;@mkeintz&lt;/a&gt;&amp;nbsp;mentioned.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It needs a lot more memory.&amp;nbsp; Whether you can get that much is not clear ... on the order of 40,000 times what you are using now for the temporary array (about 1e11 elements of 30 bytes each versus 1e7 elements of 8 bytes each).&amp;nbsp; The array expands to 11 digits, since the largest account number you illustrated is 11 digits long.&amp;nbsp; The memory requirements might be shrinkable, since it appears that you don't really need a length of $30 for PORT_NAME.&lt;/P&gt;
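&lt;P&gt;A rough check of the memory involved, assuming 8 bytes per numeric array element and 30 bytes per character element and ignoring any overhead (the variable names are illustrative only):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  current_bytes  = 9999999 * 8;          /* numeric temporary array in the original step */
  proposed_bytes = 99999999999 * 30;     /* $30 temporary array indexed by account number */
  ratio = proposed_bytes / current_bytes;
  put current_bytes= comma20. proposed_bytes= comma20. ratio= comma12.;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;That works out to roughly 80 MB for the current array versus roughly 3 TB for the direct-lookup array.&lt;/P&gt;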
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I took out the DO loop, since I didn't see a good purpose for keeping it.&amp;nbsp; Perhaps its purpose is to maintain a top-to-bottom structure for the DATA step logic.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If all this is acceptable, it should be lightning fast at loading the values, and should take a bit more time to unload them.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data w;
  array t {99999999999} $ 30 _temporary_;
  if z then do account_number = 1 to 99999999999;
     if t{account_number} &amp;gt; ' ' then do;
        port_name = t{account_number};
        output;
     end;
  end;   
  set aqr.rp_portfolio_&amp;amp;min-aqr.rp_portfolio_&amp;amp;max 
  (keep=account_number portfolio cra_flag period id_investor)
  end=z;
  where period&amp;gt;= 1439 and id_investor not in ('014', '015', '091', '092') or
  period&amp;lt;1439;
  if t{account_number} &amp;gt; ' ' then return;
  select;
   when (portfolio='Mortgage') t{account_number}='Treasury';
   when (portfolio='Resi' and cra_flag='Y') t{account_number}='CRA';
   when (portfolio in('Resi','HELOAN Low/No') and cra_flag='N') 
         t{account_number}='Resi (No CRA)';
   when (portfolio='Specialty') t{account_number}='Specialty';
   otherwise;
  end;
 keep account_number port_name;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 04 Nov 2019 03:33:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601295#M173900</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2019-11-04T03:33:22Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601296#M173901</link>
      <description>&lt;P&gt;Is there some reason for using a cryptic test for missing character values?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;charvar &amp;gt; ' '&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Why not use something that is clearer?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;not missing(charvar)&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;charvar ne ' '&lt;/CODE&gt;&lt;/PRE&gt;
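&lt;P&gt;A small sketch of how the three tests can disagree, assuming an ASCII session and a hypothetical value that starts with a TAB ('09'x):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  length charvar $ 8;
  charvar  = '09'x || 'abc';            /* TAB followed by text */
  gt_blank = (charvar &amp;gt; ' ');        /* the test under discussion */
  not_miss = not missing(charvar);
  ne_blank = (charvar ne ' ');
  put gt_blank= not_miss= ne_blank=;    /* on ASCII: gt_blank=0 not_miss=1 ne_blank=1 */
run;&lt;/CODE&gt;&lt;/PRE&gt;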
&lt;P&gt;Note that the cryptic test will fail if the string is not empty but starts with a character that sorts before the space in the ASCII sequence, such as a TAB.&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2019 03:36:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601296#M173901</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-11-04T03:36:40Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601298#M173903</link>
      <description>&lt;P&gt;No reason, that's just how I've always done it.&amp;nbsp; As a general rule, calling a function takes longer, but I haven't tested for this case.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, this logic looks a little too complex:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;if t{account_number} &amp;gt; ' ' then return;
  select;
   when (portfolio='Mortgage') t{account_number}='Treasury';
   when (portfolio='Resi' and cra_flag='Y') t{account_number}='CRA';
   when (portfolio in('Resi','HELOAN Low/No') and cra_flag='N') 
         t{account_number}='Resi (No CRA)';
   when (portfolio='Specialty') t{account_number}='Specialty';
   otherwise;
   end;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;It should probably be simplified to remove the RETURN statement:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;if t{account_number} = ' ' then select;
   when (portfolio='Mortgage') t{account_number}='Treasury';
   when (portfolio='Resi' and cra_flag='Y') t{account_number}='CRA';
   when (portfolio in('Resi','HELOAN Low/No') and cra_flag='N') 
         t{account_number}='Resi (No CRA)';
   when (portfolio='Specialty') t{account_number}='Specialty';
   otherwise;  
end;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Is that the spot in the program you are referring to?&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2019 15:57:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601298#M173903</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2019-11-04T15:57:41Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601299#M173904</link>
      <description>&lt;P&gt;I was just asking if there was some special reason why that test would be better than the normal check for non blank value. Note using &amp;gt; will fail if the string starts with a character that is before space in the ASCII coding sequence.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note I have also seen other posts where someone used&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;numvar &amp;gt; .&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;As if it meant not missing.&amp;nbsp; Again, that test will fail for special missing values, since .A to .Z are all larger than the normal missing value.&amp;nbsp; .Z is the largest missing value.&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2019 03:45:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601299#M173904</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-11-04T03:45:21Z</dc:date>
    </item>
    <item>
      <title>Re: Why my Temp Array unduplication utterly fails vs HASH</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601300#M173905</link>
      <description>&lt;P&gt;I guess whatever is fastest is best ... in this particular case.&amp;nbsp; The program itself assigns all those character values, so I'm not worried about strange values that might appear in the data.&amp;nbsp; But I certainly have been guilty of that in some programs that I have written.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It reminds me of discussions about the range low - high for user-defined character formats.&amp;nbsp; There are definitely cases where you need to be aware of values lower than a blank.&lt;/P&gt;</description>
      <pubDate>Mon, 04 Nov 2019 03:46:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Why-my-Temp-Array-unduplication-utterly-fails-vs-HASH/m-p/601300#M173905</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2019-11-04T03:46:37Z</dc:date>
    </item>
  </channel>
</rss>

