<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Errors in Key-Indexing Subset Using Array in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637073#M189363</link>
    <description>&lt;P&gt;Okay I found out that what we actually want is not to subset the claims but simply to left join on the ndc key and bring in the other fields from the ndc table.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So I wrote a hash left lookup like this, however my output gives me blank values in the entire dataset. Nothing is populated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data TEST(drop=rc:);
  if 0 then set ndc_table;
  if _N_ = 1 then do;
      declare hash h1(dataset:"ndc_table");
      h1.defineKey('NDC');
      h1.defineData(all:'Y');
      h1.defineDone();
  end;
  set claims;
  rc1=h1.find();
  if rc1 ne 0 then call missing(of _all_);
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Thu, 02 Apr 2020 20:54:50 GMT</pubDate>
    <dc:creator>PegaZeus</dc:creator>
    <dc:date>2020-04-02T20:54:50Z</dc:date>
    <item>
      <title>Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637025#M189330</link>
      <description>&lt;P&gt;Greetings,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So I have two tables, one of them with around 100,000,000 healthcare claims (claims), and one with a list of NDCs (ndc_table). I'm trying to use a key-indexing approach to subset to include only those claims associated with NDCs in the NDC table.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data final; 
  array ndc_list (*) _temporary_; 
  do until (eof1); 
    set ndc_table end=eof1; 
        ndc_list(ndc) = ndc; 
end;
do until (eof2); 
  set claims end=eof2; 
    array_search = ndc_list(ndc); 
    if array_search &amp;gt; . then output; 
end;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I'm getting 3 errors that I can't seem to figure out:&lt;/P&gt;&lt;P&gt;ERROR: The non-variable based array ndc_list has been defined with zero elements.&lt;/P&gt;&lt;P&gt;ERROR: Too many array subscripts specified for array ndc_list.&lt;/P&gt;&lt;P&gt;ERROR: Too many array subscripts specified for array ndc_list.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I used (*) because the number of NDCs in the ndc_table is going to change over time. Right now there are around 3500 values.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 19:06:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637025#M189330</guid>
      <dc:creator>PegaZeus</dc:creator>
      <dc:date>2020-04-02T19:06:18Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637037#M189337</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/320380"&gt;@PegaZeus&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Arrays in SAS need to be defined with a dimension. In your case you may want to choose a "large enough" value, e.g., 999999 if this is larger than the maximum NDC value you would expect. Or determine the maximum in a preliminary step and use it in the array definition.&amp;nbsp;&lt;A href="https://documentation.sas.com/?docsetId=lrcon&amp;amp;docsetTarget=n1b4cbtmb049xtn1vh9x4waiioz4.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" target="_blank" rel="noopener"&gt;SAS hash objects&lt;/A&gt; do not have this requirement. They grow as needed while they are loaded with data. So this would be an alternative approach.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 19:41:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637037#M189337</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2020-04-02T19:41:01Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637044#M189341</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/320380"&gt;@PegaZeus&lt;/a&gt;&amp;nbsp;what is the maximum value of ndc?&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 19:57:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637044#M189341</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2020-04-02T19:57:29Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637051#M189345</link>
      <description>&lt;P&gt;The max value of NDC would be 99999999999. This is not the number of ndcs in the ndc table though. In the ndc table there are about 3500 NDCs.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 20:13:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637051#M189345</guid>
      <dc:creator>PegaZeus</dc:creator>
      <dc:date>2020-04-02T20:13:44Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637055#M189348</link>
      <description>&lt;P&gt;So NDC can range from 1 to 99999999999? Or does 99999999999 have some special meaning?&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 20:26:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637055#M189348</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2020-04-02T20:26:46Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637058#M189351</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/320380"&gt;@PegaZeus&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;The max value of NDC would be 99999999999. This is not the number of ndcs in the ndc table though. In the ndc table there are about 3500 NDCs.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;This 11-digit number would likely be too large as an array dimension, so the key-indexing approach would need a modification or perhaps you just resort to the hash object approach shown below:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data final;
dcl hash h(dataset:'ndc_table');
h.definekey('ndc');
h.definedone();
do until(eof);
  set claims end=eof;
  if h.check()=0 then output;
end;
stop;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 02 Apr 2020 20:39:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637058#M189351</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2020-04-02T20:39:26Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637060#M189353</link>
      <description>&lt;P&gt;That's correct, except instead of just 1, it would be 00000000001. It has to have 11 digits.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 20:40:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637060#M189353</guid>
      <dc:creator>PegaZeus</dc:creator>
      <dc:date>2020-04-02T20:40:44Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637061#M189354</link>
      <description>&lt;P&gt;You are getting the error because you haven't told SAS how many elements the array will have.&amp;nbsp; Normally you don't have to tell SAS that because it can count the variables you listed.&amp;nbsp; But since you are making a temporary array there aren't any variables to count.&lt;/P&gt;
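&lt;P&gt;As a rough illustration of what an explicit dimension looks like, here is a minimal key-indexing sketch. It assumes a hypothetical lookup table small_keys and a hypothetical large table facts that share an integer variable id in the range 1 to 9999; none of these names come from the original post, and the approach only works when the key range is small enough to serve as an array subscript, which is not the case for 11-digit NDCs.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data subset;
  /* Explicit dimension: SAS cannot infer the size of a _temporary_ array. */
  array key_list (9999) _temporary_;
  /* Load phase: flag every id found in the hypothetical lookup table. */
  do until (eof1);
    set small_keys end=eof1;
    key_list(id) = 1;
  end;
  /* Search phase: keep rows of the hypothetical large table whose id was flagged. */
  do until (eof2);
    set facts end=eof2;
    /* The range check guards against an out-of-range subscript. */
    if id ge 1 and id le 9999 then if key_list(id) = 1 then output;
  end;
  stop;
run;&lt;/CODE&gt;&lt;/PRE&gt;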
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Use a HASH object instead.&lt;/P&gt;
&lt;P&gt;NDC codes are not a good domain to use as the index into an array, especially since they are best represented as TEXT variables so that you can put the hyphens in the right places.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 20:41:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637061#M189354</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-04-02T20:41:01Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637063#M189355</link>
      <description>&lt;P&gt;Ok. Definitely go with the Hash approach suggested by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/32733"&gt;@FreelanceReinh&lt;/a&gt;&amp;nbsp;then &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 20:44:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637063#M189355</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2020-04-02T20:44:34Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637067#M189357</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/32733"&gt;@FreelanceReinh&lt;/a&gt;&amp;nbsp;out of interest, what modification do you have in mind to approach this problem with key-indexing? Bitmapping?&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 20:47:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637067#M189357</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2020-04-02T20:47:58Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637073#M189363</link>
      <description>&lt;P&gt;Okay I found out that what we actually want is not to subset the claims but simply to left join on the ndc key and bring in the other fields from the ndc table.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So I wrote a hash left lookup like this, however my output gives me blank values in the entire dataset. Nothing is populated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data TEST(drop=rc:);
  if 0 then set ndc_table;
  if _N_ = 1 then do;
      declare hash h1(dataset:"ndc_table");
      h1.defineKey('NDC');
      h1.defineData(all:'Y');
      h1.defineDone();
  end;
  set claims;
  rc1=h1.find();
  if rc1 ne 0 then call missing(of _all_);
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 02 Apr 2020 20:54:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637073#M189363</guid>
      <dc:creator>PegaZeus</dc:creator>
      <dc:date>2020-04-02T20:54:50Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637074#M189364</link>
      <description>&lt;P&gt;Oh, nothing concrete yet. I only remembered those Dorfman papers about key-indexing and was thinking that the NDC numbers perhaps could be mapped to a much smaller range of integers (without collisions). But this would, of course, require deeper knowledge about the structure of those NDC numbers.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 20:55:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637074#M189364</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2020-04-02T20:55:27Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637077#M189367</link>
      <description>When FIND() fails you are setting ALL of the variables to missing, even those that came from the CLAIMS dataset.&lt;BR /&gt;If everything ends up missing then you got no hits.  Are you sure the NDC variable is in both datasets? Is the content in the same format?  Perhaps one is using NDC-10 and the other NDC-11.&lt;BR /&gt;</description>
      <pubDate>Thu, 02 Apr 2020 21:04:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637077#M189367</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-04-02T21:04:08Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637078#M189368</link>
      <description>&lt;P&gt;Ok. Thank you. Triggered my curiosity &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt; The Dorfman papers entered my mind as well.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 21:04:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637078#M189368</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2020-04-02T21:04:09Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637090#M189377</link>
      <description>&lt;P&gt;Interesting, you're right. What actually happened was I was using too small of a claims sample for testing, which didn't include any of the NDCs in the ndc_table. I updated the call missing to only the fields in that table now.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Apr 2020 21:32:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637090#M189377</guid>
      <dc:creator>PegaZeus</dc:creator>
      <dc:date>2020-04-02T21:32:37Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in Key-Indexing Subset Using Array</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637117#M189390</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/320380"&gt;@PegaZeus&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Okay I found out that what we actually want is not to subset the claims but simply to left join on the ndc key and bring in the other fields from the ndc table.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So I wrote a hash left lookup like this, however my output gives me blank values in the entire dataset. Nothing is populated.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data TEST(drop=rc:);
  if 0 then set ndc_table;
  if _N_ = 1 then do;
      declare hash h1(dataset:"ndc_table");
      h1.defineKey('NDC');
      h1.defineData(all:'Y');
      h1.defineDone();
  end;
  set claims;
  rc1=h1.find();
  if rc1 ne 0 then call missing(of _all_);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;If it's a "left join on the ndc key", then you need an "if test" to avoid outputing claims that don't match the ndc table.&amp;nbsp; So instead of&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;rc1=h1.find();
if rc1 ne 0 then call missing(of _all_);&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;you're probably better off with&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;if h1.find()=0  then output;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the NDC table is a minuscule fraction of the claims dataset, maybe there are a few cases that aren't all blank … but since your program produces a dataset with the same number of records as claims, maybe you didn't see them.&lt;/P&gt;
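&lt;P&gt;If, on the other hand, every claim should be kept and only the looked-up fields left blank for non-matching NDCs, a sketch along the following lines might do. The variable names drug_name and drug_class are hypothetical stand-ins for whatever other fields ndc_table actually contains.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
  /* Compile-time only: puts NDC and the lookup variables into the PDV. */
  if 0 then set ndc_table;
  if _N_ = 1 then do;
      declare hash h1(dataset:"ndc_table");
      h1.defineKey('NDC');
      h1.defineData(all:'Y');
      h1.defineDone();
  end;
  set claims;
  /* On a miss, blank only the lookup fields (hypothetical names),  */
  /* so values from the previous match are not carried forward and  */
  /* the variables that came from claims stay intact.               */
  if h1.find() ne 0 then call missing(drug_name, drug_class);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Because the lookup variables enter the step through a SET statement, they are retained across iterations, which is why the reset on a failed find is still needed.&lt;/P&gt;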
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And of course you can eliminate the "drop=rc:" option in the data statement, since no rc1 variable is created any more.&lt;/P&gt;</description>
      <pubDate>Fri, 03 Apr 2020 00:03:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Errors-in-Key-Indexing-Subset-Using-Array/m-p/637117#M189390</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-04-03T00:03:25Z</dc:date>
    </item>
  </channel>
</rss>

