BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JW2702
Calcite | Level 5

I keep getting this error when I try to use a hash table:

ERROR: Duplicate key.

 

I have other tables that run fine, so presumably the code is fine, but this particular table will not work. I have done a proc sort nodupkey on exactly the same keys I am using and there are no duplicates in the table at all. So I don't understand why it says this particular table has a duplicate key.

 

This is the code:

 

data workjw.base2;
set workjw.base;

 

if _n_ = 0 then do;

     set workjw.acct_tpe_bin;

end;

 

if _n_ = 1 then do;

     declare hash b(dataset: "workjw.acct_tpe_bin", duplicate: "r");
     rc2=b.definekey("CH_ACCT_NO_BIN", "ACCT_TPE_CDE");
     rc2=b.definedata("ACC_DESC_BIN");
     rc2=b.definedone();

     b.add();

end;

 

rc2=b.find();

 

if rc2 ne 0 then do;
ACC_DESC_BIN="";
end;

 

run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

@KachiM

As soon as you remove b.add() things will work.

libname workjw (work);

data workjw.acct_tpe_bin;
  CH_ACCT_NO_BIN=1;
  ACCT_TPE_CDE=1;
  ACC_DESC_BIN=1;
  output;
  CH_ACCT_NO_BIN=1;
  ACCT_TPE_CDE=1;
  ACC_DESC_BIN=1;
  output;
run;

data workjw.base;
  CH_ACCT_NO_BIN=1;
  ACCT_TPE_CDE=1;
  output;

  CH_ACCT_NO_BIN=2;
  ACCT_TPE_CDE=2;
  output;

run;



data workjw.base2;
  set workjw.base;

  if _n_ = 0 then
    do;
      set workjw.acct_tpe_bin;
    end;

  if _n_ = 1 then
    do;
      declare hash b(dataset: "workjw.acct_tpe_bin");
      rc2=b.definekey("CH_ACCT_NO_BIN", "ACCT_TPE_CDE");
      rc2=b.definedata("ACC_DESC_BIN");
      rc2=b.definedone();
/*      b.add();*/
    end;

  rc2=b.find();

  if rc2 ne 0 then
    do;
      ACC_DESC_BIN="";
    end;
run;

View solution in original post

9 REPLIES 9
JW2702
Calcite | Level 5

SAS 9.4, running on a Windows Server.

I have also tried closing and starting in a new session, but that made no difference at all.

KachiM
Rhodochrosite | Level 12

Using the following statement,

 

 declare hash b(dataset: "workjw.acct_tpe_bin", duplicate: "r");

 

you are asking to add the last value, replacing the previous values. This means that if there are duplicate records, they will be happily replaced and no ERROR message will come. 

 

Your Duplicate Error message is strange!

 

Place your LOG to see any clues.

ChrisBrooks
Ammonite | Level 13

I believe the 'duplicate: "r"' statement only refers to the initial load of the data set. The error is coming on the add() method. If you supply a return value for the error code on that e.g. rc=b.add() you won't get the error. Although you appear to be loading the same information twice anyway so I'm not sure why you need to call b.add() at all.

JW2702
Calcite | Level 5

It just has this:

 

NOTE: There were 147 observations read from the data set workjw.acct_tpe_bin
ERROR: Duplicate key.

 

If I try sorting with nodupkey, it gives no errors:

2590 proc sort data=workjw.acct_tpe_bin force nodupkey; by ACCT_TPE_CDE CH_ACCT_NO_BIN ; run;

NOTE: There were 147 observations read from the data set WORKJW.ACCT_TPE_BIN.
NOTE: 0 observations with duplicate key values were deleted.
NOTE: The data set WORKJW.ACCT_TPE_BIN has 147 observations and 3 variables.

 

I checked that the field I am using as a key is the same length and format on both tables. They are.

ChrisBrooks
Ammonite | Level 13

As I said above - the whole data set is loaded into the hash table when it is declared - you don't need to exclude the duplicate key records. You are then trying to add the first record into the hash again with b.add() which is where the error comes

Patrick
Opal | Level 21

@JW2702

I'm repeating here what @ChrisBrooks already wrote but may be posting a bit of code will make the issue clearer to you.

 

Below statement...

declare hash b(dataset: "workjw.acct_tpe_bin");

...loads the whole source table into the hash. The default for a source table having duplicate keys is: Only the first occurence of a key combination will get loaded into the hash. Any later duplicate key records will be ignored. The process will run without error or warning.

 

The problematic part in your code is: b.add();

  if _n_ = 1 then
    do;
      declare hash b(dataset: "workjw.acct_tpe_bin");
      .....
      b.add();
    end;

You've already loaded the whole table deduped into the hash. Now with this b.add() you're instructing SAS to load observation 1 (if _n_=1) once more into the hash. ...but you've loaded the table already via the dataset:<dataset> directive in the declare statement so b.add() is now trying to load a duplicate. That's what causes the error.

 

The default setting for a hash table is to not accept duplicate keys:

- Remove command b.add() and everything will work as expected.

- No need to pre-sort and dedupe the source table for the hash prior to loading. Statement Declare Hash will take care of this.

 

And here a bit a shorter and working code version:


data workjw.base2;
  set workjw.base;

  if _n_ = 1 then
    do;
      if 0 then set workjw.acct_tpe_bin(keep=CH_ACCT_NO_BIN ACCT_TPE_CDE ACC_DESC_BIN);
      declare hash b(dataset: "workjw.acct_tpe_bin");
      b.definekey("CH_ACCT_NO_BIN", "ACCT_TPE_CDE");
      b.definedata("ACC_DESC_BIN");
      b.definedone();
    end;

  if b.find() then call missing(ACC_DESC_BIN);

run;

 

KachiM
Rhodochrosite | Level 12

In addition to the observations made by others, why not you try the following stripped code and see whether you still get the ERROR.

 

data workjw.base2;
 
if _n_ = 0 then do;
     set workjw.acct_tpe_bin;
end;
 
if _n_ = 1 then do;
     declare hash b(dataset: "workjw.acct_tpe_bin");
     rc2=b.definekey("CH_ACCT_NO_BIN", "ACCT_TPE_CDE");
     rc2=b.definedata("ACC_DESC_BIN");
     rc2=b.definedone();
end;
 
rc = b.output(dataset:'out_01');
 
run;
Patrick
Opal | Level 21

@KachiM

As soon as you remove b.add() things will work.

libname workjw (work);

data workjw.acct_tpe_bin;
  CH_ACCT_NO_BIN=1;
  ACCT_TPE_CDE=1;
  ACC_DESC_BIN=1;
  output;
  CH_ACCT_NO_BIN=1;
  ACCT_TPE_CDE=1;
  ACC_DESC_BIN=1;
  output;
run;

data workjw.base;
  CH_ACCT_NO_BIN=1;
  ACCT_TPE_CDE=1;
  output;

  CH_ACCT_NO_BIN=2;
  ACCT_TPE_CDE=2;
  output;

run;



data workjw.base2;
  set workjw.base;

  if _n_ = 0 then
    do;
      set workjw.acct_tpe_bin;
    end;

  if _n_ = 1 then
    do;
      declare hash b(dataset: "workjw.acct_tpe_bin");
      rc2=b.definekey("CH_ACCT_NO_BIN", "ACCT_TPE_CDE");
      rc2=b.definedata("ACC_DESC_BIN");
      rc2=b.definedone();
/*      b.add();*/
    end;

  rc2=b.find();

  if rc2 ne 0 then
    do;
      ACC_DESC_BIN="";
    end;
run;

JW2702
Calcite | Level 5

Thanks. I was playing around and changed the add to say "rc2=b.add();" and that worked.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 2866 views
  • 0 likes
  • 4 in conversation