I keep getting this error when I try to use a hash table:
ERROR: Duplicate key.
I have other tables that run fine, so presumably the code is fine, but this particular table will not work. I have done a proc sort nodupkey on exactly the same keys I am using and there are no duplicates in the table at all. So I don't understand why it says this particular table has a duplicate key.
This is the code:
data workjw.base2;
set workjw.base;
if _n_ = 0 then do;
set workjw.acct_tpe_bin;
end;
if _n_ = 1 then do;
declare hash b(dataset: "workjw.acct_tpe_bin", duplicate: "r");
rc2=b.definekey("CH_ACCT_NO_BIN", "ACCT_TPE_CDE");
rc2=b.definedata("ACC_DESC_BIN");
rc2=b.definedone();
b.add();
end;
rc2=b.find();
if rc2 ne 0 then do;
ACC_DESC_BIN="";
end;
run;
As soon as you remove b.add() things will work.
libname workjw (work);
data workjw.acct_tpe_bin;
CH_ACCT_NO_BIN=1;
ACCT_TPE_CDE=1;
ACC_DESC_BIN=1;
output;
CH_ACCT_NO_BIN=1;
ACCT_TPE_CDE=1;
ACC_DESC_BIN=1;
output;
run;
data workjw.base;
CH_ACCT_NO_BIN=1;
ACCT_TPE_CDE=1;
output;
CH_ACCT_NO_BIN=2;
ACCT_TPE_CDE=2;
output;
run;
data workjw.base2;
set workjw.base;
if _n_ = 0 then
do;
set workjw.acct_tpe_bin;
end;
if _n_ = 1 then
do;
declare hash b(dataset: "workjw.acct_tpe_bin");
rc2=b.definekey("CH_ACCT_NO_BIN", "ACCT_TPE_CDE");
rc2=b.definedata("ACC_DESC_BIN");
rc2=b.definedone();
/* b.add();*/
end;
rc2=b.find();
if rc2 ne 0 then
do;
ACC_DESC_BIN="";
end;
run;
SAS 9.4, running on a Windows Server.
I have also tried closing and starting in a new session, but that made no difference at all.
Using the following statement,
declare hash b(dataset: "workjw.acct_tpe_bin", duplicate: "r");
you are asking to add the last value, replacing the previous values. This means that if there are duplicate records, they will be happily replaced and no ERROR message will come.
Your Duplicate Error message is strange!
Place your LOG to see any clues.
I believe the 'duplicate: "r"' statement only refers to the initial load of the data set. The error is coming on the add() method. If you supply a return value for the error code on that e.g. rc=b.add() you won't get the error. Although you appear to be loading the same information twice anyway so I'm not sure why you need to call b.add() at all.
It just has this:
NOTE: There were 147 observations read from the data set workjw.acct_tpe_bin
ERROR: Duplicate key.
If I try sorting with nodupkey, it gives no errors:
2590 proc sort data=workjw.acct_tpe_bin force nodupkey; by ACCT_TPE_CDE CH_ACCT_NO_BIN ; run;
NOTE: There were 147 observations read from the data set WORKJW.ACCT_TPE_BIN.
NOTE: 0 observations with duplicate key values were deleted.
NOTE: The data set WORKJW.ACCT_TPE_BIN has 147 observations and 3 variables.
I checked that the field I am using as a key is the same length and format on both tables. They are.
As I said above - the whole data set is loaded into the hash table when it is declared - you don't need to exclude the duplicate key records. You are then trying to add the first record into the hash again with b.add() which is where the error comes
I'm repeating here what @ChrisBrooks already wrote but may be posting a bit of code will make the issue clearer to you.
Below statement...
declare hash b(dataset: "workjw.acct_tpe_bin");
...loads the whole source table into the hash. The default for a source table having duplicate keys is: Only the first occurence of a key combination will get loaded into the hash. Any later duplicate key records will be ignored. The process will run without error or warning.
The problematic part in your code is: b.add();
if _n_ = 1 then do; declare hash b(dataset: "workjw.acct_tpe_bin"); ..... b.add(); end;
You've already loaded the whole table deduped into the hash. Now with this b.add() you're instructing SAS to load observation 1 (if _n_=1) once more into the hash. ...but you've loaded the table already via the dataset:<dataset> directive in the declare statement so b.add() is now trying to load a duplicate. That's what causes the error.
The default setting for a hash table is to not accept duplicate keys:
- Remove command b.add() and everything will work as expected.
- No need to pre-sort and dedupe the source table for the hash prior to loading. Statement Declare Hash will take care of this.
And here a bit a shorter and working code version:
data workjw.base2;
set workjw.base;
if _n_ = 1 then
do;
if 0 then set workjw.acct_tpe_bin(keep=CH_ACCT_NO_BIN ACCT_TPE_CDE ACC_DESC_BIN);
declare hash b(dataset: "workjw.acct_tpe_bin");
b.definekey("CH_ACCT_NO_BIN", "ACCT_TPE_CDE");
b.definedata("ACC_DESC_BIN");
b.definedone();
end;
if b.find() then call missing(ACC_DESC_BIN);
run;
In addition to the observations made by others, why not you try the following stripped code and see whether you still get the ERROR.
data workjw.base2;
if _n_ = 0 then do;
set workjw.acct_tpe_bin;
end;
if _n_ = 1 then do;
declare hash b(dataset: "workjw.acct_tpe_bin");
rc2=b.definekey("CH_ACCT_NO_BIN", "ACCT_TPE_CDE");
rc2=b.definedata("ACC_DESC_BIN");
rc2=b.definedone();
end;
rc = b.output(dataset:'out_01');
run;
As soon as you remove b.add() things will work.
libname workjw (work);
data workjw.acct_tpe_bin;
CH_ACCT_NO_BIN=1;
ACCT_TPE_CDE=1;
ACC_DESC_BIN=1;
output;
CH_ACCT_NO_BIN=1;
ACCT_TPE_CDE=1;
ACC_DESC_BIN=1;
output;
run;
data workjw.base;
CH_ACCT_NO_BIN=1;
ACCT_TPE_CDE=1;
output;
CH_ACCT_NO_BIN=2;
ACCT_TPE_CDE=2;
output;
run;
data workjw.base2;
set workjw.base;
if _n_ = 0 then
do;
set workjw.acct_tpe_bin;
end;
if _n_ = 1 then
do;
declare hash b(dataset: "workjw.acct_tpe_bin");
rc2=b.definekey("CH_ACCT_NO_BIN", "ACCT_TPE_CDE");
rc2=b.definedata("ACC_DESC_BIN");
rc2=b.definedone();
/* b.add();*/
end;
rc2=b.find();
if rc2 ne 0 then
do;
ACC_DESC_BIN="";
end;
run;
Thanks. I was playing around and changed the add to say "rc2=b.add();" and that worked.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.