I have never used hash objects before. I saw this piece of code included in many places within a big project. Can someone please help explain what each line means?
IF _n_ = 1 THEN DO;
Dcl Hash hh (dataset: 'work.CCEM');
hh.DefineKey ('tel);
hh.DefineDone ();
End;
It declares a hash object with one variable that acts as key. The object is populated from column tel in dataset work.ccem.
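1. on the first iteration of the data step (_n_=1), declare and instantiate the hash object hh, naming work.CCEM as the dataset to load it from
2. define tel as the key variable with definekey
3. complete the definition with definedone; this is also the point where work.CCEM is actually read into the hash object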
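To see what a block like this is usually there for, here is a minimal sketch of a lookup that uses it. Only work.CCEM and tel come from the question; the sample values, the dataset work.calls, the variable duration and the output dataset work.matched are made up for illustration, so treat this as one plausible use rather than what the project actually does.

```sas
/* Hypothetical lookup table: the name and the tel column come from the
   question, the values are invented. */
data work.CCEM;
  input tel $;
  datalines;
5550001
5550002
;
run;

/* Hypothetical input to be checked against the lookup table. */
data work.calls;
  input tel $ duration;
  datalines;
5550001 12
5559999 30
5550002 7
;
run;

data work.matched;
  set work.calls;                        /* puts tel into the PDV */
  if _n_ = 1 then do;
    dcl hash hh (dataset: 'work.CCEM');  /* declare and instantiate once */
    hh.DefineKey ('tel');                /* tel is the lookup key */
    hh.DefineDone ();                    /* work.CCEM is loaded here */
  end;
  if hh.check() = 0;                     /* keep only rows whose tel is in the hash */
run;
```

With this sample data, work.matched should keep the two rows whose tel values also appear in work.CCEM. Neither dataset has to be sorted, which is one common reason for using a hash object instead of a merge.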
If I could add a follow up to this:
I have a dataset 'abc' that was created in a SAS Grid location.
I'm creating another dataset 'xyz' from it using a hash table, e.g.:
data xyz;
if _n_=1 then do;
dcl hash hh (dataset: 'abc');
hh.definekey ('phoneno');
hh.definedone();
end;
followed by a bunch of keep statements
run;
when I try and run this, I get the following 2 errors:
ERROR: Undeclared key symbol phoneno for hash object
ERROR: DATA STEP Component Object failure. Aborted during the EXECUTION phase.
What does this even mean? What do I need to change to fix this? Sorry, I'm very new to hash tables.
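The key variable phoneno has to be known to the data step (be in the PDV), not just exist in abc. Assigning it a value is one way to get it there, and it also gives the find() method something to look up: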
data abc;
input phoneno $;
cards;
1234567
;
run;
data xyz;
if _n_=1 then do;
dcl hash hh (dataset: 'abc');
hh.definekey ('phoneno');
hh.definedone();
end;
*phoneno = '1234567';
found = hh.find();
run;
Run that code, and you'll get your ERROR. Remove the comment asterisk, and it'll be OK.
PS if you do this:
data xyz;
length phoneno $8;
if _n_=1 then do;
dcl hash hh (dataset: 'abc');
hh.definekey ('phoneno');
hh.definedone();
end;
found = hh.find();
run;
there's no ERROR, but you won't find anything, as the empty string is not present as a key in the hash.
If your reference variable is present in another dataset, it will also work:
data def;
input phoneno $;
cards;
1234567
9865443
;
run;
data xyz;
set def;
if _n_=1 then do;
dcl hash hh (dataset: 'abc');
hh.definekey ('phoneno');
hh.definedone();
end;
found = hh.find();
run;
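One thing to watch in the examples above: find() returns 0 when the key is found and a nonzero code when it is not, so a variable called found ends up holding 0 for a match. A small sketch of turning that into a readable flag, reusing abc and def from this thread (rc and in_abc are names I made up):

```sas
data abc;
  input phoneno $;
  cards;
1234567
;
run;

data def;
  input phoneno $;
  cards;
1234567
9865443
;
run;

data xyz;
  set def;                         /* supplies phoneno to the PDV */
  if _n_ = 1 then do;
    dcl hash hh (dataset: 'abc');
    hh.definekey ('phoneno');
    hh.definedone();
  end;
  rc = hh.find();                  /* 0 = key found, nonzero = not found */
  in_abc = (rc = 0);               /* hypothetical flag: 1 if phoneno is in abc */
  drop rc;
run;
```

Here in_abc should be 1 for 1234567 and 0 for 9865443.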
Just to add: Hash object hh will contain both a data item and a key item corresponding to variable tel, because the DefineData method is not used. (Or better: "would contain", if the closing single quote after "tel" wasn't missing.)
If you are an experienced data step programmer, I recommend that you familiarize yourself with the hash object. It's a really useful tool.
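To illustrate the DefineData point: if you want find() to bring back other columns than the key, you list them with DefineData. A minimal sketch, assuming work.CCEM had an extra column name (it may not; name, the sample values, work.calls and work.named are all made up for illustration):

```sas
/* Hypothetical lookup table with an extra data column. */
data work.CCEM;
  input tel $ name $;
  datalines;
5550001 Ann
5550002 Bob
;
run;

/* Hypothetical input to enrich. */
data work.calls;
  input tel $;
  datalines;
5550002
5559999
;
run;

data work.named;
  if 0 then set work.CCEM;     /* never executes: only puts tel and name in the PDV */
  set work.calls;
  if _n_ = 1 then do;
    dcl hash hh (dataset: 'work.CCEM');
    hh.DefineKey ('tel');      /* key item */
    hh.DefineData ('name');    /* data item copied back by find() */
    hh.DefineDone ();
  end;
  call missing(name);          /* clear the value left over from the previous row */
  rc = hh.find();              /* rc = 0: name is filled in from the hash */
run;
```

For 5550002 the step should fill in name as Bob; for 5559999 find() returns a nonzero rc and name stays blank.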
Instead of going that entire route, can I do a normal read from the dataset using
data xyz;
set abc;
keep ....
run;
But I was wondering why whoever wrote the code (he has left the company) added the hash table in the first place. What purpose was it serving?
If you don't have the phoneno variable in your 'abc' dataset then you will get this type of error. Make sure that variable is present in the dataset.
If it is present, then the error occurs because phoneno is not in the PDV (program data vector) of the data step that creates xyz. Add an appropriate length statement, or an if 0 then set abc, like this:
data abc;
phoneno=123;
run;
data xyz;
if 0 then set abc;
if _n_=1 then do;
dcl hash hh(dataset: 'abc');
hh.definekey('phoneno');
hh.definedone();
end;
run;
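The length statement alternative might look like the sketch below. It assumes phoneno is numeric, as in the abc created above; if your real phoneno is character, you would need something like length phoneno $8 with the actual length instead.

```sas
data abc;
  phoneno = 123;
run;

data xyz;
  length phoneno 8;                /* numeric, matching phoneno in abc */
  if _n_ = 1 then do;
    dcl hash hh (dataset: 'abc');
    hh.definekey ('phoneno');
    hh.definedone();
  end;
  call missing(phoneno);           /* avoids the "variable is uninitialized" note */
run;
```

The if 0 then set abc version has the advantage that type and length are copied from abc automatically, so they cannot drift out of sync with the hash key.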
Or simply remove the entire do group with the hash declaration if it does not serve a purpose.
Run a PROC CONTENTS step like this
proc contents data=abc;
run;
and verify that the variable phoneno is present in the data set abc.
Also, if the DO Group is only followed by a bunch of Keep Statements and the hash object hh is not used, simply remove the entire do group. No reason for him to have left it there.