I am trying to set up several hash objects and iterate through them with a "hash of hashes" (HoH) object. I want to load the same dataset several times, but with different subsets (a WHERE clause that is empty or has some restriction), so I expect the hashes to have different sizes.
For some reason the loaded WHERE clause has no effect (see output screenshot below), and I can't find what I'm missing.
data a;
  do id=1 to 50 by 1;
    value=19;
    output;
  end;
run;

data _null_;
  if 0 then set a;

  if _n_=1 then do;
    declare hash hoh(multidata: "yes");
    hoh.defineKey("table", "where");
    hoh.defineData("h", "table", "where");
    hoh.defineDone();
    declare hiter i_hoh("hoh");

    declare hash h();

    * For comparisons with input dataset --> "h" = rp19;
    table="a";
    where=" ";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();

    table="a";
    where="(where=(id<20))";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();

    table="a (where=(id>40))";
    where=" ";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();
  end;

  * Iterate over "hash of hashes" object to reference the hash objects (rp19, rp23, rp49, rp79) like in an array;
  do while (i_hoh.next() = 0);
    rows = h.num_items;
    put (table rows) (=);
  end;

  stop;
run;
The log window gives me the following output:

[log output screenshot]
You create a hash instance that memory leaks (albeit a very small leak):

declare hash h(); * create uninitialized instance whilst adding to pdv;

In a HoH the hash data element h only needs to be declared. Of course, the _new_ instance must exist prior to .add() or .replace() time. So simply declare the host reference h that will interact with your anonymous hashes:

declare hash h; * add h, a hash reference, to PDV;
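To make the leak concrete, here is a minimal runnable sketch (using the question's WORK.A data set; the commented line is the leaky variant):

data _null_;
  if 0 then set a;                    /* host variables ID, VALUE for defineData */
  /* declare hash h();  <- would instantiate an anonymous hash that is
     orphaned the moment _new_ reassigns h below (the small leak)      */
  declare hash h;                     /* reference only, no instance yet */
  h = _new_ hash(dataset: "a");
  h.defineKey("id");
  h.defineData(all: "yes");
  h.defineDone();
  n = h.num_items;
  put n=;                             /* n=50 */
  stop;
run;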
The table variable should be $41 to accommodate the longest plain <libref>.<dataset>, that being $8 + $1 + $32. If the table value is to further accommodate data set options, you might want the length to be more like $200. However, since the where clause is part of the key, you should exhibit discipline and assign only plain data set names to table and use the separate where variable, say $200, for loading subsets.
As stated elsewhere, the 'master' hash hoh does not need to be multidata in your situation. Also, you do not need to repeatedly code table="<whatever>"; the value will not be altered by a prior hoh.add().
Example
data have;
  do id=1 to 50 by 1;
    value=19;
    output;
  end;
run;

data _null_;
  if 0 then set have;

  length table $41;
  length where $200;

  if _n_=1 then do;
    declare hash hoh();
    hoh.defineKey("table", "where");
    hoh.defineData("h", "table", "where");
    hoh.defineDone();
    declare hiter i_hoh("hoh");

    declare hash h;

    table="have";
    where=" ";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();

    where="(where=(id<20))";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();

    where="(where=(id>40))";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();
  end;

  do while (i_hoh.next() = 0);
    rows = h.num_items;
    put (table where rows) (=);
  end;

  stop;
run;
Log
NOTE: There were 50 observations read from the data set WORK.HAVE.
NOTE: There were 19 observations read from the data set WORK.HAVE.
WHERE id<20;
NOTE: There were 10 observations read from the data set WORK.HAVE.
WHERE id>40;
table=have where=(where=(id<20)) rows=19
table=have where=(where=(id>40)) rows=10
table=have where= rows=50
Hi @left,
You just forgot to specify sufficient lengths for the character variables table and where, so the intended values are truncated to one character.
For example:
length table where $20;
(but I'm sure you actually know that).
Also, I don't think you have any need for multidata:"yes" for hoh, once you have corrected the lengths of table and where.
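The truncation is easy to reproduce: without a LENGTH statement, the compiler fixes each character variable's length at its first appearance, so later, longer values are cut off at run time. A minimal sketch:

data _null_;
  table = "a";                  /* compiler fixes TABLE at $1      */
  where = " ";                  /* compiler fixes WHERE at $1      */
  table = "a (where=(id>40))";  /* truncated to "a" at run time    */
  where = "(where=(id<20))";    /* truncated to "(" at run time    */
  put table= where=;            /* prints: table=a where=(         */
run;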
Hi @mkeintz,
In my first attempt I built the "hoh" in the following manner:
declare hash hoh(multidata: "no");
hoh.defineKey("table");
hoh.defineData("h", "table");
hoh.defineDone();
That resulted in an error in the log file:
NOTE: There were 50 observations read from the data set WORK.A.
NOTE: There were 19 observations read from the data set WORK.A.
WHERE id<20;
ERROR: Duplicate key.
NOTE: There were 10 observations read from the data set WORK.A.
WHERE id>40;
ERROR: Duplicate key.
table=a rows=50
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.04 seconds
Therefore I changed the multidata option from "no" to "yes":
declare hash hoh(multidata: "yes");
hoh.defineKey("table");
hoh.defineData("h", "table");
hoh.defineDone();
This results in the following log output:
NOTE: There were 50 observations read from the data set WORK.A.
NOTE: There were 19 observations read from the data set WORK.A.
WHERE id<20;
NOTE: There were 10 observations read from the data set WORK.A.
WHERE id>40;
table=a rows=50
table=a rows=19
table=a rows=10
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.04 seconds
However, splitting the key "table" into two parts ("table" + "where") removes the need for multidata: "yes", as you highlighted:
declare hash hoh(multidata: "no");
hoh.defineKey("table", "where");
hoh.defineData("h", "table", "where");
hoh.defineDone();
Therefore the multidata option can be omitted:
declare hash hoh();
hoh.defineKey("table", "where");
hoh.defineData("h", "table", "where");
hoh.defineDone();
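With table and where forming the key, a specific subset hash can also be retrieved directly instead of iterating. A minimal sketch, assuming the hoh above and the hashes loaded from WORK.A (find() copies the stored data portion, including the hash reference h, back into the PDV):

table = "a";
where = "(where=(id<20))";
if hoh.find() = 0 then do;
  rows = h.num_items;
  put "subset " where "has " rows "items";  /* rows=19 */
end;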