I'm trying to set up several hash objects and iterate through them with a "hash of hashes" object. I want to use the same dataset, but with different subsettings (WHERE clause empty or with some condition), so I expect to load hashes of different sizes.
For some reason the loaded WHERE clause is not in effect (see output screenshot below). I can't find what I'm missing.
data a;
  do id=1 to 50 by 1;
    value=19;
    output;
  end;
run;

data _null_;
  if 0 then set a;

  if _n_=1 then do;
    declare hash hoh(multidata: "yes");
    hoh.defineKey("table", "where");
    hoh.defineData("h", "table", "where");
    hoh.defineDone();
    declare hiter i_hoh("hoh");
    declare hash h();

    * For comparisons with input dataset --> "h" = rp19;
    table="a";
    where=" ";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();

    table="a";
    where="(where=(id<20))";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();

    table="a (where=(id>40))";
    where=" ";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();
  end;

  * Iterate over "hash of hashes" object to reference the hash objects (rp19, rp23, rp49, rp79) like in an array;
  do while (i_hoh.next() = 0);
    rows = h.num_items;
    put (table rows) (=);
  end;
  stop;
run;
The log window gives me the following output:
[screenshot of log output]
You create a hash instance that memory-leaks (albeit a very small leak):

declare hash h(); * create uninitialized instance whilst adding to PDV;

In a HoH, the hash data element h only needs to be declared. Of course, the _new_ instance must exist prior to .add() or .replace() time. So simply declare the host reference, h, that will interact with your anonymous hashes:

declare hash h; * add h, a hash reference, to the PDV;
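To make the leak concrete, a minimal sketch (the hash name and key variable k are illustrative, not from the original code) of what happens to the first instance:

data _null_;
  declare hash h();   * instantiates hash #1 while adding h to the PDV;
  h = _new_ hash();   * h now references hash #2; hash #1 is unreachable and leaks;
  h.defineKey("k");
  h.defineData("k");
  h.defineDone();
  k = .;              * host variable for the key/data (illustrative);
  h.delete();         * frees hash #2 only; hash #1 can no longer be deleted;
run;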
The table variable should be $41 to accommodate the longest plain <libref>.<dataset>, that being $8 + $1 + $32. If the table value is to further accommodate data set options, you might want the length to be more like $200. However, since the where clause is part of the key, you should exhibit discipline and assign only plain data set names to table and use the separate where variable, say $200, for loading subsets.
As stated elsewhere, the 'master' hash hoh does not need to be multidata in your situation. Also, you do not need to repeatedly code table="<whatever>"; the value will not be altered by a prior hoh.add().
Example
data have;
  do id=1 to 50 by 1;
    value=19;
    output;
  end;
run;

data _null_;
  if 0 then set have;
  length table $41;
  length where $200;

  if _n_=1 then do;
    declare hash hoh();
    hoh.defineKey("table", "where");
    hoh.defineData("h", "table", "where");
    hoh.defineDone();
    declare hiter i_hoh("hoh");
    declare hash h;

    table="have";
    where=" ";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();

    where="(where=(id<20))";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();

    where="(where=(id>40))";
    h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
    h.defineKey("id");
    h.defineData(all: "yes");
    h.defineDone();
    hoh.add();
  end;

  do while (i_hoh.next() = 0);
    rows = h.num_items;
    put (table where rows) (=);
  end;
  stop;
run;
Log
NOTE: There were 50 observations read from the data set WORK.HAVE.
NOTE: There were 19 observations read from the data set WORK.HAVE.
WHERE id<20;
NOTE: There were 10 observations read from the data set WORK.HAVE.
WHERE id>40;
table=have where=(where=(id<20)) rows=19
table=have where=(where=(id>40)) rows=10
table=have where= rows=50
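As a follow-up sketch (building on the HAVE data set from the example above; the (table, where) values and the id=5 lookup are illustrative): once the HoH is keyed by (table, where), a specific sub-hash can also be retrieved directly with hoh.find() instead of iterating:

data _null_;
  if 0 then set have;
  length table $41;
  length where $200;

  declare hash hoh();
  hoh.defineKey("table", "where");
  hoh.defineData("h", "table", "where");
  hoh.defineDone();
  declare hash h;

  table="have";
  where=" ";
  h = _new_ hash(dataset: catx(" ", table, where));
  h.defineKey("id");
  h.defineData(all: "yes");
  h.defineDone();
  hoh.add();

  where="(where=(id<20))";
  h = _new_ hash(dataset: catx(" ", table, where));
  h.defineKey("id");
  h.defineData(all: "yes");
  h.defineDone();
  hoh.add();

  * Retrieve the subset hash directly by its (table, where) key;
  table="have";
  where="(where=(id<20))";
  if hoh.find() = 0 then do;   * copies the matching hash reference into h;
    id = 5;
    if h.find() = 0 then put "Lookup in subset hash: " id= value=;
  end;
  stop;
run;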
Hi @left,
You just forgot to specify sufficient lengths for the character variables table and where, so the intended values are truncated to one character: without a LENGTH statement, each variable's length is fixed by its first assignment (table="a", where=" "), i.e., at $1.
For example:
length table where $20;
(but I'm sure you actually know that).
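A minimal sketch of the truncation (variable name and values are illustrative):

data _null_;
  table = "a";                  * first assignment fixes the length at $1;
  table = "a (where=(id>40))";  * silently truncated back to "a";
  put table=;                   * prints: table=a;
run;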
Also, I don't think you have any need for multidata:"yes" for hoh, once you have corrected the lengths of table and where.
Hi @mkeintz,
In my first attempt I built the "hoh" in the following manner:
declare hash hoh(multidata: "no");
hoh.defineKey("table");
hoh.defineData("h", "table");
hoh.defineDone();
That resulted in an error in the log file:
NOTE: There were 50 observations read from the data set WORK.A.
NOTE: There were 19 observations read from the data set WORK.A.
WHERE id<20;
ERROR: Duplicate key.
NOTE: There were 10 observations read from the data set WORK.A.
WHERE id>40;
ERROR: Duplicate key.
table=a rows=50
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.04 seconds
Therefore I changed the multidata option from "no" to "yes":
declare hash hoh(multidata: "yes");
hoh.defineKey("table");
hoh.defineData("h", "table");
hoh.defineDone();
This results in the following log output:
NOTE: There were 50 observations read from the data set WORK.A.
NOTE: There were 19 observations read from the data set WORK.A.
WHERE id<20;
NOTE: There were 10 observations read from the data set WORK.A.
WHERE id>40;
table=a rows=50
table=a rows=19
table=a rows=10
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.04 seconds
However, after splitting the key "table" into two parts ("table" + "where"), multidata: "yes" is no longer required, as you highlighted:
declare hash hoh(multidata: "no");
hoh.defineKey("table", "where");
hoh.defineData("h", "table", "where");
hoh.defineDone();
Therefore the multidata option can be omitted, since "no" is the default:
declare hash hoh();
hoh.defineKey("table", "where");
hoh.defineData("h", "table", "where");
hoh.defineDone();
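Side note (a hedged sketch, not from the thread): if you assign the return code of add() to a variable, a duplicate key no longer writes an ERROR and stops the step; you can detect and handle it yourself:

data _null_;
  length table $41 where $200;

  declare hash hoh();
  hoh.defineKey("table", "where");
  hoh.defineData("table", "where");
  hoh.defineDone();

  table="have";
  where=" ";
  rc = hoh.add();   * rc=0: first add succeeds;
  rc = hoh.add();   * duplicate key: rc ne 0, but no ERROR in the log;
  if rc ne 0 then put "Duplicate (table, where) key skipped: " table= where=;
run;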