left
Obsidian | Level 7

I'm trying to set up several hash objects and iterate through them with a "hash of hashes" object. I want to load the same dataset each time, but with different subsets (WHERE clause empty or with some restriction), so I expect hash objects of different sizes.

For some reason the loaded WHERE clause is not in effect (see the log output below). I can't find what I'm missing.

data a;
    do id=1 to 50 by 1;
        value=19;
        output;
    end;
run;

data _null_;
    if 0 then set a;
    if _n_=1 then do;
        declare hash hoh(multidata: "yes");
        hoh.defineKey("table", "where");
        hoh.defineData("h", "table", "where");
        hoh.defineDone();
        declare hiter i_hoh("hoh");

        declare hash h();

        * For comparisons with input dataset --> "h" = rp19;
        table="a";
        where=" ";
        h = _new_ hash(dataset: catx(" ", table, where),
                       multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();

        table="a";
        where="(where=(id<20))";
        h = _new_ hash(dataset: catx(" ", table, where),
                       multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();

        table="a (where=(id>40))";
        where=" ";
        h = _new_ hash(dataset: catx(" ", table, where),
                       multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();
    end;
    * Iterate over "hash of hashes" object to reference the hash objects (rp19, rp23, rp49, rp79) like in an array;
    do while (i_hoh.next() = 0);
        rows = h.num_items;
        put (table rows) (=);
    end;
    stop;
run;

The log window gives me the following output:

[screenshot of log output: image.png]

1 ACCEPTED SOLUTION

Accepted Solutions
RichardDeVen
Barite | Level 11

You create a hash instance that leaks memory (albeit a very small leak):

declare hash h();   * create an uninitialized instance while adding h to the PDV;

In a HoH, the hash data element h only needs to be declared. Of course, the _new_ instance must exist before .add() or .replace() time. So simply declare the host reference h that will interact with your anonymous hashes:

declare hash h;    * add h, a hash reference, to the PDV;

The table variable should be $41 to accommodate the longest plain <libref>.<dataset> name, that being $8 + $1 + $32.  If table is also to accommodate data set options, you might want the length to be more like $200.  However, since the where clause is part of the key, you should exhibit discipline: assign only plain data set names to table and use the separate where variable, say $200, for loading subsets.
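
To make the length advice concrete, here is a small sketch (my illustration, not part of the original reply) showing how the two variables combine into the dataset specification passed to the hash constructor (the variable spec is hypothetical, used only for display):

```
data _null_;
    length table $41 where $200 spec $241;
    table = "work.have";                 * plain <libref>.<dataset>, max 8+1+32 = $41;
    where = "(where=(id<20))";           * data set options kept separate;
    spec  = catx(" ", table, where);     * the string handed to the hash constructor;
    put spec=;                           * spec=work.have (where=(id<20));
run;
```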

 

As stated elsewhere, the 'master' hash hoh does not need to be multidata in your situation.  Also, you do not need to repeatedly code table="<whatever>"; the value is not altered by a prior hoh.add().

 

Example

data have;
    do id=1 to 50 by 1;
        value=19;
        output;
    end;
run;

data _null_;
    if 0 then set have;

    length table $41;
    length where $200;

    if _n_=1 then do;
        declare hash hoh();
        hoh.defineKey("table", "where");
        hoh.defineData("h", "table", "where");
        hoh.defineDone();

        declare hiter i_hoh("hoh");

        declare hash h;

        table="have";
        where=" ";
        h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();

        where="(where=(id<20))";
        h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();

        where="(where=(id>40))"; 
        h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();
    end;

    do while (i_hoh.next() = 0);
        rows = h.num_items;
        put (table where rows) (=);
    end;

    stop;
run;

Log

NOTE: There were 50 observations read from the data set WORK.HAVE.
NOTE: There were 19 observations read from the data set WORK.HAVE.
      WHERE id<20;
NOTE: There were 10 observations read from the data set WORK.HAVE.
      WHERE id>40;
table=have where=(where=(id<20)) rows=19
table=have where=(where=(id>40)) rows=10
table=have where=  rows=50

 


5 REPLIES
left
Obsidian | Level 7
The comment lines (and parts of comment lines) can be ignored; they are irrelevant and possibly misleading:
--> * For comparisons with input dataset --> "h" = rp19;
--> (rp19, rp23, rp49, rp79)
FreelanceReinh
Jade | Level 19

Hi @left,

 

You just forgot to specify sufficient lengths for the character variables table and where. Because their first assignments (table="a"; where=" ";) are single characters, SAS gives both variables a length of $1, and the longer values are truncated to one character.

For example:

length table where $20;

(but I'm sure you actually know that).
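
A minimal sketch (an editorial illustration, not from the reply) of the truncation mechanism: without a LENGTH statement, SAS derives a character variable's length from its first assignment:

```
data _null_;
    table = "a";                    * first assignment defines the length -> $1;
    table = "a (where=(id<20))";    * later, longer value is truncated to $1;
    put table=;                     * prints table=a;
run;
```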

mkeintz
PROC Star

Also, I don't think you need multidata:"yes" for hoh once you have corrected the lengths of table and where.

left
Obsidian | Level 7

Hi @mkeintz,

 

In my first attempt I built the "hoh" in the following manner: 

 

declare hash hoh(multidata: "no");
hoh.defineKey("table");
hoh.defineData("h", "table");
hoh.defineDone();

That resulted in an error in the log:

NOTE: There were 50 observations read from the data set WORK.A.
NOTE: There were 19 observations read from the data set WORK.A.
      WHERE id<20;
ERROR: Duplicate key.
NOTE: There were 10 observations read from the data set WORK.A.
      WHERE id>40;
ERROR: Duplicate key.
table=a rows=50
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.04 seconds

Therefore I changed the multidata option from "no" to "yes":

declare hash hoh(multidata: "yes");
hoh.defineKey("table");
hoh.defineData("h", "table");
hoh.defineDone();

This results in the following log output:

NOTE: There were 50 observations read from the data set WORK.A.
NOTE: There were 19 observations read from the data set WORK.A.
      WHERE id<20;
NOTE: There were 10 observations read from the data set WORK.A.
      WHERE id>40;
table=a rows=50
table=a rows=19
table=a rows=10
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.04 seconds

 

 

However, splitting the key into two parts ("table" + "where") removes the need for multidata: "yes", as you highlighted:

declare hash hoh(multidata: "no");
hoh.defineKey("table", "where");
hoh.defineData("h", "table", "where");
hoh.defineDone();

 

Therefore the multidata option can be omitted (its default is "no"):

declare hash hoh();
hoh.defineKey("table", "where");
hoh.defineData("h", "table", "where");
hoh.defineDone();

 
