left
Obsidian | Level 7

I'm trying to set up several hash objects and iterate through them with a "hash of hashes" object. I want to use the same dataset but with different subsettings (an empty WHERE clause or one with some condition), so I expect to load hash objects of different sizes.

 

For some reason the loaded WHERE clause is not in effect (see the output screenshot below). I can't figure out what I'm missing.

 

 

data a;
    do id=1 to 50 by 1;
        value=19;
        output;
    end;
run;

data _null_;
    if 0 then set a;
    if _n_=1 then do;
        declare hash hoh(multidata: "yes");
        hoh.defineKey("table", "where");
        hoh.defineData("h", "table", "where");
        hoh.defineDone();
        declare hiter i_hoh("hoh");

        declare hash h();

        * For comparisons with input dataset --> "h" = rp19;
        table="a";
        where=" ";
        h = _new_ hash(dataset: catx(" ", table, where),
                       multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();

        table="a";
        where="(where=(id<20))";
        h = _new_ hash(dataset: catx(" ", table, where),
                       multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();

        table="a (where=(id>40))";
        where=" ";
        h = _new_ hash(dataset: catx(" ", table, where),
                       multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();
    end;
    * Iterate over "hash of hashes" object to reference the hash objects (rp19, rp23, rp49, rp79) like in an array;
    do while (i_hoh.next() = 0);
        rows = h.num_items;
        put (table rows) (=);
    end;
    stop;
run;

The log window gives me the following output:

[Screenshot: log output showing that the loaded WHERE clauses are not in effect.]

1 ACCEPTED SOLUTION
RichardDeVen
Barite | Level 11

You create a hash instance that leaks memory (albeit a very small leak):

declare hash h();   * create uninitialized instance whilst adding to PDV;

In a HoH, the hash data element h only needs to be declared; the _NEW_ instance just has to exist before .add() or .replace() is called. So simply declare the host reference h that will interact with your anonymous hashes:

declare hash h;    * add h, a hash reference, to the PDV;
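
As a runnable illustration of this pattern (a minimal sketch, not from the original reply; the key variable k is hypothetical):

data _null_;
    declare hash h;        * declare the reference only - no instance yet;
    h = _new_ hash();      * instantiate only when it is actually needed;
    h.defineKey("k");
    h.defineDone();
    k=1;
    rc=h.add();
    put rc=;               * rc=0 - the instance behaves as usual;
run;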

The table variable should be $41 to accommodate the longest plain <libref>.<dataset> name, that being $8 + $1 + $32. If the table value must also accommodate data set options, you might want the length to be more like $200. However, since the where clause is part of the key, you should exhibit discipline and assign only plain data set names to table and use the separate where variable, say $200, for loading subsets.

 

As stated elsewhere, the 'master' hash hoh does not need to be multidata in your situation. Also, you do not need to repeatedly code table="<whatever>"; the value is not altered by a prior hoh.add().

 

Example

data have;
    do id=1 to 50 by 1;
        value=19;
        output;
    end;
run;

data _null_;
    if 0 then set have;    * add HAVE's variables to the PDV without reading any rows;

    length table $41;
    length where $200;

    if _n_=1 then do;
        declare hash hoh();
        hoh.defineKey("table", "where");
        hoh.defineData("h", "table", "where");
        hoh.defineDone();

        declare hiter i_hoh("hoh");

        declare hash h;

        table="have";
        where=" ";
        h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();

        where="(where=(id<20))";
        h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();

        where="(where=(id>40))"; 
        h = _new_ hash(dataset: catx(" ", table, where), multidata: "no");
        h.defineKey("id");
        h.defineData(all: "yes");
        h.defineDone();
        hoh.add();
    end;

    do while (i_hoh.next() = 0);
        rows = h.num_items;
        put (table where rows) (=);
    end;

    stop;
run;

Log

NOTE: There were 50 observations read from the data set WORK.HAVE.
NOTE: There were 19 observations read from the data set WORK.HAVE.
      WHERE id<20;
NOTE: There were 10 observations read from the data set WORK.HAVE.
      WHERE id>40;
table=have where=(where=(id<20)) rows=19
table=have where=(where=(id>40)) rows=10
table=have where=  rows=50
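
Once loaded, a specific hash can also be fetched directly by its composite key rather than iterated. A minimal sketch (not part of the accepted answer; place it just before the STOP statement in the step above):

    table="have";
    where="(where=(id<20))";
    if hoh.find() = 0 then do;   * H now references the id<20 hash;
        id=5;
        rc=h.find();             * copies VALUE for id=5 into the PDV;
        put id= value= rc=;      * expected: id=5 value=19 rc=0;
    end;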

 


5 REPLIES
left
Obsidian | Level 7
The comment lines (or parts of them) can be ignored; they are irrelevant and maybe misleading:
--> * For comparisons with input dataset --> "h" = rp19;
--> (rp19, rp23, rp49, rp79)
FreelanceReinh
Jade | Level 19

Hi @left,

 

You just forgot to specify sufficient lengths for the character variables table and where, so the intended values are truncated to one character.

For example:

length table where $20;

(but I'm sure you actually know that).
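
For anyone wondering why one character: without a LENGTH statement, a DATA step character variable takes the length of the first value assigned to it. A minimal sketch (not from the original post):

data _null_;
    table="a";                 * first value assigned -> TABLE is $1;
    where=" ";                 * first value assigned -> WHERE is $1;
    where="(where=(id<20))";   * silently truncated to "(";
    len=lengthc(where);
    put table= where= len=;    * log: table=a where=( len=1;
run;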

mkeintz
PROC Star

Also, I don't think you have any need for multidata:"yes" for hoh, once you have corrected the lengths of table and where.

left
Obsidian | Level 7

Hi @mkeintz,

 

In my first attempt I built the "hoh" in the following manner: 

 

declare hash hoh(multidata: "no");
hoh.defineKey("table");
hoh.defineData("h", "table");
hoh.defineDone();

That resulted in errors in the log:

NOTE: There were 50 observations read from the data set WORK.A.
NOTE: There were 19 observations read from the data set WORK.A.
      WHERE id<20;
ERROR: Duplicate key.
NOTE: There were 10 observations read from the data set WORK.A.
      WHERE id>40;
ERROR: Duplicate key.
table=a rows=50
NOTE: The SAS System stopped processing this step because of errors.
NOTE: DATA statement used (Total process time):
      real time           0.07 seconds
      cpu time            0.04 seconds

Therefore I changed the multidata option from "no" to "yes":

declare hash hoh(multidata: "yes");
hoh.defineKey("table");
hoh.defineData("h", "table");
hoh.defineDone();

This results in the following log output:

NOTE: There were 50 observations read from the data set WORK.A.
NOTE: There were 19 observations read from the data set WORK.A.
      WHERE id<20;
NOTE: There were 10 observations read from the data set WORK.A.
      WHERE id>40;
table=a rows=50
table=a rows=19
table=a rows=10
NOTE: DATA statement used (Total process time):
      real time           0.06 seconds
      cpu time            0.04 seconds

 

 

However, splitting the key into two parts ("table" + "where") means multidata: "yes" is no longer required, as you highlighted:

declare hash hoh(multidata: "no");
hoh.defineKey("table", "where");
hoh.defineData("h", "table", "where");
hoh.defineDone();

 

The multidata option (which defaults to "no") can therefore be omitted entirely:

declare hash hoh();
hoh.defineKey("table", "where");
hoh.defineData("h", "table", "where");
hoh.defineDone();

 
