whymath
Lapis Lazuli | Level 10

This is an example of submitting hash declare statements inside an "if _n_=1" block:

data want;
  if _n_=1 then do;
    declare hash h(dataset:"sashelp.class(where=((weight*0.4536)/(height*0.0254)**2>20))");
    h.definekey('age'); 
    h.definedone();
  end;
  do until(_eof_);
    set sashelp.class end=_eof_;
    bmi=(weight*0.4536)/(height*0.0254)**2;
    if h.check()=0 then output;
  end;
run;

The SAS log shows the lookup table was read once:

NOTE: There were 5 observations read from the data set SASHELP.CLASS.
      WHERE ((weight*0.4536)/((height*0.0254)**2))>20;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.WANT has 10 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.01 seconds

Removing the "if _n_=1" block:

data want;
  declare hash h(dataset:"sashelp.class(where=((weight*0.4536)/(height*0.0254)**2>20))");
  h.definekey('age'); 
  h.definedone();
  do until(_eof_);
    set sashelp.class end=_eof_;
    bmi=(weight*0.4536)/(height*0.0254)**2;
    if h.check()=0 then output;
  end;
run;

The SAS log shows the lookup table was read twice:

NOTE: There were 5 observations read from the data set SASHELP.CLASS.
      WHERE ((weight*0.4536)/((height*0.0254)**2))>20;
NOTE: There were 5 observations read from the data set SASHELP.CLASS.
      WHERE ((weight*0.4536)/((height*0.0254)**2))>20;
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.WANT has 10 observations and 6 variables.
NOTE: DATA statement used (Total process time):
      real time           0.02 seconds
      cpu time            0.01 seconds

What makes that difference?

Accepted Solutions
Patrick
Opal | Level 21

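This has nothing to do with the hash table definition; it comes from the way you're using the SET statement. With SET inside a DO UNTIL loop, the data step iterates twice: the first iteration reads every row, and the second starts again from the top, runs the DECLARE and definedone() statements again (reloading the lookup table), and ends only when SET tries to read past the end of the input.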

Add a STOP statement at the end of your data step to avoid this behaviour. 
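
As a rough sketch of what that change could look like (the original screenshot is not available, so this is a reconstruction built from the step in the question rather than the exact code that was posted):

```sas
data want;
  /* No _n_=1 guard: the hash would be rebuilt on every iteration of the step */
  declare hash h(dataset:"sashelp.class(where=((weight*0.4536)/(height*0.0254)**2>20))");
  h.definekey('age');
  h.definedone();
  do until(_eof_);
    set sashelp.class end=_eof_;
    bmi=(weight*0.4536)/(height*0.0254)**2;
    if h.check()=0 then output;
  end;
  stop; /* end the step here so a second iteration never starts */
run;
```

With the STOP in place, the log should report the filtered SASHELP.CLASS read only once, as in the "if _n_=1" version.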

whymath
Lapis Lazuli | Level 10
Thank you. I am shocked. Does that mean the supervisor returns control to the top of the data step after running the DOW-loop? What is this feature used for?
Tom
Super User

@whymath wrote:
Thank you. I am shocked. Does that mean the supervisor returns control to the top of the data step after running the DOW-loop? What is this feature used for?

That is how the data step works. It does not care (or even really know) that you put the SET statement inside a DO loop.

Most data steps are like your first one, they stop when they read past their inputs.
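
The logs in this reply read a data set named ONEOBS whose creation is not shown. A minimal sketch for reproducing them, assuming any single-row data set will do (the variable name x is hypothetical):

```sas
/* Hypothetical setup: a one-row data set for the examples in this reply */
data oneobs;
  x = 1;
run;
```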

data _null_;
  put _n_= eof= 'BEFORE';
  set oneobs end=eof;
  put _n_= eof= 'AFTER';
run;

_N_=1 eof=0 BEFORE
_N_=1 eof=1 AFTER
_N_=2 eof=1 BEFORE
NOTE: There were 1 observations read from the data set WORK.ONEOBS.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

But you can explicitly tell it when to stop, like in your second one.

data _null_;
  put _n_= eof= 'BEFORE';
  set oneobs end=eof;
  put _n_= eof= 'AFTER';
  stop;
run;

_N_=1 eof=0 BEFORE
_N_=1 eof=1 AFTER
NOTE: There were 1 observations read from the data set WORK.ONEOBS.
NOTE: DATA statement used (Total process time):
      real time           0.01 seconds
      cpu time            0.00 seconds

Or, if there are no inputs at all, it knows not to start a second iteration.

data _null_;
  put _n_= bmi= 'BEFORE';
  bmi=(180*0.4536)/(75*0.0254)**2;
  put _n_= bmi= 'AFTER';
run;

_N_=1 bmi=. BEFORE
_N_=1 bmi=22.498604997 AFTER
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds

And if there are inputs but you complete an iteration without actually advancing any of the inputs then it stops and says it is because of looping.

data _null_;
  put _n_= eof= 'BEFORE';
  if _N_=1 then set oneobs end=eof;
  put _n_= eof= 'AFTER';
run;

_N_=1 eof=0 BEFORE
_N_=1 eof=1 AFTER
_N_=2 eof=1 BEFORE
_N_=2 eof=1 AFTER
NOTE: DATA STEP stopped due to looping.
NOTE: There were 1 observations read from the data set WORK.ONEOBS.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
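
The same tracing idea can be applied to the pattern from the original question, with the SET statement inside a DO UNTIL loop. This is only an illustrative sketch, and the PUT labels are arbitrary:

```sas
data _null_;
  put _n_= 'TOP OF STEP';     /* runs at the start of every iteration of the step */
  do until(eof);
    set sashelp.class end=eof; /* on the second iteration this reads past the end and stops the step */
  end;
  put _n_= 'AFTER LOOP';      /* reached only when the loop finishes normally */
run;
```

The log should show TOP OF STEP for both _N_=1 and _N_=2 but AFTER LOOP only for _N_=1. Anything placed above the loop, such as an unguarded hash declaration, therefore runs twice unless the step ends with STOP.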
whymath
Lapis Lazuli | Level 10
I know this is not about the hash object; please see my other post.
