gyambqt
Obsidian | Level 7

Why do you have to use call missing(variables) when you define a hash table?

I thought that during the execution phase of the data step, variables are automatically assigned missing values, so why do you still have to use the call missing routine with a hash?

Thanks

7 REPLIES
Haikuo
Onyx | Level 15

The purpose of the call missing routine, which btw can also be replaced by a Length/Retain statement, is to prepare the PDV (program data vector) for Hash operations. Most Hash operations (if not all) involve the PDV: they first pull data from the Hash object into the PDV and then do something with it. That being said, if your PDV is already set up for the Hash, you don't always need it. One example:

data _null_;
  if _n_=1 then do;
    dcl hash h(dataset:'sashelp.class', multidata:'y');
    h.definekey('name');
    h.definedata('name');
    h.definedone();
  end;
  if h.find(key:'Jane')=0 then put "Found One";
  set sashelp.class(keep=name obs=0);
run;

Also please note that set sashelp.class(keep=name obs=0); can be placed anywhere inside the data step, as it prepares the PDV at compile time.
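For comparison, here is a minimal sketch of the same lookup with the PDV prepared by a Length statement plus call missing instead of the set statement. It assumes name is stored as $8 in sashelp.class; adjust the length if your copy differs.

```sas
data _null_;
  /* Declare the host variable at compile time (assumed $8, as in sashelp.class) */
  length name $8;
  if _n_=1 then do;
    dcl hash h(dataset:'sashelp.class');
    h.definekey('name');
    h.definedata('name');
    h.definedone();
    /* Give name an explicit initial value so the log stays free of
       "variable is uninitialized" notes */
    call missing(name);
  end;
  if h.find(key:'Jane')=0 then put "Found One";
run;
```

Here the Length statement is what creates name in the PDV; call missing only initializes it.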

gyambqt
Obsidian | Level 7

Yeah, that's my understanding as well. Just curious why people put call missing and a length statement together...

Haikuo
Onyx | Level 15

Call missing will treat any variable it meets for the first time as numeric. Consider the following scenario: if you have v1 as a number and v2 as a char, then a single call missing won't do. You have to: 1. use a length statement to define the char type and the length for v2; 2. call missing on v1 and v2, or just on v1 for that matter.
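A rough sketch of that scenario, using sashelp.class as a stand-in: name plays the role of the char v2 and age the role of the numeric v1. The choice of dataset and variables is only for illustration.

```sas
data _null_;
  /* Step 1: the character variable gets its type and length explicitly */
  length name $8;
  if _n_=1 then do;
    dcl hash h(dataset:'sashelp.class');
    h.definekey('name');
    h.definedata('age');
    h.definedone();
    /* Step 2: age shows up here for the first time, so it becomes numeric */
    call missing(name, age);
  end;
  name='Jane';
  /* find() with no arguments uses the current value of the key variable */
  if h.find()=0 then put name= age=;
run;
```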

gyambqt
Obsidian | Level 7

I have tested it out.
It seems you don't have to specify a length at the top for character and numeric variables to be able to use call missing for both.

gyambqt
Obsidian | Level 7

data _null_;
  prod='shoes';
  invty=7498;
  sales=23759;
  call missing(prod,invty);
  put prod= invty= sales=;
run;

results:

prod= invty=. sales=23759

Haikuo
Onyx | Level 15

That is because you have defined it implicitly, by assigning a value to it. Like I said, whatever you do, the purpose is to have the PDV ready before you operate on the Hash. The danger of using a value instead of a standard Length statement is that you need to make sure this value covers the longest length this variable will ever need. In this case, with prod='shoes', the length is 5. If your data is meant to hold prod='trousers', you will end up with only prod='trous'. This is why it is not common practice.
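A small sketch of that truncation. The variable prod2 is made up here just to show the Length statement side by side.

```sas
data _null_;
  /* The first assignment fixes prod as character with length 5 */
  prod='shoes';
  /* A longer value is silently cut to 5 characters */
  prod='trousers';
  put prod=;        /* prints prod=trous */

  /* With an explicit length the full value fits */
  length prod2 $8;
  prod2='trousers';
  put prod2=;       /* prints prod2=trousers */
run;
```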

Haikuo
Onyx | Level 15

Oh, maybe here is the confusion. When I said "Call missing will take any first time-met variables as Number", I did not mean the first time the variable is met by call missing(). I meant the first time the variable shows up anywhere in the data step. In your example, when call missing meets 'prod', this is the second time 'prod' shows up; the first time is where it was defined by being assigned the value 'shoes'. Hope this clears things up for you.
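If you want to check this yourself, something like the sketch below should do it. The variables x and y are made up; vtype() returns 'N' for a numeric variable and 'C' for a character one.

```sas
data _null_;
  /* x appears for the first time inside call missing */
  call missing(x);
  /* y is defined as character by an assignment before call missing sees it */
  y='shoes';
  call missing(y);
  type_x=vtype(x);
  type_y=vtype(y);
  put type_x= type_y=;
run;
```

If the explanation above holds, type_x should come back as N and type_y as C.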
