Why do you have to use call missing(variables) when you define a hash table?
I thought that during the execution phase of a DATA step the variables are automatically assigned missing values, so why do you still have to use the CALL MISSING routine with a hash object?
Thanks
The purpose of the CALL MISSING routine, which by the way can also be replaced by a LENGTH or RETAIN statement, is to prepare the PDV (program data vector) for hash operations. Most hash operations, if not all, involve the PDV: data is first pulled from the hash object into the PDV, and then you do something with it. That said, if the PDV is already set up for the hash, you don't always need CALL MISSING. One example:
data _null_;
  if _n_=1 then do;
    dcl hash h(dataset:'sashelp.class', multidata:'y');
    h.definekey('name');
    h.definedata('name');
    h.definedone();
  end;
  if h.find(key:'Jane')=0 then put "Found One";
  set sashelp.class(keep=name obs=0);
run;
Also note that set sashelp.class(keep=name obs=0); can be placed anywhere inside the DATA step, because it prepares the PDV at compile time.
Yeah, that's my understanding as well. I'm just curious why people put CALL MISSING and a LENGTH statement together...
CALL MISSING treats any variable it meets for the first time as numeric. Consider the following scenario: if v1 is numeric and v2 is character, a single CALL MISSING won't do it. You have to: 1. use a LENGTH statement to define the character type and the length of v2, and 2. call missing(v1, v2), or just call missing(v1) for that matter.
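To make the scenario above concrete, here is a minimal sketch. The lookup table work.lookup and its values are made up for illustration; only the variable names v1 and v2 come from the post. The LENGTH statement gives v2 its character type and length, and CALL MISSING then initializes both host variables, which also keeps the "variable is uninitialized" note out of the log.

```sas
/* Hypothetical lookup table: v2 is character, v1 is numeric */
data work.lookup;
  length v2 $8;
  v2 = 'a'; v1 = 1; output;
  v2 = 'b'; v1 = 2; output;
run;

data _null_;
  /* Create the host variables in the PDV with the intended types */
  length v2 $8 v1 8;
  if _n_ = 1 then do;
    declare hash h(dataset: 'work.lookup');
    h.defineKey('v2');
    h.defineData('v1');
    h.defineDone();
    /* Initialize both variables to missing */
    call missing(v1, v2);
  end;
  /* Look up key 'b'; on success v1 is filled from the hash */
  if h.find(key: 'b') = 0 then put v1=;
run;
```

With the made-up data above, the lookup for 'b' succeeds and the log shows v1=2.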
I have tested it out.
It seems you don't have to specify a length at the top for character and numeric variables to be able to use CALL MISSING for both.
That is because you have defined it in an implicit way: by assigning a value to it. Like I said, whatever you do, the purpose is to have the PDV ready before you operate on the hash. The danger of using a value instead of a standard LENGTH statement is that you need to make sure this value covers the longest length the variable will ever need. In this case, with prod='shoes', the length is 5. If your data is meant to hold prod='trousers', you will end up with only prod='trous'. This is why it is not common practice.
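The truncation is easy to reproduce in a short sketch. The values are the ones from the post; the step itself is only an illustration:

```sas
data _null_;
  /* First appearance of prod: it becomes character with length 5 */
  prod = 'shoes';
  /* 'trousers' needs 8 bytes, so only the first 5 are kept */
  prod = 'trousers';
  put prod=;
run;
```

The log shows prod=trous. Putting length prod $8; before the first assignment would avoid the truncation.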
Oh, maybe this is where the confusion lies. When I said "CALL MISSING treats any variable it meets for the first time as numeric," I did not mean the first time the variable is met by call missing(). I meant the first time the variable shows up anywhere in the DATA step. In your example, when CALL MISSING meets 'prod', that is the second time 'prod' shows up; the first time is where it was defined by being assigned the value 'shoes'. Hope this clears things up for you.
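One way to check this yourself is to compare the two cases with the VTYPE function, which returns N for numeric and C for character. This is a small sketch; the variable names x, type_x, and type_prod are made up for illustration:

```sas
data _null_;
  /* x shows up for the first time inside CALL MISSING */
  call missing(x);
  /* prod shows up first in an assignment, then in CALL MISSING */
  prod = 'shoes';
  call missing(prod);
  type_x = vtype(x);
  type_prod = vtype(prod);
  put type_x= type_prod=;
run;
```

If the description above holds, the log should report x as numeric and prod as character.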