03-23-2015 07:01 PM
Why do you have to use call missing(variables) when defining a hash table?
I thought that during the execution phase of a data step, variables are automatically assigned missing values, so why do you still have to use the call missing routine with a hash object?
03-23-2015 08:04 PM
The purpose of the call missing routine, which btw can also be replaced by a Length/Retain statement, is to prepare the PDV for hash operations. Most hash operations (if not all) involve the PDV: a lookup, for example, first pulls the data from the hash object into the PDV, and then you do something with it. That being said, if your PDV is already set up for the hash, you don't need call missing. One example:
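data _null_;
  if _n_=1 then do;
    dcl hash h(dataset:'sashelp.class', multidata:'y');
    h.defineKey('name'); h.defineDone();  /* key is NAME; data defaults to the key */
  end;
  if h.find(key:'Jane')=0 then put "Found One";
  set sashelp.class(keep=name obs=0);     /* defines NAME in the PDV at compile time */
run;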
Also please note that set sashelp.class(keep=name obs=0); can be placed anywhere inside the data step, because it prepares the PDV at compile time.
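A related idiom, sketched here as an illustration rather than as part of the original answer, is if 0 then set. The set statement never executes, but the compiler still reads the variable attributes from the data set header, so NAME exists in the PDV before the hash is defined. The stop statement is added on the assumption that nothing else ends the step, since no set statement ever reads a row.

```sas
data _null_;
  /* Never executes, but NAME is still defined in the PDV at compile time */
  if 0 then set sashelp.class(keep=name);

  if _n_ = 1 then do;
    declare hash h(dataset: 'sashelp.class');
    h.defineKey('name');
    h.defineDone();
  end;

  if h.find(key: 'Jane') = 0 then put 'Found One';

  stop;  /* no input is read, so end the step explicitly */
run;
```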
03-23-2015 08:57 PM
Yeah, that's my understanding as well. Just curious why people put call missing and a length statement together...
03-23-2015 09:09 PM
Call missing will treat any variable it meets for the first time as numeric. Consider the following scenario: if you have v1 as numeric and v2 as character, then a single call missing won't do it. You have to: 1. use a length statement to define the character type and the length for v2; 2. call missing(v1, v2), or just v1 for that matter.
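A minimal sketch of that scenario, assuming a numeric key v1 and a character data item v2 (the length of 10, the hash name h, and the stored values are made up for illustration). The length statement fixes the types in the PDV, and call missing then initializes both variables, which also keeps the log free of "variable is uninitialized" notes.

```sas
data _null_;
  length v1 8 v2 $ 10;          /* declare type and length up front */

  if _n_ = 1 then do;
    declare hash h();
    h.defineKey('v1');
    h.defineData('v2');
    h.defineDone();
    call missing(v1, v2);       /* explicit initialization of the host variables */
  end;

  rc = h.add(key: 1, data: 'abc');   /* store one illustrative pair */
  rc = h.find(key: 1);               /* on success, v2 is filled from the hash */
  put rc= v2=;
run;
```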
03-23-2015 09:39 PM
I have tested it out.
It seems you don't have to specify a length at the top for character and numeric variables to be able to use call missing for both.
03-23-2015 09:40 PM
03-23-2015 09:46 PM
That is because you have defined it in an implicit way: by assigning a value to it. Like I said, whatever you do, the purpose is to have the PDV ready before you operate on the hash. The danger of using a value instead of a standard Length statement is that you need to make sure this value covers the longest length this variable will ever need. In this case, prod='shoes', the length is 5. If your data is meant to hold prod='trousers', you will end up with only prod='trous'. This is why it is not common practice.
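The truncation is easy to reproduce outside of any hash. The sketch below reuses prod from this thread; item and its length of 8 are assumptions added only for contrast.

```sas
data _null_;
  prod = 'shoes';       /* first reference: prod becomes character, length 5 */
  prod = 'trousers';    /* does not fit, so the value is truncated */
  put prod=;            /* writes prod=trous */

  length item $ 8;      /* explicit length, long enough for both values */
  item = 'shoes';
  item = 'trousers';
  put item=;            /* writes item=trousers */
run;
```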
03-23-2015 09:57 PM
Oh, maybe here is the confusion. When I say "Call missing will take any first time-met variables as Number", I do not mean the first time the variable is met by call missing(). I mean the first time the variable shows up anywhere in the data step. In your example, when call missing meets 'prod', this is the second time 'prod' shows up; the first time is where it was defined by being assigned the value 'shoes'. Hope this clears things up for you.
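One way to check this yourself is the VTYPE function, which reports a variable's type as N or C. In this sketch the names a, ta and tb are hypothetical: a first appears inside call missing, while prod has already been defined by an assignment.

```sas
data _null_;
  call missing(a);      /* first appearance of a: per the rule above, numeric */

  prod = 'shoes';       /* first appearance of prod: character, length 5 */
  call missing(prod);   /* second appearance: type is already fixed */

  ta = vtype(a);        /* 'N' for numeric, 'C' for character */
  tb = vtype(prod);
  put ta= tb=;
run;
```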