DATA Step, Macro, Functions and more

The hash left join using the _N_=0 trick does produce unexpected results.

Reply
Contributor
Posts: 31

The hash left join using the _N_=0 trick does produce unexpected results.

The behavior of the hash left join using the  _N_=0 trick does produce unexpected results.

 

data hleftjoin;
  IF _N_=0 THEN SET RightData;
 IF _N_=1 THEN DO;
  declare hash lookupdata;
                lookupdata = _new_ hash(dataset:"RightData", ordered:"yes");
		lookupdata.defineKey('Key1','Key2');
		lookupdata.defineData('leftCol1','leftCol2');
		lookupdata.defineDone();
END;
SET LEftData end=EOF_daily;
	rc = lookupdata.find(KEY: Key1, KEY: Key2);
	
				
RUN;

produces for Key1 and Key2 data in LeftData with a rc value greater than 0 (ie not found in RightData) non-missing leftCol1 and LeftCol2 data. It seems to put the last leftCol1 &2 values in it. This can be easyly taken care of with a IF rc ne 0 THEN Call missing(LeftCol1, LeftCol2). It doesn't help to put a CALL MISSING right after the hash declaration (_N_=1).

 

However this doesn't occur when instead of the _N_=0 trick you use the LENGTH statement (LENGHT LeftCol1 LeftCol2 8) or FORMAT statement.

 

I cannot make sense of it and this is error is not easly found.

 

 

 

Super User
Posts: 6,933

Re: The hash left join using the _N_=0 trick does produce unexpected results.

That's a characteristic of DATA steps in general, not of hashing in particular.

 

Any variable that comes from a SAS data set is automatically retained.  Doesn't matter whether the SAS data set is mentioned in a SET vs. MERGE vs. UPDATE statement ... if the source is a SAS data set, the variable is retained.

 

So when the SET statement appears, all variables from RightData are automatically retained.  The rest of it (when it happens, how to handle) it looks like you have figured out.

Contributor
Posts: 31

Re: The hash left join using the _N_=0 trick does produce unexpected results.

Posted in reply to Astounding
ok. eventhough there is the never true_N_=0 if statement it still initiliazes the variables and then retains it until it gets overwritten. And this behaviour is not "activated" when using the LENGTH statement, because there is no SET.
Ask a Question
Discussion stats
  • 2 replies
  • 92 views
  • 2 likes
  • 2 in conversation