BookmarkSubscribeRSS Feed
niemannk
Calcite | Level 5

The behavior of the hash left join using the  _N_=0 trick does produce unexpected results.

 

data hleftjoin;
  IF _N_=0 THEN SET RightData;
 IF _N_=1 THEN DO;
  declare hash lookupdata;
                lookupdata = _new_ hash(dataset:"RightData", ordered:"yes");
		lookupdata.defineKey('Key1','Key2');
		lookupdata.defineData('leftCol1','leftCol2');
		lookupdata.defineDone();
END;
SET LEftData end=EOF_daily;
	rc = lookupdata.find(KEY: Key1, KEY: Key2);
	
				
RUN;

produces for Key1 and Key2 data in LeftData with a rc value greater than 0 (ie not found in RightData) non-missing leftCol1 and LeftCol2 data. It seems to put the last leftCol1 &2 values in it. This can be easyly taken care of with a IF rc ne 0 THEN Call missing(LeftCol1, LeftCol2). It doesn't help to put a CALL MISSING right after the hash declaration (_N_=1).

 

However this doesn't occur when instead of the _N_=0 trick you use the LENGTH statement (LENGHT LeftCol1 LeftCol2 😎 or FORMAT statement.

 

I cannot make sense of it and this is error is not easly found.

 

 

 

2 REPLIES 2
Astounding
PROC Star

That's a characteristic of DATA steps in general, not of hashing in particular.

 

Any variable that comes from a SAS data set is automatically retained.  Doesn't matter whether the SAS data set is mentioned in a SET vs. MERGE vs. UPDATE statement ... if the source is a SAS data set, the variable is retained.

 

So when the SET statement appears, all variables from RightData are automatically retained.  The rest of it (when it happens, how to handle) it looks like you have figured out.

niemannk
Calcite | Level 5
ok. eventhough there is the never true_N_=0 if statement it still initiliazes the variables and then retains it until it gets overwritten. And this behaviour is not "activated" when using the LENGTH statement, because there is no SET.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 657 views
  • 2 likes
  • 2 in conversation