use of index

I got an unexpected result with this code:

 

 

data BHS;
   input class1 class2:$50. class3:$40. value;
   datalines;
1 xxx name1 0
1 xxx name2 0
;
run;

data WTG;
   input class1 class2:$50. class3:$40. value;
   datalines;
1 xxx name1 1
1 xxx name2 2
;
run;

proc datasets library = work nolist;
   modify BHS;
   index create ts_di2 = (class1 class2 class3) / nomiss;
quit;

		
data BHS;
   matched_flag='N';
   set WTG (rename=(value=value_mese class3 = class3_mese));
   do until (_iorc_=%sysrc(_dsenom));
      modify BHS key=ts_di2;
      select(_iorc_);
         when(%sysrc(_sok)) do;
            value = value_mese;
            class3 = class3_mese;
            matched_flag='Y';
            put "match"  _iorc_= _n_=;
            _error_=0;
            replace;
         end;
         when(%sysrc(_dsenom)) do;
            value = value_mese;
            class3 = class3_mese;
            _error_ = 0;
            if matched_flag='N' then do;
               put "no_match"  _iorc_= _n_=;
               output;
            end;
         end;
         otherwise do;
            put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
            put 'Program terminating. DATA step iteration # ' _n_;
            put _all_;
            stop;
         end;
      end;
   end;
run;

The result was:

 

"There were 3 observations rewritten, 1 observations added and 0 observations deleted."

Instead, I expected 2 observations rewritten and 0 observations added, since there are no duplicates. I do not understand why the first iteration of the loop enters the "no match" branch, since all records match.

 

I expected this outcome for the table BHS:

data BHS;
input class1 class2:$50. class3:$40. value;
datalines;
1 xxx name1 1
1 xxx name2 2
run;


Re: use of index

Hello @mario_pellegrini,

 

The data step debugger can be used to investigate the long data step:

  1. Add the DEBUG option
    data BHS / debug;
    matched_flag='N';
    ...
  2. Run the data step.
  3. Resize the windows so that you can see at least the SAS log in addition to the new windows "Debugger Log" and "Debugger Source."
  4. Enter the following command into the command line at the bottom of the Debugger Log:
    enter ex _all_; st
    This assigns the command "examine all variables and execute the next statement" to the ENTER (Return) key.
  5. Press the Return (or ENTER) key repeatedly and observe how the variable values change as the debugger goes through the data step.
  6. To close the debugger, enter q into the debugger's command line.

 

When the MODIFY statement of the data step is executed for the first time, the debugger log shows class1 = 1, class2 = xxx and a missing class3. This is logical: nothing has been retrieved from dataset BHS yet, and because of the rename option "class3 = class3_mese", the SET statement for WTG does not populate CLASS3 either. There is no observation in BHS with class1 = 1, class2 = 'xxx' and class3 = ' '. I think this is why no match is found and the DO-END block of the second WHEN statement is executed next. That block, in turn, changes the value of CLASS3, which leads to a match at the next execution of the MODIFY statement ...
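
The missing key value can also be checked without the debugger. The following is only a minimal sketch, assuming the WTG dataset from the question. The LENGTH statement is an addition for this standalone step so that CLASS3 exists as a character variable; in the original step it is the MODIFY statement that defines it.

```sas
/* Sketch: print the key variables as they are right after the SET statement. */
data _null_;
   length class3 $40;   /* assumption: stands in for the CLASS3 that MODIFY BHS would define */
   set WTG (rename=(value=value_mese class3=class3_mese));
   put _n_= class1= class2= class3= class3_mese=;
run;
```

In the log, CLASS3 should appear blank on every line, while CLASS3_MESE carries the values read from WTG, so the lookup key is incomplete at that point.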

 

Maybe this (and the rest of the debugger log) sheds light on the inner workings of this somewhat convoluted data step.
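
If the intent is simply to update VALUE wherever the full key (class1, class2, class3) matches, one possible simplification is to leave CLASS3 un-renamed so that the key is fully populated before MODIFY runs. This is a sketch under that assumption, not necessarily what the original step was meant to do. It presumes BHS has been recreated from the question's first data step and that the index ts_di2 exists on it.

```sas
/* Sketch: keyed update with the complete key taken directly from WTG. */
data BHS;
   set WTG (rename=(value=value_mese));   /* class3 keeps its name, so all three key variables are set */
   modify BHS key=ts_di2;
   select (_iorc_);
      when (%sysrc(_sok)) do;             /* key found: update the matching observation in place */
         value = value_mese;
         replace;
      end;
      when (%sysrc(_dsenom)) do;          /* key not found: append the transaction as a new observation */
         value = value_mese;
         _error_ = 0;
         output;
      end;
      otherwise do;
         put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
         put _all_;
         stop;
      end;
   end;
run;
```

With the sample data from the question, both WTG rows should find their key in BHS, so this step would be expected to rewrite two observations and add none.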

 
