BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

 

I would like to modify a dataset (master) with the data of another dataset (transaction) and, if the key matches go in replacement, otherwise append in the new data. The result, unexpected is that for cod_cli = 2 not only replaces data (correctly) but appends them also under:

 

data master (index=(cod_cli));
input cod_cli Quantity flag_iorc ;
datalines;
1 8 .
2 10 .
2 12 .
2 13 .
3 21 .
4 13 .
run;

data transaction;
input cod_cli   AddQuantity;
datalines;
2 7
5 12
run;

data master;  
set transaction;
do until (_iorc_=%sysrc(_dsenom)); /*ciclo*/
	modify master key=cod_cli;  
	select(_iorc_); 
	      when(%sysrc(_sok)) do;
		     Quantity = Quantity + AddQuantity;
			 flag_iorc = _iorc_;
	         replace;
	      end;
	      when(%sysrc(_dsenom)) do;
		      Quantity = AddQuantity;
	         output;
			 _error_=0;
	      end;
	      otherwise do;
	         put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
	         put 'Program terminating. DATA step iteration # ' _n_;
	         put _all_;
	         stop;
	      end;
	   end;
end;
run;

from the log results:

 

There were 3 observations rewritten, 2

      observations added and 0 observations deleted.

 

But I would expect 1 observations added..

 

thanks for help

1 ACCEPTED SOLUTION

Accepted Solutions
jklaverstijn
Rhodochrosite | Level 12

Trap for young players. Here is what I think is the issue. The code does not distinguish between "no keyes at all" and "no more keys" in the master table.

 

For key cod_cli=2 the master has three rows. The DO loop cycles through the master looking for potentially multiple occurences of the key. There are three rows in the master for that value. But the loop continues until a fourth lookup yield a DSENOM. This indicates there are no more rows with that key value. Your logic however has a case for that as well: it outputs a new row. Hence the extra row for cod_cli=2.

 

To further illustrate the issue add a row with cod_cli=3 in the transaction set. You will get an additional new row for that key as well.

 

I would set a flag if a key exists in the master and use that to prevent outputting new rows. See code below. But as usual there are probably many alternative approaches.

 

data master;
set transaction;
matched_flag='N';
do until (_iorc_=%sysrc(_dsenom));
        modify master key=cod_cli;
        select(_iorc_);
              when(%sysrc(_sok)) do;
                     Quantity = Quantity + AddQuantity;
                     matched_flag='Y';
                      _error_=0;
                     replace;
              end;
              when(%sysrc(_dsenom)) do;
                         _error_=0;
                         if matched_flag='N' then output;
              end;
              otherwise do;
                 put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
                 put 'Program terminating. DATA step iteration # ' _n_;                 
                 stop;
              end;
           end;
end;
run;

 

Regards,

- Jan.

 

View solution in original post

6 REPLIES 6
ballardw
Super User

You should provide what you expect the result of your modification to be.

mariopellegrini
Pyrite | Level 9

I aspected this result:


Immagine.jpg
mariopellegrini
Pyrite | Level 9

I want to obtain with one data step the end result of these two data step (3 observations rewritten,  1 observations added)

 

data master (index=(cod_cli));
input cod_cli Quantity;
datalines;
1 8
2 10
2 12
2 13
3 21
4 13
run;

data transaction;
input cod_cli   AddQuantity;
datalines;
2 7
5 12
run;

data master;  
set transaction;
do until (_iorc_=%sysrc(_dsenom));  
	modify master key=cod_cli;  
	select(_iorc_); 
	      when(%sysrc(_sok)) do;
		     Quantity = Quantity + AddQuantity;
			 _error_=0;
	         replace;
	      end;
		  when(%sysrc(_dsenom)) do;
			 _error_=0;
	      end;
	      otherwise do;
	         put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
	         put 'Program terminating. DATA step iteration # ' _n_;
	         put _all_;
	         stop;
	      end;
	   end;
end;
run;
data master; set transaction; modify master key=cod_cli; select(_iorc_); when(%sysrc(_sok)) do; end; when(%sysrc(_dsenom)) do; Quantity = AddQuantity; output; _error_=0; end; otherwise do; put 'ERROR: Unexpected value for _IORC_= ' _iorc_; put 'Program terminating. DATA step iteration # ' _n_; put _all_; stop; end; end; run;
jklaverstijn
Rhodochrosite | Level 12

Trap for young players. Here is what I think is the issue. The code does not distinguish between "no keyes at all" and "no more keys" in the master table.

 

For key cod_cli=2 the master has three rows. The DO loop cycles through the master looking for potentially multiple occurences of the key. There are three rows in the master for that value. But the loop continues until a fourth lookup yield a DSENOM. This indicates there are no more rows with that key value. Your logic however has a case for that as well: it outputs a new row. Hence the extra row for cod_cli=2.

 

To further illustrate the issue add a row with cod_cli=3 in the transaction set. You will get an additional new row for that key as well.

 

I would set a flag if a key exists in the master and use that to prevent outputting new rows. See code below. But as usual there are probably many alternative approaches.

 

data master;
set transaction;
matched_flag='N';
do until (_iorc_=%sysrc(_dsenom));
        modify master key=cod_cli;
        select(_iorc_);
              when(%sysrc(_sok)) do;
                     Quantity = Quantity + AddQuantity;
                     matched_flag='Y';
                      _error_=0;
                     replace;
              end;
              when(%sysrc(_dsenom)) do;
                         _error_=0;
                         if matched_flag='N' then output;
              end;
              otherwise do;
                 put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
                 put 'Program terminating. DATA step iteration # ' _n_;                 
                 stop;
              end;
           end;
end;
run;

 

Regards,

- Jan.

 

jklaverstijn
Rhodochrosite | Level 12

Come to think of it: if you have multiple keys in the master data set, how is one to tell that an existing key in the transaction dataset represents an update and not a new value? It occurred to me that this consideration is at the root of the issue. Only if the purpose of the transaction set is not to add duplicate key values my solution applies.

 

In real life I would expect an additional column in the data to make the key unique. I have learned that every table should have a unique key or the data model is flawed. I'm just saying.

mariopellegrini
Pyrite | Level 9

I got an unexpected result with this code:

 

 

data BHS;
input class1	class2:$50. class3:$40. value;
datalines;
1 xxx name1 0
1 xxx name2 0
run;
	
data WTG;
input class1	class2:$50.	class3:$40. value;
datalines;	
1 xxx name1 1
1 xxx name2 2
run;

proc datasets library = work nolist;
	 modify BHS;
	 index create ts_di2 = ( class1 class2 class3) / nomiss;
quit;

		
			data BHS;  
			matched_flag='N';
			set WTG (rename=(value=value_mese class3 = class3_mese));
			do until (_iorc_=%sysrc(_dsenom)); 
			modify BHS key=ts_di2;  
			select(_iorc_); 
				  when(%sysrc(_sok)) do;
					 value = value_mese;
					 class3 = class3_mese;
					 matched_flag='Y'; put "match"  _iorc_= _n_=;
					 _error_=0;
					 replace;
				  end;
				  when(%sysrc(_dsenom)) do;
					 value = value_mese;
					 class3 = class3_mese;
					 _error_ = 0;  
					 if matched_flag='N' then do;  
                         put "no_match"  _iorc_= _n_=;
                         output; 
                     end;
				  end;
				  otherwise do;
					 put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
					 put 'Program terminating. DATA step iteration # ' _n_;
					 put _all_;
					 stop;
				  end;
			   end;
			end;
			run;			

The result was:

 

"There were 3 observations rewritten, 1
observations added and 0 observations deleted."

Instead I expected  2 observations rewritten and 0 observations added, as there are no duplicates. I do not understand why in the first lap of the cycle enters in the "no match", as all match records

 

I expected this outcome for the table BHS:

data BHS;;
input class1 class2:$50. class3:$40. value;
datalines;
1 xxx name1 1
1 xxx name2 2
run;

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1503 views
  • 1 like
  • 3 in conversation