I would like to modify a dataset (master) with the data of another dataset (transaction) and, if the key matches go in replacement, otherwise append in the new data. The result, unexpected is that for cod_cli = 2 not only replaces data (correctly) but appends them also under:
data master (index=(cod_cli));
input cod_cli Quantity flag_iorc ;
datalines;
1 8 .
2 10 .
2 12 .
2 13 .
3 21 .
4 13 .
run;
data transaction;
input cod_cli AddQuantity;
datalines;
2 7
5 12
run;
data master;
set transaction;
do until (_iorc_=%sysrc(_dsenom)); /*ciclo*/
modify master key=cod_cli;
select(_iorc_);
when(%sysrc(_sok)) do;
Quantity = Quantity + AddQuantity;
flag_iorc = _iorc_;
replace;
end;
when(%sysrc(_dsenom)) do;
Quantity = AddQuantity;
output;
_error_=0;
end;
otherwise do;
put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
put 'Program terminating. DATA step iteration # ' _n_;
put _all_;
stop;
end;
end;
end;
run;
from the log results:
There were 3 observations rewritten, 2
observations added and 0 observations deleted.
But I would expect 1 observations added..
thanks for help
Trap for young players. Here is what I think is the issue. The code does not distinguish between "no keyes at all" and "no more keys" in the master table.
For key cod_cli=2 the master has three rows. The DO loop cycles through the master looking for potentially multiple occurences of the key. There are three rows in the master for that value. But the loop continues until a fourth lookup yield a DSENOM. This indicates there are no more rows with that key value. Your logic however has a case for that as well: it outputs a new row. Hence the extra row for cod_cli=2.
To further illustrate the issue add a row with cod_cli=3 in the transaction set. You will get an additional new row for that key as well.
I would set a flag if a key exists in the master and use that to prevent outputting new rows. See code below. But as usual there are probably many alternative approaches.
data master;
set transaction;
matched_flag='N';
do until (_iorc_=%sysrc(_dsenom));
modify master key=cod_cli;
select(_iorc_);
when(%sysrc(_sok)) do;
Quantity = Quantity + AddQuantity;
matched_flag='Y';
_error_=0;
replace;
end;
when(%sysrc(_dsenom)) do;
_error_=0;
if matched_flag='N' then output;
end;
otherwise do;
put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
put 'Program terminating. DATA step iteration # ' _n_;
stop;
end;
end;
end;
run;
Regards,
- Jan.
You should provide what you expect the result of your modification to be.
I aspected this result:
I want to obtain with one data step the end result of these two data step (3 observations rewritten, 1 observations added)
data master (index=(cod_cli));
input cod_cli Quantity;
datalines;
1 8
2 10
2 12
2 13
3 21
4 13
run;
data transaction;
input cod_cli AddQuantity;
datalines;
2 7
5 12
run;
data master;
set transaction;
do until (_iorc_=%sysrc(_dsenom));
modify master key=cod_cli;
select(_iorc_);
when(%sysrc(_sok)) do;
Quantity = Quantity + AddQuantity;
_error_=0;
replace;
end;
when(%sysrc(_dsenom)) do;
_error_=0;
end;
otherwise do;
put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
put 'Program terminating. DATA step iteration # ' _n_;
put _all_;
stop;
end;
end;
end;
run;
data master;
set transaction;
modify master key=cod_cli;
select(_iorc_);
when(%sysrc(_sok)) do;
end;
when(%sysrc(_dsenom)) do;
Quantity = AddQuantity;
output;
_error_=0;
end;
otherwise do;
put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
put 'Program terminating. DATA step iteration # ' _n_;
put _all_;
stop;
end;
end;
run;
Trap for young players. Here is what I think is the issue. The code does not distinguish between "no keyes at all" and "no more keys" in the master table.
For key cod_cli=2 the master has three rows. The DO loop cycles through the master looking for potentially multiple occurences of the key. There are three rows in the master for that value. But the loop continues until a fourth lookup yield a DSENOM. This indicates there are no more rows with that key value. Your logic however has a case for that as well: it outputs a new row. Hence the extra row for cod_cli=2.
To further illustrate the issue add a row with cod_cli=3 in the transaction set. You will get an additional new row for that key as well.
I would set a flag if a key exists in the master and use that to prevent outputting new rows. See code below. But as usual there are probably many alternative approaches.
data master;
set transaction;
matched_flag='N';
do until (_iorc_=%sysrc(_dsenom));
modify master key=cod_cli;
select(_iorc_);
when(%sysrc(_sok)) do;
Quantity = Quantity + AddQuantity;
matched_flag='Y';
_error_=0;
replace;
end;
when(%sysrc(_dsenom)) do;
_error_=0;
if matched_flag='N' then output;
end;
otherwise do;
put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
put 'Program terminating. DATA step iteration # ' _n_;
stop;
end;
end;
end;
run;
Regards,
- Jan.
Come to think of it: if you have multiple keys in the master data set, how is one to tell that an existing key in the transaction dataset represents an update and not a new value? It occurred to me that this consideration is at the root of the issue. Only if the purpose of the transaction set is not to add duplicate key values my solution applies.
In real life I would expect an additional column in the data to make the key unique. I have learned that every table should have a unique key or the data model is flawed. I'm just saying.
I got an unexpected result with this code:
data BHS;
input class1 class2:$50. class3:$40. value;
datalines;
1 xxx name1 0
1 xxx name2 0
run;
data WTG;
input class1 class2:$50. class3:$40. value;
datalines;
1 xxx name1 1
1 xxx name2 2
run;
proc datasets library = work nolist;
modify BHS;
index create ts_di2 = ( class1 class2 class3) / nomiss;
quit;
data BHS;
matched_flag='N';
set WTG (rename=(value=value_mese class3 = class3_mese));
do until (_iorc_=%sysrc(_dsenom));
modify BHS key=ts_di2;
select(_iorc_);
when(%sysrc(_sok)) do;
value = value_mese;
class3 = class3_mese;
matched_flag='Y'; put "match" _iorc_= _n_=;
_error_=0;
replace;
end;
when(%sysrc(_dsenom)) do;
value = value_mese;
class3 = class3_mese;
_error_ = 0;
if matched_flag='N' then do;
put "no_match" _iorc_= _n_=;
output;
end;
end;
otherwise do;
put 'ERROR: Unexpected value for _IORC_= ' _iorc_;
put 'Program terminating. DATA step iteration # ' _n_;
put _all_;
stop;
end;
end;
end;
run;
The result was:
"There were 3 observations rewritten, 1
observations added and 0 observations deleted."
Instead I expected 2 observations rewritten and 0 observations added, as there are no duplicates. I do not understand why in the first lap of the cycle enters in the "no match", as all match records
I expected this outcome for the table BHS:
data BHS;;
input class1 class2:$50. class3:$40. value;
datalines;
1 xxx name1 1
1 xxx name2 2
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.