Hello SAS community,
I came across those strange issues as I was testing mnemonics used to assess the outcome of the MODIFY statement within a DATA step in SAS Studio. Below is my code used to create two identical master datasets and one transaction dataset:
data master1 master2;
input a b;
datalines;
0 1
2 3
4 5
6 7
8 9
;
run;
data transaction;
input a b;
datalines;
0 9
1 8
1 7
1 6
2 5
3 4
3 3
3 2
4 1
;
run;
Starting with master data1, I intended to update the values in this dataset with matching records as well as append the 1st non-matching records from the transaction dataset:
data master1;
modify master1 transaction;
by a;
if _iorc_=%sysrc(_sok) then replace;
else if _iorc_=%sysrc(_dsenmr) then do;
put "First non-matching observation found!";
output;
_error_=0;
end;
else if _iorc_=%sysrc(_dsemtr) then do;
put "Subsequent non-matching observations found!";
_error_=0;
end;
run;
proc print data=master1;
run;
However, here is the message I get from the SAS log when I ran the code above:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 55 56 data master1; 57 modify master1 transaction; 58 by a; 59 if _iorc_=%sysrc(_sok) then replace; 60 else if _iorc_=%sysrc(_dsenmr) then do; 61 put "First non-matching observation found!"; 62 output; 63 _error_=0; 64 end; 65 else if _iorc_=%sysrc(_dsemtr) then do; 66 put "Subsequent non-matching observations found!"; 67 _error_=0; 68 end; 69 run; First non-matching observation found! First non-matching observation found! NOTE: There were 1 observations read from the data set WORK.MASTER1. NOTE: The data set WORK.MASTER1 has been updated. There were 7 observations rewritten, 2 observations added and 0 observations deleted. NOTE: There were 9 observations read from the data set WORK.TRANSACTION. NOTE: DATA statement used (Total process time): real time 0.00 seconds user cpu time 0.01 seconds system cpu time 0.00 seconds memory 1317.43k OS Memory 28844.00k Timestamp 10/20/2016 11:29:58 PM Step Count 60 Switch Count 82 Page Faults 0 Page Reclaims 452 Page Swaps 0 Voluntary Context Switches 273 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 280 70 71 proc print data=master1; 72 run; NOTE: There were 7 observations read from the data set WORK.MASTER1. NOTE: PROCEDURE PRINT used (Total process time): real time 0.02 seconds user cpu time 0.03 seconds system cpu time 0.00 seconds memory 1608.50k OS Memory 28836.00k Timestamp 10/20/2016 11:29:58 PM Step Count 61 Switch Count 24 Page Faults 0 Page Reclaims 124 Page Swaps 0 Voluntary Context Switches 42 Involuntary Context Switches 0 Block Input Operations 0 Block Output Operations 8 73 74 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 86
As you can see, only the first PUT statement displays its message to the log, but not the second one. Why is that? Isn't this weird?
Moreover, this isn't the only strange thing that happened. In fact, here is the output:
Obs a b 1 0 9 2 2 5 3 4 1 4 6 7 5 8 9 6 1 6 7 3 2
Since I chose to get the first non-matching record from the transaction dataset (i.e. the "_dsenmr" mnemonic code), aren't the last two lines from the output supposed to be "6 1 8" and "7 3 4"??
Now, with master data 2, I used the same approach except that I wanted to append subsequent non-matching records instead of the first one:
data master2;
modify master2 transaction;
by a;
if _iorc_=%sysrc(_sok) then replace;
else if _iorc_=%sysrc(_dsenmr) then do;
put "First non-matching observation found!";
_error_=0;
end;
else if _iorc_=%sysrc(_dsemtr) then do;
put "Subsequent non-matching observations found!";
output;
_error_=0;
end;
run;
proc print data=master2;
run;
After submitting the code above, I got the correct message in the log. However, the output still worries me:
Obs a b 1 0 9 2 2 5 3 4 1 4 6 7 5 8 9 6 1 6 7 3 2
This is also strange because the "_dsemtr" mnemonic code implies getting ALL non-matching observations other than the first one, which means that I'm missing rows where "a=1 and b=7" and "a=3 and b=3".
I'm really stuck on these issues and hopefully someone could help me out on this.
Thanks a bunch!!
Took me twice posting a wrong answer until I've got it. I believe @Oligolas has indirectly given the correct answer already - I just didn't understand it...
So what happens here:
All changes to the master table are "commited" immediately. For non-matching records the first time you issue an OUTPUT the record gets written to the master and then for the next record from the transaction dataset in the same by group there is a match.
The mnemonics are not useless as such but they reflect how the modify with by group processing works.
If you run your code without an output statement you'll get the values as you did expected them.
data master1;
modify master1 transaction;
by a;
put / _n_=;
put _iorc_= "_sok: %sysrc(_sok) _dsenmr: %sysrc(_dsenmr) _dsemtr: %sysrc(_dsemtr)";
put _all_;
if _iorc_=%sysrc(_sok) then
replace;
else if _iorc_=%sysrc(_dsenmr) then
do;
put "First non-matching observation found!";
/* output;*/
_error_=0;
end;
else if _iorc_=%sysrc(_dsemtr) then
do;
put "Subsequent non-matching observations found!";
_error_=0;
end;
run;
_N_=1
_IORC_=0 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=0 b=9 FIRST.a=1 LAST.a=1 _ERROR_=0 _IORC_=0 _N_=1
_N_=2
_IORC_=1230013 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=1 b=8 FIRST.a=1 LAST.a=0 _ERROR_=1 _IORC_=1230013 _N_=2
First non-matching observation found!
_N_=3
_IORC_=1230014 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=1 b=7 FIRST.a=0 LAST.a=0 _ERROR_=1 _IORC_=1230014 _N_=3
Subsequent non-matching observations found!
For what you want to do, you could use some code as below:
data master1;
modify master1 transaction;
by a;
retain non_match_flg;
if first.a then non_match_flg=0;
if _iorc_=%sysrc(_sok) and non_match_flg=0 then
replace;
else if _iorc_=%sysrc(_dsenmr) then
do;
put "First non-matching observation found!";
output;
_error_=0;
non_match_flg=1;
end;
else if non_match_flg=1 then
do;
put "Subsequent non-matching observations found!";
_error_=0;
end;
run;
First non-matching observation found!
Subsequent non-matching observations found!
Subsequent non-matching observations found!
First non-matching observation found!
Subsequent non-matching observations found!
Subsequent non-matching observations found!
Hi expertyejin
the output statement behaves differently if used in combination with the modify statement.
"If a MODIFY statement is present, OUTPUT with no arguments writes the current observation
to the end of the data set that is specified in the MODIFY statement."
So a=1 is going to be checked, it outputs the row at the end of master1, same for a=3
then it goes through transaction for a match replace:
- for a=1 it replaces b as long as matches are found, first with 8, then with 7 and finally with 6
- same procedure for a=3, first 4, then 3 and finally 2
I Hope everything was understandable so far.
Cheers,
Oligolas
- Cheers -
Hi Oligolas,
Thanks for the reply. Are you basically suggesting that this is somehow similar to the UNIQUE option in a MODIFY statement, and that those mnemonics are completely useless as it's impossible to yield the output I want? Also, how come the second PUT statement is not showing the message in the SAS log?
expertyejin
Took me twice posting a wrong answer until I've got it. I believe @Oligolas has indirectly given the correct answer already - I just didn't understand it...
So what happens here:
All changes to the master table are "commited" immediately. For non-matching records the first time you issue an OUTPUT the record gets written to the master and then for the next record from the transaction dataset in the same by group there is a match.
The mnemonics are not useless as such but they reflect how the modify with by group processing works.
If you run your code without an output statement you'll get the values as you did expected them.
data master1;
modify master1 transaction;
by a;
put / _n_=;
put _iorc_= "_sok: %sysrc(_sok) _dsenmr: %sysrc(_dsenmr) _dsemtr: %sysrc(_dsemtr)";
put _all_;
if _iorc_=%sysrc(_sok) then
replace;
else if _iorc_=%sysrc(_dsenmr) then
do;
put "First non-matching observation found!";
/* output;*/
_error_=0;
end;
else if _iorc_=%sysrc(_dsemtr) then
do;
put "Subsequent non-matching observations found!";
_error_=0;
end;
run;
_N_=1
_IORC_=0 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=0 b=9 FIRST.a=1 LAST.a=1 _ERROR_=0 _IORC_=0 _N_=1
_N_=2
_IORC_=1230013 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=1 b=8 FIRST.a=1 LAST.a=0 _ERROR_=1 _IORC_=1230013 _N_=2
First non-matching observation found!
_N_=3
_IORC_=1230014 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=1 b=7 FIRST.a=0 LAST.a=0 _ERROR_=1 _IORC_=1230014 _N_=3
Subsequent non-matching observations found!
For what you want to do, you could use some code as below:
data master1;
modify master1 transaction;
by a;
retain non_match_flg;
if first.a then non_match_flg=0;
if _iorc_=%sysrc(_sok) and non_match_flg=0 then
replace;
else if _iorc_=%sysrc(_dsenmr) then
do;
put "First non-matching observation found!";
output;
_error_=0;
non_match_flg=1;
end;
else if non_match_flg=1 then
do;
put "Subsequent non-matching observations found!";
_error_=0;
end;
run;
First non-matching observation found!
Subsequent non-matching observations found!
Subsequent non-matching observations found!
First non-matching observation found!
Subsequent non-matching observations found!
Subsequent non-matching observations found!
My understanding is, that since the output works in a different way with the MODIFY statement, we can not expect to get the same results as with the SET statement.
The notifications are not appearing precisely because of the output statement in this context. The mnemonic operators are fine, it is output in this context that makes troubles.
The SAS help says:
Looks like we can not get it all done in one run. You are trying to do a kind of full outer join and I doubt if this is possible with MODIFY (maybe contact the SAS support to get their advice)
For my part I prefer doing a full outer join in a first run and program the notifications in a second run.
PROC SQL;
CREATE TABLE example AS
SELECT coalesce(t1.a,
t2.a) AS a,
t1.a AS a_master1,
t2.a AS a_transaction,
coalesce(t1.b,
t2.b) AS b
FROM master1 t1
FULL OUTER JOIN TRANSACTION t2 ON t1.a eq t2.a
;
QUIT;
data _NULL_;
set example;
if missing(a_master1) then put '...';
...
run;
Cheers,
Oligolas
- Cheers -
The issue is not caused by the record being written at the end of the file but that this write process happens immediately. For the next row read from the transaction data set the row from the previous record already exists in the master and therefore the logic branches to the MATCH case (when there are multiple records per by group in the transaction data set).
What the OP wants to achieve can be done and I've posted the code version for this already.
The same approach should work. You just need to get the logic right.
data master2;
modify master2 transaction;
by a;
retain non_match_flg;
if first.a then non_match_flg=0;
if _iorc_=%sysrc(_sok) and non_match_flg=0 then
replace;
else if _iorc_=%sysrc(_dsenmr) and non_match_flg=0 then
do;
put "First non-matching observation found!";
_error_=0;
non_match_flg=1;
end;
else if non_match_flg=1 then
do;
put "Subsequent non-matching observations found!";
output;
_error_=0;
end;
run;
Thank you once again Patrick! I guess that in the end only the _dsenmr mnemonic operator and not _dsemtr needs to be used for this purpose, right? I'm saying this because this is a variation of your code where _dsemtr shows up first, but it does not work:
data master2;
modify master2 transaction;
by a;
retain non_match_flg;
if first.a=0 then non_match_flg=0;
if _iorc_=%sysrc(_sok) then replace;
else if _iorc_=%sysrc(_dsemtr) and non_match_flg=0 then
do;
put "Subsequent non-matching observation found!";
output;
_error_=0;
end;
else if _iorc_=%sysrc(_dsenmr) then
do;
non_match_flg=1;
put "First non-matching observations found!";
_error_=0;
end;
run;
Just write the mnemonic values to the log and it gets very easy to define the logic without any guess work.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.