BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
expertyejin
Obsidian | Level 7

Hello SAS community,

 

I came across those strange issues as I was testing mnemonics used to assess the outcome of the MODIFY statement within a DATA step in SAS Studio. Below is my code used to create two identical master datasets and one transaction dataset:

 

data master1 master2;
input a b;
datalines;
0 1
2 3
4 5
6 7
8 9
;
run;

data transaction;
input a b;
datalines;
0 9
1 8
1 7
1 6
2 5
3 4
3 3
3 2
4 1
;
run;

Starting with master data1, I intended to update the values in this dataset with matching records as well as append the 1st non-matching records from the transaction dataset:

 

data master1;
modify master1 transaction;
by a;
if _iorc_=%sysrc(_sok) then replace;
else if _iorc_=%sysrc(_dsenmr) then do;
	put "First non-matching observation found!";
	output;
	_error_=0;
end;
else if _iorc_=%sysrc(_dsemtr) then do;
	put "Subsequent non-matching observations found!";
	_error_=0;
end;
run;

proc print data=master1;
run;

However, here is the message I get from the SAS log when I ran the code above:

 

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 55         
 56         data master1;
 57         modify master1 transaction;
 58         by a;
 59         if _iorc_=%sysrc(_sok) then replace;
 60         else if _iorc_=%sysrc(_dsenmr) then do;
 61         put "First non-matching observation found!";
 62         output;
 63         _error_=0;
 64         end;
 65         else if _iorc_=%sysrc(_dsemtr) then do;
 66         put "Subsequent non-matching observations found!";
 67         _error_=0;
 68         end;
 69         run;
 
 First non-matching observation found!
 First non-matching observation found!
 NOTE: There were 1 observations read from the data set WORK.MASTER1.
 NOTE: The data set WORK.MASTER1 has been updated.  There were 7 observations rewritten, 2 observations added and 0 observations 
       deleted.
 NOTE: There were 9 observations read from the data set WORK.TRANSACTION.
 NOTE: DATA statement used (Total process time):
       real time           0.00 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              1317.43k
       OS Memory           28844.00k
       Timestamp           10/20/2016 11:29:58 PM
       Step Count                        60  Switch Count  82
       Page Faults                       0
       Page Reclaims                     452
       Page Swaps                        0
       Voluntary Context Switches        273
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           280
       
 
 70         
 71         proc print data=master1;
 72         run;
 
 NOTE: There were 7 observations read from the data set WORK.MASTER1.
 NOTE: PROCEDURE PRINT used (Total process time):
       real time           0.02 seconds
       user cpu time       0.03 seconds
       system cpu time     0.00 seconds
       memory              1608.50k
       OS Memory           28836.00k
       Timestamp           10/20/2016 11:29:58 PM
       Step Count                        61  Switch Count  24
       Page Faults                       0
       Page Reclaims                     124
       Page Swaps                        0
       Voluntary Context Switches        42
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           8
       
 
 73         
 74         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 86         

As you can see, only the first PUT statement displays its message to the log, but not the second one. Why is that? Isn't this weird?

 

Moreover, this isn't the only strange thing that happened. In fact, here is the output:

 

Obs 	a 	b
1 	0 	9
2 	2 	5
3 	4 	1
4 	6 	7
5 	8 	9
6 	1 	6
7 	3 	2

 Since I chose to get the first non-matching record from the transaction dataset (i.e. the "_dsenmr" mnemonic code), aren't the last two lines from the output supposed to be "6 1 8" and "7 3 4"??

 

Now, with master data 2, I used the same approach except that I wanted to append subsequent non-matching records instead of the first one:

 

data master2;
modify master2 transaction;
by a;
if _iorc_=%sysrc(_sok) then replace;
else if _iorc_=%sysrc(_dsenmr) then do;
	put "First non-matching observation found!";
	_error_=0;
end;
else if _iorc_=%sysrc(_dsemtr) then do;
	put "Subsequent non-matching observations found!";
	output;
	_error_=0;
end;
run;

proc print data=master2;
run;

After submitting the code above, I got the correct message in the log. However, the output still worries me:

 

Obs 	a 	b
1 	0 	9
2 	2 	5
3 	4 	1
4 	6 	7
5 	8 	9
6 	1 	6
7 	3 	2

This is also strange because the "_dsemtr" mnemonic code implies getting ALL non-matching observations other than the first one, which means that I'm missing rows where "a=1 and b=7" and "a=3 and b=3".

 

I'm really stuck on these issues and hopefully someone could help me out on this.

 

Thanks a bunch!!

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Took me twice posting a wrong answer until I've got it. I believe @Oligolas has indirectly given the correct answer already - I just didn't understand it...

 

So what happens here:

All changes to the master table are "commited" immediately. For non-matching records the first time you issue an OUTPUT the record gets written to the master and then for the next record from the transaction dataset in the same by group there is a match.

 

The mnemonics are not useless as such but they reflect how the modify with by group processing works.

 

If you run your code without an output statement you'll get the values as you did expected them.

data master1;
  modify master1 transaction;
  by a;

  put / _n_=;
  put _iorc_= "_sok: %sysrc(_sok)  _dsenmr: %sysrc(_dsenmr)  _dsemtr: %sysrc(_dsemtr)";
  put _all_;

  if _iorc_=%sysrc(_sok) then
    replace;
  else if _iorc_=%sysrc(_dsenmr) then
    do;
      put "First non-matching observation found!";
/*      output;*/
      _error_=0;
    end;
  else if _iorc_=%sysrc(_dsemtr) then
    do;
      put "Subsequent non-matching observations found!";
      _error_=0;
    end;
run;

_N_=1
_IORC_=0 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=0 b=9 FIRST.a=1 LAST.a=1 _ERROR_=0 _IORC_=0 _N_=1

_N_=2
_IORC_=1230013 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=1 b=8 FIRST.a=1 LAST.a=0 _ERROR_=1 _IORC_=1230013 _N_=2
First non-matching observation found!

_N_=3
_IORC_=1230014 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=1 b=7 FIRST.a=0 LAST.a=0 _ERROR_=1 _IORC_=1230014 _N_=3
Subsequent non-matching observations found!

 

 

For what you want to do, you could use some code as below:

data master1;
  modify master1 transaction;
  by a;

  retain non_match_flg;
  if first.a then non_match_flg=0;

  if _iorc_=%sysrc(_sok) and non_match_flg=0 then
    replace;
  else if _iorc_=%sysrc(_dsenmr) then
    do;
      put "First non-matching observation found!";
      output;
      _error_=0;
      non_match_flg=1;
    end;
  else if non_match_flg=1 then
    do;
      put "Subsequent non-matching observations found!";
      _error_=0;
    end;
run;

First non-matching observation found!
Subsequent non-matching observations found!
Subsequent non-matching observations found!
First non-matching observation found!
Subsequent non-matching observations found!
Subsequent non-matching observations found!

View solution in original post

9 REPLIES 9
Oligolas
Barite | Level 11

Hi expertyejin

 

the output statement behaves differently if used in combination with the modify statement.

 

"If a MODIFY statement is present, OUTPUT with no arguments writes the current observation 
to the end of the data set that is specified in the MODIFY statement."

 

So a=1 is going to be checked, it outputs the row at the end of master1, same for a=3

then it goes through transaction for a match replace:

- for a=1 it replaces b as long as matches are found, first with 8, then with 7 and finally with 6

- same procedure for a=3, first 4, then 3 and finally 2

 

I Hope everything was understandable so far.

Cheers,

Oligolas

________________________

- Cheers -

expertyejin
Obsidian | Level 7

Hi Oligolas,

 

Thanks for the reply. Are you basically suggesting that this is somehow similar to the UNIQUE option in a MODIFY statement, and that those mnemonics are completely useless as it's impossible to yield the output I want? Also, how come the second PUT statement is not showing the message in the SAS log?

 

expertyejin

Patrick
Opal | Level 21

Took me twice posting a wrong answer until I've got it. I believe @Oligolas has indirectly given the correct answer already - I just didn't understand it...

 

So what happens here:

All changes to the master table are "commited" immediately. For non-matching records the first time you issue an OUTPUT the record gets written to the master and then for the next record from the transaction dataset in the same by group there is a match.

 

The mnemonics are not useless as such but they reflect how the modify with by group processing works.

 

If you run your code without an output statement you'll get the values as you did expected them.

data master1;
  modify master1 transaction;
  by a;

  put / _n_=;
  put _iorc_= "_sok: %sysrc(_sok)  _dsenmr: %sysrc(_dsenmr)  _dsemtr: %sysrc(_dsemtr)";
  put _all_;

  if _iorc_=%sysrc(_sok) then
    replace;
  else if _iorc_=%sysrc(_dsenmr) then
    do;
      put "First non-matching observation found!";
/*      output;*/
      _error_=0;
    end;
  else if _iorc_=%sysrc(_dsemtr) then
    do;
      put "Subsequent non-matching observations found!";
      _error_=0;
    end;
run;

_N_=1
_IORC_=0 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=0 b=9 FIRST.a=1 LAST.a=1 _ERROR_=0 _IORC_=0 _N_=1

_N_=2
_IORC_=1230013 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=1 b=8 FIRST.a=1 LAST.a=0 _ERROR_=1 _IORC_=1230013 _N_=2
First non-matching observation found!

_N_=3
_IORC_=1230014 _sok: 0 _dsenmr: 1230013 _dsemtr: 1230014
a=1 b=7 FIRST.a=0 LAST.a=0 _ERROR_=1 _IORC_=1230014 _N_=3
Subsequent non-matching observations found!

 

 

For what you want to do, you could use some code as below:

data master1;
  modify master1 transaction;
  by a;

  retain non_match_flg;
  if first.a then non_match_flg=0;

  if _iorc_=%sysrc(_sok) and non_match_flg=0 then
    replace;
  else if _iorc_=%sysrc(_dsenmr) then
    do;
      put "First non-matching observation found!";
      output;
      _error_=0;
      non_match_flg=1;
    end;
  else if non_match_flg=1 then
    do;
      put "Subsequent non-matching observations found!";
      _error_=0;
    end;
run;

First non-matching observation found!
Subsequent non-matching observations found!
Subsequent non-matching observations found!
First non-matching observation found!
Subsequent non-matching observations found!
Subsequent non-matching observations found!

Oligolas
Barite | Level 11

My understanding is, that since the output works in a different way with the MODIFY statement, we can not expect to get the same results as with the SET statement.

 

The notifications are not appearing precisely because of the output statement in this context. The mnemonic operators are fine, it is output in this context that makes troubles.

 

The SAS help says:

  • If a MODIFY statement is present, OUTPUT with no arguments writes the current observation to the end of the data set that is specified in the MODIFY statement.
  • When Using the MODIFY Statement: When you use the MODIFY statement with the OUTPUT statement, the REMOVE and REPLACE statements override the implicit write action at the end of each DATA step iteration. See Comparisons for more information. If both the OUTPUT statement and a REPLACE or REMOVE statement execute on a given observation, perform the output action last to keep the position of the observation pointer correct.

Looks like we can not get it all done in one run. You are trying to do a kind of full outer join and I doubt if this is possible with MODIFY (maybe contact the SAS support to get their advice)

For my part I prefer doing a full outer join in a first run and program the notifications in a second run.

PROC SQL;
   CREATE TABLE example AS
      SELECT coalesce(t1.a,
                      t2.a) AS a,
             t1.a AS a_master1,
             t2.a AS a_transaction,
             coalesce(t1.b,
                      t2.b) AS b
      FROM master1 t1
      FULL OUTER JOIN TRANSACTION t2 ON t1.a eq t2.a
   ;
QUIT;
data _NULL_;
set example;
if missing(a_master1) then put '...';
...
run;

 

 Cheers,

Oligolas

 

________________________

- Cheers -

Patrick
Opal | Level 21

@Oligolas

The issue is not caused by the record being written at the end of the file but that this write process happens immediately. For the next row read from the transaction data set the row from the previous record already exists in the master and therefore the logic branches to the MATCH case (when there are multiple records per by group in the transaction data set).

 

What the OP wants to achieve can be done and I've posted the code version for this already.

expertyejin
Obsidian | Level 7

@Patrick @Oligolas Wow you guys are wonderful! Thank you so much for your precious help, I am truly grateful!

 

Patrick, do you think it is possible to get the desired result for dataset master2 using the same approach? I've tried many times but always failed.

 

Regards,

 

expertyejin

Patrick
Opal | Level 21

The same approach should work. You just need to get the logic right.

data master2;
  modify master2 transaction;
  by a;

  retain non_match_flg;
  if first.a then non_match_flg=0;

  if _iorc_=%sysrc(_sok) and non_match_flg=0 then
    replace;
  else if _iorc_=%sysrc(_dsenmr) and non_match_flg=0 then
    do;
      put "First non-matching observation found!";
      _error_=0;
       non_match_flg=1;
    end;
  else if non_match_flg=1 then
    do;
      put "Subsequent non-matching observations found!";
      output;
      _error_=0;
    end;
run;
expertyejin
Obsidian | Level 7

Thank you once again Patrick!Smiley Happy I guess that in the end only the _dsenmr mnemonic operator and not _dsemtr needs to be used for this purpose, right? I'm saying this because this is a variation of your code where _dsemtr shows up first, but it does not work:

 

data master2;
  modify master2 transaction;
  by a;

  retain non_match_flg;
  if first.a=0 then non_match_flg=0;
  
  if _iorc_=%sysrc(_sok) then replace;
  else if _iorc_=%sysrc(_dsemtr) and non_match_flg=0 then
    do;
      put "Subsequent non-matching observation found!";
      output;
      _error_=0;
    end;
  else if _iorc_=%sysrc(_dsenmr) then
    do;
      non_match_flg=1;
      put "First non-matching observations found!";
      _error_=0;
    end;
run;
Patrick
Opal | Level 21

Just write the mnemonic values to the log and it gets very easy to define the logic without any guess work.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1926 views
  • 2 likes
  • 3 in conversation