<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Set statement with tacit retain? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Set-statement-with-tacit-retain/m-p/333499#M75145</link>
    <description>&lt;P&gt;Michael,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your next-to-last paragraph sums it up very nicely.&amp;nbsp; In a nutshell ...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any variable that comes from a SAS data set is automatically retained.&amp;nbsp; It doesn't matter whether the data set is brought in with SET, MERGE, or UPDATE.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When should the software re-set a retained variable automatically?&amp;nbsp; In the case of a SET statement that references multiple data sets, the re-sets occur only on the first observation being read in from each data set.&amp;nbsp; (If a BY statement is being used, it gets a bit more complex.)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Confirmed!&lt;/P&gt;</description>
    <pubDate>Thu, 16 Feb 2017 17:47:12 GMT</pubDate>
    <dc:creator>Astounding</dc:creator>
    <dc:date>2017-02-16T17:47:12Z</dc:date>
    <item>
      <title>Set statement with tacit retain?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Set-statement-with-tacit-retain/m-p/333483#M75139</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I don't think I'm understanding the set statement correctly when dealing with multiple set datasets. &amp;nbsp;The issue seems to appear when a column that only exists in one of the datasets is reassigned a value. &amp;nbsp;Here is what seems like a simple example.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data a;
	infile datalines truncOver;
	input v;
	datalines;
1
2
 
 
 
3
 
4
5
;
run;

data b;
	input w;
	wCopy = w;
	datalines;
1
2
3
4
5
;
run;

data AB;
	set a b;
	w2 = w;

	w = v;
run;
&lt;/PRE&gt;
&lt;P&gt;Variable v comes from dataset A only, and variables w and wCopy are identical and come from dataset B. &amp;nbsp;In the resultant dataset AB, column w seems as expected, because we assigned w = v. &amp;nbsp;Thus, w and v are identical now with the same values that came from v. &amp;nbsp;The variable wCopy shows what w used to look like and is just included for reference.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It's variable w2 that seems quite odd. &amp;nbsp;My expectation is that w2 would also be identical to wCopy, because we are setting w2 to the value of w&amp;nbsp;&lt;EM&gt;before&lt;/EM&gt; we are changing the value of w to equal v. &amp;nbsp;However, this is only the case for the records coming from dataset B, where w was already defined. &amp;nbsp;For the records in dataset A, where w did not exist, SAS seems to be retaining values of w from one record to the next. &amp;nbsp;For the first record, w does not exist, so w2 is missing. &amp;nbsp;Then w is set to v, so w for the first record now has a value of 1. &amp;nbsp;However, for the second record, I would also expect w not to exist, but it appears that w actually now exists, the value 1 retained from the previous record. &amp;nbsp;Since w = 1 (retained), now w2 is set to 1 and w is changed to v (2). &amp;nbsp;For the third record, the retained w value is now 2, etc. &amp;nbsp;This appears like a lag() function, but it seems to actually be retain, as seen in the second example.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is example two.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data a;
	infile datalines truncOver;
	input v;
	datalines;
1
2
 
 
 
3
 
4
5
;
run;

data b;
	format wCopy;
	input w;
	x     = w;
	wCopy = w;
	datalines;
1
2
3
4
5
;
run;

data ab2;
	set a b;

	if w = .
		then w = v;

	if v ^= .
		then x = v;

	if v ^= .
		then y = v;
		else y = w;

	if v ^= .
		then z = v;
		else z = wCopy;
run;
&lt;/PRE&gt;
&lt;P&gt;Variable v is untouched, and wCopy is the original values of w and x, which were identical in dataset B. &amp;nbsp;v and wCopy appear just as they would if the data step only had the statement set A B. &amp;nbsp;I can only make sense of the resultant dataset AB2 if there is somehow a tacit retain statement on w. &amp;nbsp;Column z is the only one that acts like I would intuitively expect.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Variable w itself gets assigned to v only when w is missing. &amp;nbsp;I would expect the results to be column z, essentially "patching" variable w with variable v; i.e. when w is missing, set it to v. &amp;nbsp;Instead, it seems to work correctly only for the first record. &amp;nbsp;w is missing, so it gets assigned the value of v (1). &amp;nbsp;However, it appears that this value of 1 is then retained for all other records coming from dataset A. &amp;nbsp;Thus for every other record from A, the condition w = . is false, because w has already been set to 1.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Variable x I would expect to just be replacing variable x with variable v, but only when v was not missing (essentially patching x with v). &amp;nbsp;Again, this phantom retain statement seems to strike. &amp;nbsp;For the records where v exists, all is as expected. &amp;nbsp;But the now assigned value for x is getting retained to subsequent records. &amp;nbsp;So even though I would expect a row with v missing to leave x untouched (essentially missing), x is getting the value retained from above.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Variable y is similar to x, except the else statement explicitly says to keep the value of w, but since w had been retained, those values are now patched into y.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Variable z is the only intuitive result. &amp;nbsp;It equals v&amp;nbsp;&lt;EM&gt;only&lt;/EM&gt; when that if statement is met. &amp;nbsp;Otherwise, it equals the original wCopy variable, which includes missing values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I think I might understand, but could someone verify that this is correct? &amp;nbsp;Say a SET statement has more than one dataset. &amp;nbsp;One of these datasets, Q, is missing a column that the others have. &amp;nbsp;SAS does not simply set this column to missing/null for the records coming from Q. &amp;nbsp;Instead, SAS determines the value of this column for the first record from Q, and then retains that value all the way until the last record of Q. &amp;nbsp;Usually it seems like all missings, because the first record is determined to be missing and then that is retained down the column.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sorry for being so long-winded, but this really threw my whole SAS group for a loop this morning, and the behavior was very unexpected.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;
&lt;P&gt;Michael&lt;/P&gt;</description>
      <pubDate>Thu, 16 Feb 2017 17:21:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Set-statement-with-tacit-retain/m-p/333483#M75139</guid>
      <dc:creator>Kastchei</dc:creator>
      <dc:date>2017-02-16T17:21:45Z</dc:date>
    </item>
    <item>
      <title>Re: Set statement with tacit retain?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Set-statement-with-tacit-retain/m-p/333499#M75145</link>
      <description>&lt;P&gt;Michael,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your next-to-last paragraph sums it up very nicely.&amp;nbsp; In a nutshell ...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any variable that comes from a SAS data set is automatically retained.&amp;nbsp; It doesn't matter whether the data set is brought in with SET, MERGE, or UPDATE.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When should the software re-set a retained variable automatically?&amp;nbsp; In the case of a SET statement that references multiple data sets, the re-sets occur only on the first observation being read in from each data set.&amp;nbsp; (If a BY statement is being used, it gets a bit more complex.)&lt;/P&gt;
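&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If the retained value is not what you want, one possible workaround (a sketch, not the only approach) is to clear W yourself on the observations that come from A. The names FROMA and AB_FIXED are made up for illustration:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data ab_fixed;
	set a(in=fromA) b;
	/* w does not exist in A, so SET never overwrites it there; clear it by hand */
	if fromA then call missing(w);
	w2 = w;
	w = v;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With that change, W2 should match wCopy on every observation.&lt;/P&gt;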
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Confirmed!&lt;/P&gt;</description>
      <pubDate>Thu, 16 Feb 2017 17:47:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Set-statement-with-tacit-retain/m-p/333499#M75145</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2017-02-16T17:47:12Z</dc:date>
    </item>
    <item>
      <title>Re: Set statement with tacit retain?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Set-statement-with-tacit-retain/m-p/333524#M75161</link>
      <description>&lt;P&gt;Yes, some variables are always tacitly retained (but even that statement is incomplete).&lt;BR /&gt;&lt;BR /&gt;The program data vector (PDV) can be subdivided into two kinds of variables:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;New vars - i.e. vars newly defined&amp;nbsp;within the data step.&amp;nbsp; In your case W2 is such a variable.&amp;nbsp; New vars (in the absence of a retain statement) are reset to missing at the beginning of each iteration of the data step.&lt;/LI&gt;
&lt;LI&gt;"Old" vars - i.e. vars read in via a SET (or MERGE or UPDATE) statement.&amp;nbsp; These vars are automatically retained with each new iteration.&amp;nbsp; I.e. they are NOT reset to missing.&amp;nbsp; They are only changed when the &lt;FONT color="#ff0000"&gt;&lt;U&gt;&lt;EM&gt;&lt;STRONG&gt;appropriate&lt;/STRONG&gt;&lt;/EM&gt;&lt;/U&gt;&lt;/FONT&gt; SET/MERGE/UPDATE statement is encountered.&lt;/LI&gt;
&lt;/OL&gt;
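&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A minimal sketch of the two categories, assuming dataset A from your post already exists. The name NEWVAR is made up for illustration. The PUT runs at the top of each iteration, before the SET statement executes, so from the second iteration on it should show V still holding the previous observation's value while NEWVAR is back to missing:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
	/* top of the iteration: v is retained, newvar has been reset */
	put _n_=z2. ' Top of step: ' v= newvar=;
	set a;
	/* newvar is created in this step, so it is a "new" var */
	if _n_ = 1 then newvar = 99;
run;&lt;/CODE&gt;&lt;/PRE&gt;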
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As an experiment, run these two programs, using modified datasets A and B.&amp;nbsp; The data modification assigns variable order=1 to 9 for the 9 records in A,&amp;nbsp; and order=101 to 105 for the observations in B.&amp;nbsp; Both programs will result in a dataset in which all the A records precede all the B records.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The first program is identical to your program, except that it reports&amp;nbsp;the value of W at three different points in the data step.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data a;
	infile datalines truncOver;
	input v;
	order=_n_;
	datalines;
1
2
 
 
 
3
 
4
5
;
run;

data b;
	input w;
	wCopy = w;
	order=_n_+100;
	datalines;
1
2
3
4
5
;
run;

data AB;
 put _n_=z2. ' Before set A B: ' w= +2 @;
 set a b;
 put 'After set: ' w= +2 @;
 w2 = w;

 w = v;
 put 'After reassignment: ' w=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The first 2 lines of the log include this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;_N_=01 Before set A B: w=. After set: w=. After reassignment: w=1&lt;/P&gt;
&lt;P&gt;_N_=02 &lt;EM&gt;&lt;FONT color="#ff0000"&gt;Before set A B: w=1 After set: w=1 &lt;/FONT&gt;&lt;/EM&gt;After reassignment: w=2&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For _N_=2, notice that the "Before set" value is W=1, which was inherited (i.e. retained) from the "After reassignment" value produced in _N_=1.&lt;/P&gt;
&lt;P&gt;This is an explicit example of your observation that variable values from SET are automatically retained.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now the "after set" value is unchanged in _N_=2.&amp;nbsp; Take a look at the analogous results for _N_=9 and 10 (last A rec and first B rec):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;_N_=09 Before set A B: w=4 After set: w=. After reassignment: w=5&lt;/P&gt;
&lt;P&gt;_N_=10 Before set A B: w=5 After set: w=1 After reassignment: w=.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Again we see the "after reassignment" value from _N_=9 retained at the start of _N_=10.&amp;nbsp; But this time the "after set" value is changed, because it's the first record from B.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;NOW&amp;nbsp;... consider the example below, which is identical except for a "by order" statement.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data AB2;
 put _n_=z2. ' Before set A B: ' w= +2 @;
 set a b;
 by order;
 put 'After set: ' w= +2 @;
 w2 = w;

 w = v;
 put 'After reassignment: ' w=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you look at dataset AB2, you'll find that WCOPY=W2 as you originally expected.&amp;nbsp;&lt;/P&gt;
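&lt;P&gt;One quick way to check that claim, as a sketch: scan AB2 and report any observation where the two variables differ. If the claim holds, no "Mismatch" lines should appear in the log.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
	set AB2;
	/* two missing values compare as equal, so only real differences are reported */
	if wCopy ne w2 then put 'Mismatch at ' _n_= wCopy= w2=;
run;&lt;/CODE&gt;&lt;/PRE&gt;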
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;So why does adding "by order" make&amp;nbsp;AB2 different from AB?&amp;nbsp;&amp;nbsp;&lt;/STRONG&gt; Because without the BY statement, SAS knows that it will process all the B records only after all the A records, which in turn appears to mean that the SET statement will not need to reset "dataset-B-only" variables while reading dataset A.&amp;nbsp; But introduction of the BY statement means that SAS at all times has to have in its buffer a record from both A and B, so that the "by order" statement can be honored.&amp;nbsp; This in turn appears to mean that the SET statement will reset ALL vars from all files, even when the record-in-hand originates from only one data set.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Proof?&amp;nbsp; Look at the log now:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;_N_=01 Before set A B: w=. After set: w=. After reassignment: w=1&lt;/P&gt;
&lt;P&gt;_N_=02 Before set A B: w=1 After set: w=. After reassignment: w=2&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This time, for _N_=2 (an observation coming from dataset A), W is reset to missing after the SET statement, even though it was retained as W=1 prior to the SET statement.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;U&gt;&lt;EM&gt;&lt;STRONG&gt;Conclusion:&amp;nbsp; The BY statement changes the set of vars modified by a SET statement.&lt;/STRONG&gt;&lt;/EM&gt;&lt;/U&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 16 Feb 2017 18:35:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Set-statement-with-tacit-retain/m-p/333524#M75161</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2017-02-16T18:35:23Z</dc:date>
    </item>
    <item>
      <title>Re: Set statement with tacit retain?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Set-statement-with-tacit-retain/m-p/333600#M75206</link>
      <description>Thank you both for the information.  Incredibly helpful!</description>
      <pubDate>Thu, 16 Feb 2017 21:59:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Set-statement-with-tacit-retain/m-p/333600#M75206</guid>
      <dc:creator>Kastchei</dc:creator>
      <dc:date>2017-02-16T21:59:35Z</dc:date>
    </item>
  </channel>
</rss>

