<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: SET Statement with multiple datasets and missing columns in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794503#M254757</link>
    <description>&lt;P&gt;I wouldn't say it is a severe pitfall. If one understands the behaviour it's natural way to behave. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bart&lt;/P&gt;</description>
    <pubDate>Fri, 04 Feb 2022 13:45:33 GMT</pubDate>
    <dc:creator>yabwon</dc:creator>
    <dc:date>2022-02-04T13:45:33Z</dc:date>
    <item>
      <title>SET Statement with multiple datasets and missing columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794463#M254749</link>
      <description>&lt;P&gt;We have encountered a strange behaviour in the data step using a SET statement with multiple datasets.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The situation is as follows:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data ds1;
  x = 1;
  y = 1;
  output;
  x = 2;
  y = .;
  output;
  x = 3;
  y = 3;
  output;
run;

data ds2;
  x = 4;
  output;
  x = 5;
  output;
  x = 6;
  output;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;ds1 has two columns x and y whereas ds only has the x column.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In the next step, I want to concatenate the two datasets into one and replace missing values in column y with values from x.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;data ds_combined;
  set ds1 ds2;
  if missing(y) then
    y = x;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;The result looks like this&lt;/P&gt;&lt;PRE&gt;x y
1 1
2 2
3 3
4 4
5 4
6 4&lt;/PRE&gt;&lt;P&gt;which is not what I expected.&lt;/P&gt;&lt;P&gt;I narrowed the issue down to the fact that the y column is in ds1 but not ds2. Everything works as expected&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;if y is set to missing explicitly in ds2 or&lt;/LI&gt;&lt;LI&gt;if concatenation and replacement of missing values happen in two different data steps.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;My understanding was that I can use multiple datasets in one SET statement and process the rows as if they come from just one dataset. Apperently this is not the case. Can anyone explain why this happens? Thanks in advance!&lt;/P&gt;</description>
      <pubDate>Fri, 04 Feb 2022 12:52:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794463#M254749</guid>
      <dc:creator>msauer</dc:creator>
      <dc:date>2022-02-04T12:52:25Z</dc:date>
    </item>
    <item>
      <title>Re: SET Statement with multiple datasets and missing columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794475#M254750</link>
      <description>&lt;P&gt;You don't clearly state what you WANT the output to look like. But if the goal is to UPDATE missing values in one dataset with the values from another, you may want to look at the UPDATE statement. And for that, your data sets really should have an ID so you can tell WHICH row needs updating. Try something like this, maybe?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data ds1;
  ID+1;
  x = 1;
  y = 1;
  output;
  ID+1;
  x = 2;
  y = .;
  output;
  ID+1;
  x = 3;
  y = 3;
  output;
run;

/* Create an update dataset for those rows with missing values */
data ds2;
   set ds1;
   where y is missing;
   y=x;
run;

/* Update the missing valuse */
data ds_combined;
   update ds1 ds2;
   by ID;
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 04 Feb 2022 13:20:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794475#M254750</guid>
      <dc:creator>SASJedi</dc:creator>
      <dc:date>2022-02-04T13:20:55Z</dc:date>
    </item>
    <item>
      <title>Re: SET Statement with multiple datasets and missing columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794479#M254752</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;yes it's one of the severe pitfalls of SAS.&lt;/P&gt;
&lt;P&gt;SAS describes it as 'automatic retain' &lt;A href="http://support.sas.com/kb/48/147.html" target="_blank" rel="noopener"&gt;here&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;You'll find an explanation &lt;A href="https://blogs.sas.com/content/sgf/2014/02/21/what-happens-when-the-set-statement-misbehaves-and-how-you-can-fix-it/" target="_blank" rel="noopener"&gt;there&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 04 Feb 2022 13:27:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794479#M254752</guid>
      <dc:creator>Oligolas</dc:creator>
      <dc:date>2022-02-04T13:27:33Z</dc:date>
    </item>
    <item>
      <title>Re: SET Statement with multiple datasets and missing columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794498#M254756</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;if you add some diagnostic to the data step:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data ds_combined;
  put "1) " _ALL_;
  set ds1 ds2 indsname=i;

  put "2) " _ALL_;

  if missing(y) then
    y = x;

  put "3) " _ALL_;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;you will see in the log:&lt;/P&gt;
&lt;PRE&gt;1) i=  x=. y=. _ERROR_=0 _N_=1
2) i=WORK.DS1 x=1 y=1 _ERROR_=0 _N_=1
3) i=WORK.DS1 x=1 y=1 _ERROR_=0 _N_=1
1) i=WORK.DS1 x=1 y=1 _ERROR_=0 _N_=2
2) i=WORK.DS1 x=2 y=. _ERROR_=0 _N_=2
3) i=WORK.DS1 x=2 y=2 _ERROR_=0 _N_=2
1) i=WORK.DS1 x=2 y=2 _ERROR_=0 _N_=3
2) i=WORK.DS1 x=3 y=3 _ERROR_=0 _N_=3
3) i=WORK.DS1 x=3 y=3 _ERROR_=0 _N_=3
1) i=WORK.DS1 x=3 y=3 _ERROR_=0 _N_=4
2) i=WORK.DS2 x=4 y=. _ERROR_=0 _N_=4
3) i=WORK.DS2 x=4 y=4 _ERROR_=0 _N_=4
1) i=WORK.DS2 x=4 y=4 _ERROR_=0 _N_=5
2) i=WORK.DS2 x=5 y=4 _ERROR_=0 _N_=5
3) i=WORK.DS2 x=5 y=4 _ERROR_=0 _N_=5
1) i=WORK.DS2 x=5 y=4 _ERROR_=0 _N_=6
2) i=WORK.DS2 x=6 y=4 _ERROR_=0 _N_=6
3) i=WORK.DS2 x=6 y=4 _ERROR_=0 _N_=6
1) i=WORK.DS2 x=6 y=4 _ERROR_=0 _N_=7
&lt;/PRE&gt;
&lt;P&gt;Up to _N_ = 3 everything is ok.&lt;/P&gt;
&lt;P&gt;In the "2)" for _N_ = 4 so the first place where values form DS2 were used y was set to missing. Since it was missing the IF condition was true so value was replaced by 4. Since Y &lt;EM&gt;came from a SAS dataset&lt;/EM&gt; its value is automatically retained (see "1)" for _N_=5). Also since DS2 doesn't have y as a variable walues of y aren't overwritten.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Reversing:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;set ds2 ds1;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;will show you similar effect.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;All the best&lt;/P&gt;
&lt;P&gt;Bart&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 04 Feb 2022 13:42:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794498#M254756</guid>
      <dc:creator>yabwon</dc:creator>
      <dc:date>2022-02-04T13:42:53Z</dc:date>
    </item>
    <item>
      <title>Re: SET Statement with multiple datasets and missing columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794503#M254757</link>
      <description>&lt;P&gt;I wouldn't say it is a severe pitfall. If one understands the behaviour it's natural way to behave. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bart&lt;/P&gt;</description>
      <pubDate>Fri, 04 Feb 2022 13:45:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/SET-Statement-with-multiple-datasets-and-missing-columns/m-p/794503#M254757</guid>
      <dc:creator>yabwon</dc:creator>
      <dc:date>2022-02-04T13:45:33Z</dc:date>
    </item>
  </channel>
</rss>

