<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Strange behaviour when concatenating datasets with missing columns in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810252#M319514</link>
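    <description>&lt;P&gt;What happens is that the B variable, which exists only in the second input dataset (test2), is not reset to missing at the start of each iteration while SAS is still reading test1. Once the first iteration assigns B=A, that value carries over to the remaining test1 rows.&lt;/P&gt;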
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The easy way out in this case is to check which dataset you are reading from:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test3;
set test1(in=in1) test2;
if in1 then B=A;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Alternatively, you can explicitly set the B variable to missing before the SET statement:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test3;
length A B $16;
&lt;STRONG&gt;B&lt;/STRONG&gt;='';
set test1 test2;
if missing(B) then B=A;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Wed, 27 Apr 2022 18:13:49 GMT</pubDate>
    <dc:creator>s_lassen</dc:creator>
    <dc:date>2022-04-27T18:13:49Z</dc:date>
    <item>
      <title>Strange behaviour when concatenating datasets with missing columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810229#M319507</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have the following problem: I have two datasets test1 and test2, and want to concatenate them to create a third dataset test3.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;test1 has only column A present, while test2 has columns A and B. In test3, I want both columns to be there and, if B has a missing value, then B must be a copy of A.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test1;
length A $16;
input A $;
datalines;
ABBEY
ABBOTSFORD
KALAMAZOO
OUAGADOUGOU
;
run;

data test2;
length A $16 B $16;
input A $ B $;
datalines;
OKEFENOKEE OKEFENOKEE
ALBUQUERQUE ALBUQUERQUE
;
run;

data test3;
set test1 test2;
if missing(B) then B=A;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The table test3 should (IMHO) look like this:&lt;/P&gt;
&lt;DIV class="branch"&gt;
&lt;DIV&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="table" summary="Procedure Print: Data Set WORK.TEST3" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;ABBEY&lt;/TD&gt;
&lt;TD class="l data"&gt;ABBEY&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;ABBOTSFORD&lt;/TD&gt;
&lt;TD class="l data"&gt;ABBOTSFORD&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;KALAMAZOO&lt;/TD&gt;
&lt;TD class="l data"&gt;KALAMAZOO&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;OUAGADOUGOU&lt;/TD&gt;
&lt;TD class="l data"&gt;OUAGADOUGOU&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;OKEFENOKEE&lt;/TD&gt;
&lt;TD class="l data"&gt;OKEFENOKEE&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;ALBUQUERQUE&lt;/TD&gt;
&lt;TD class="l data"&gt;ALBUQUERQUE&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Instead, it looks like this:&lt;/P&gt;
&lt;DIV class="branch"&gt;
&lt;DIV&gt;
&lt;DIV align="center"&gt;
&lt;TABLE class="table" summary="Procedure Print: Data Set WORK.TEST3" frame="box" rules="all" cellspacing="0" cellpadding="5"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;ABBEY&lt;/TD&gt;
&lt;TD class="l data"&gt;ABBEY&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;ABBOTSFORD&lt;/TD&gt;
&lt;TD class="l data"&gt;ABBEY&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;KALAMAZOO&lt;/TD&gt;
&lt;TD class="l data"&gt;ABBEY&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;OUAGADOUGOU&lt;/TD&gt;
&lt;TD class="l data"&gt;ABBEY&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;OKEFENOKEE&lt;/TD&gt;
&lt;TD class="l data"&gt;OKEFENOKEE&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD class="l data"&gt;ALBUQUERQUE&lt;/TD&gt;
&lt;TD class="l data"&gt;ALBUQUERQUE&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;I am using Base SAS 9.4.01 M7, and I have verified that this behaviour has existed since at least M2.&lt;/P&gt;
&lt;P&gt;Did I get anything wrong?&lt;/P&gt;</description>
      <pubDate>Wed, 27 Apr 2022 17:11:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810229#M319507</guid>
      <dc:creator>gabonzo</dc:creator>
      <dc:date>2022-04-27T17:11:23Z</dc:date>
    </item>
    <item>
      <title>Re: Strange behaviour when concatenating datasets with missing columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810252#M319514</link>
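      <description>&lt;P&gt;What happens is that the B variable, which exists only in the second input dataset (test2), is not reset to missing at the start of each iteration while SAS is still reading test1. Once the first iteration assigns B=A, that value carries over to the remaining test1 rows.&lt;/P&gt;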
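&lt;P&gt;A minimal way to watch this happen, assuming the test1 and test2 datasets from the question (the DATA _NULL_ step and the PUT statement are additions for illustration only):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
set test1 test2;
/* log the values as they stand right after SET, before B is touched */
put _n_= A= B=;
if missing(B) then B=A;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;On the second iteration the log should already show B=ABBEY, even though that row was read from test1, which has no B column.&lt;/P&gt;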
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The easy way out in this case is to check which dataset you are reading from:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test3;
set test1(in=in1) test2;
if in1 then B=A;
run;&lt;/CODE&gt;&lt;/PRE&gt;
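&lt;P&gt;Note that the step above only fills B for rows coming from test1. If test2 could itself contain missing B values (an assumption; the sample data has none), a sketch that covers both cases could be:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test3;
set test1(in=in1) test2;
/* in1 is 1 while the current row comes from test1, which has no B */
if in1 or missing(B) then B=A;
run;&lt;/CODE&gt;&lt;/PRE&gt;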
&lt;P&gt;Alternatively, you can explicitly set the B variable to missing before the SET statement:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test3;
length A B $16;
&lt;STRONG&gt;B&lt;/STRONG&gt;='';
set test1 test2;
if missing(B) then B=A;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 27 Apr 2022 18:13:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810252#M319514</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2022-04-27T18:13:49Z</dc:date>
    </item>
    <item>
      <title>Re: Strange behaviour when concatenating datasets with missing columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810253#M319515</link>
      <description>&lt;P&gt;Ok, thanks for the explanation. So in practice it's as if I had written RETAIN for column B.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is this the expected behaviour or a bug?&lt;/P&gt;
&lt;P&gt;Because, although I accept your workaround, I don't think I want to use it (or have to remember to use it) every time I do something as simple as a variable assignment!&lt;/P&gt;</description>
      <pubDate>Wed, 27 Apr 2022 18:47:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810253#M319515</guid>
      <dc:creator>gabonzo</dc:creator>
      <dc:date>2022-04-27T18:47:57Z</dc:date>
    </item>
    <item>
      <title>Re: Strange behaviour when concatenating datasets with missing columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810263#M319521</link>
      <description>&lt;P&gt;Using an IN= variable is always the preferred way to identify from which dataset data is read.&lt;/P&gt;
&lt;P&gt;And the behavior is not a bug; it is well documented and has been this way since the beginning of SAS. Variables coming from any incoming dataset are automatically retained. Otherwise a one-to-many MERGE would not work.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Apr 2022 19:09:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810263#M319521</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2022-04-27T19:09:02Z</dc:date>
    </item>
    <item>
      <title>Re: Strange behaviour when concatenating datasets with missing columns</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810268#M319524</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/255651"&gt;@gabonzo&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Ok, thanks for the explanation. So in practice it's as if I had written RETAIN for column B.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is this the expected behaviour or a bug?&lt;/P&gt;
&lt;P&gt;Because, although I accept your workaround, I don't think I want to use it (or have to remember to use it) every time I do something as simple as a variable assignment!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;All variables that come from input datasets are "retained". More precisely, they are just not set to missing at the start of a new iteration. Without this, a one-to-many MERGE would not work.&lt;/P&gt;</description>
      <pubDate>Wed, 27 Apr 2022 19:16:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Strange-behaviour-when-concatenating-datasets-with-missing/m-p/810268#M319524</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-04-27T19:16:32Z</dc:date>
    </item>
  </channel>
</rss>

