<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Merge two datasets with creation of an identifier in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663720#M198193</link>
    <description>data want;&lt;BR /&gt;set have (in=a1) have2 (in=a2);&lt;BR /&gt;source = ifc(a1=1, 'A', 'B');&lt;BR /&gt;run;</description>
    <pubDate>Sat, 20 Jun 2020 16:21:55 GMT</pubDate>
    <dc:creator>Reeza</dc:creator>
    <dc:date>2020-06-20T16:21:55Z</dc:date>
    <item>
      <title>Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663584#M198111</link>
      <description>&lt;P&gt;Hi All!&lt;/P&gt;&lt;P&gt;Looking for a quick and short solution (a replacement for the 2-3 step logic I have): I want to use a data step that merges two datasets (exactly the same variables in both) and, while merging, also creates a new variable called identifier that specifies whether a record in the output dataset came from dataset1 or dataset2.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 20:00:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663584#M198111</guid>
      <dc:creator>_MVB_</dc:creator>
      <dc:date>2020-06-19T20:00:04Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663587#M198113</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data output;
merge a(in=ina) b(in=inb);
by &amp;lt;byvar list&amp;gt;;
if ina and inb then exists='inainb';
else if ina then exists ='ina';
else if inb then exists ='inb';
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 19 Jun 2020 20:07:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663587#M198113</guid>
      <dc:creator>smantha</dc:creator>
      <dc:date>2020-06-19T20:07:42Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663589#M198114</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/214340"&gt;@smantha&lt;/a&gt;&amp;nbsp;Thanks for the quick reply.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I think I need slightly different logic, as I do not need to check whether a record is in the first, second, or both datasets.&lt;/P&gt;&lt;P&gt;I know I have exactly the same records in both datasets, as the second is a reduced dataset created from the first.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I simply need to create an indicator while merging the two datasets: if a record comes from dataset1 it is marked as A, and if it comes from dataset2 it is marked as B, with the value stored in a newly created variable.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope this clarifies things.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 20:14:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663589#M198114</guid>
      <dc:creator>_MVB_</dc:creator>
      <dc:date>2020-06-19T20:14:51Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663590#M198115</link>
      <description>Are you merging or appending? Usually with the same variables it's an append, and then you can use the INDSNAME= option.&lt;BR /&gt;&lt;BR /&gt;data want;&lt;BR /&gt;set have1 have2 indsname=source;&lt;BR /&gt;data_source = source;&lt;BR /&gt;run;</description>
      <pubDate>Fri, 19 Jun 2020 20:19:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663590#M198115</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-06-19T20:19:50Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663591#M198116</link>
      <description>&lt;P&gt;It would help to show the code you are currently using.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The dataset option IN= creates a variable indicating that a record came from that dataset&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Pseudo example:&lt;/P&gt;
&lt;PRE&gt;data garbage;
   merge set1 (in=in1)
              set2 (in=in2)
  ;
  if in1 and in2 then source='Both';
  else if in1 then source = 'set1';
  else if in2 then source ='set2';
run;&lt;/PRE&gt;
&lt;P&gt;I have to assume your merge would have some by variable(s) but that wouldn't make a difference in the options and if/then/else code.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 20:21:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663591#M198116</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2020-06-19T20:21:12Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663599#M198123</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;combining two datasets&lt;/P&gt;&lt;P&gt;on the dataset2 I have exactly same records and on dataset1, the difference in between the two is that dataset1 has several times more records, and all of the records on dataset2 would be found on dataset1.&lt;/P&gt;&lt;P&gt;I do not need to check if any of the records are on both or only one dataset, but simply combine the two dataset and have a new variable created which would specify either record is coming from dataset1 or dataset2.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 20:45:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663599#M198123</guid>
      <dc:creator>_MVB_</dc:creator>
      <dc:date>2020-06-19T20:45:03Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663600#M198124</link>
      <description>&lt;P&gt;This is what I have now:&lt;/P&gt;&lt;P&gt;&amp;nbsp;- dataset1 has 10,000 records&lt;/P&gt;&lt;P&gt;&amp;nbsp;- dataset2 has 1,000 records&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data dataset1;&lt;/P&gt;&lt;P&gt;set dataset1;&lt;/P&gt;&lt;P&gt;format ds $1.;&lt;/P&gt;&lt;P&gt;ds='A';&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data dataset2;&lt;/P&gt;&lt;P&gt;set dataset2;&lt;/P&gt;&lt;P&gt;format ds $1.;&lt;/P&gt;&lt;P&gt;ds='B';&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data merged;&lt;/P&gt;&lt;P&gt;set dataset1 dataset2;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The final 'merged' dataset has 11,000 records with one new variable 'ds', which is 'A' for the 10,000 records and 'B' for the other 1,000 records.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 20:48:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663600#M198124</guid>
      <dc:creator>_MVB_</dc:creator>
      <dc:date>2020-06-19T20:48:46Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663606#M198130</link>
      <description>&lt;P&gt;Here you are not merging the datasets but concatenating one onto the bottom of the other. What about the overlapping records?&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 21:00:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663606#M198130</guid>
      <dc:creator>smantha</dc:creator>
      <dc:date>2020-06-19T21:00:13Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663609#M198131</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/214340"&gt;@smantha&lt;/a&gt;&amp;nbsp;yes, I should have used a concatenating, not merging.&lt;/P&gt;&lt;P&gt;Overlapping records are not an issue, there is a purpose of why these 1,000 records from dataset2 needed on the final dataset. They would be exactly same, because they were outputted in earlier steps from the dataset1. I just need a final dataset to have 11,000 records with an indicator which would be used later to differentiate whether it is coming from dataset1&amp;nbsp;or dataset&amp;nbsp;2.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 21:06:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663609#M198131</guid>
      <dc:creator>_MVB_</dc:creator>
      <dc:date>2020-06-19T21:06:27Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663612#M198133</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/333209"&gt;@_MVB_&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;combining two datasets&lt;/P&gt;
&lt;P&gt;on the dataset2 I have exactly same records and on dataset1, the difference in between the two is that dataset1 has several times more records, and all of the records on dataset2 would be found on dataset1.&lt;/P&gt;
&lt;P&gt;I do not need to check if any of the records are on both or only one dataset, but simply combine the two dataset and have a new variable created which would specify either record is coming from dataset1 or dataset2.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Did you run it and try it?&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 21:09:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663612#M198133</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-06-19T21:09:40Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663615#M198135</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/333209"&gt;@_MVB_&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/214340"&gt;@smantha&lt;/a&gt;&amp;nbsp;yes, I should have used a concatenating, not merging.&lt;/P&gt;
&lt;P&gt;Overlapping records are not an issue, there is a purpose of why these 1,000 records from dataset2 needed on the final dataset. They would be exactly same, because they were outputted in earlier steps from the dataset1. I just need a final dataset to have 11,000 records with an indicator which would be used later to differentiate whether it is coming from dataset1&amp;nbsp;or dataset&amp;nbsp;2.&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Then look again at what&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;already proposed because that's doing what you're asking for.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 21:12:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663615#M198135</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2020-06-19T21:12:12Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663625#M198140</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;it works, however I do not want to have a work.database1 or work.database2 in my output dataset. I can, of course put an extra line in that procedure where specify if work.database1 then A, else if work.database2 then B, but isn't there another way just to specify either A or B without IF THEN statement?&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 21:53:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663625#M198140</guid>
      <dc:creator>_MVB_</dc:creator>
      <dc:date>2020-06-19T21:53:20Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663628#M198143</link>
      <description>Then use the IN approach which gives you an indicator variable instead. If you absolutely, 100% require A, B then use IF/THEN. There are automated ways to do it, but if your requirements don't align then you need to use IF/THEN. You need to make sure it handles multiple data sets as well.</description>
      <pubDate>Fri, 19 Jun 2020 21:59:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663628#M198143</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-06-19T21:59:23Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663642#M198149</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/333209"&gt;@_MVB_&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;it works, however I do not want to have a work.database1 or work.database2 in my output dataset. I can, of course put an extra line in that procedure where specify if work.database1 then A, else if work.database2 then B, but isn't there another way just to specify either A or B without IF THEN statement?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Maybe we can get to your desired result faster by using sample data.&lt;/P&gt;
&lt;P&gt;With the sample data below, what should the desired result look like, and how does it differ from what the code below generates?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data a;
  do i=1 to 10;
    output;
  end;
  stop;
run;
data b;
  set a;
  if i in (1,5,7);
run;

data want;
  length _inds inds $41;
  set a b indsname=_inds;
  inds=_inds;
run;

proc print data=want;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Patrick_0-1592605395846.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/46402i9342335FF315EFF1/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Patrick_0-1592605395846.png" alt="Patrick_0-1592605395846.png" /&gt;&lt;/span&gt;&lt;/P&gt;
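&lt;P&gt;If the goal is a plain A/B flag rather than the full WORK.A / WORK.B name, the INDSNAME= value can be mapped in the same step. A minimal sketch, assuming the sample datasets a and b above live in the WORK library; the names want_ab and source are only illustrative:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want_ab;
  /* _inds holds LIBREF.MEMNAME in uppercase, e.g. WORK.A */
  length _inds $41 source $1;
  set a b indsname=_inds;
  /* IFC returns character values; anything not from WORK.A is flagged B */
  source = ifc(_inds = 'WORK.A', 'A', 'B');
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The INDSNAME= variable itself is not written to the output dataset, so only source is kept. With more than two input datasets, the two-way IFC would need to be replaced by a format or a SELECT block.&lt;/P&gt;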
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jun 2020 22:25:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663642#M198149</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2020-06-19T22:25:12Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663655#M198156</link>
      <description>&lt;P&gt;In the end, I modified the solution proposed by &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;,&amp;nbsp;just adding an IF THEN statement to replace the work.DATASET1 and work.DATASET2 values with A or B.&lt;/P&gt;</description>
      <pubDate>Sat, 20 Jun 2020 00:18:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663655#M198156</guid>
      <dc:creator>_MVB_</dc:creator>
      <dc:date>2020-06-20T00:18:15Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663657#M198158</link>
      <description>&lt;P&gt;Thanks for the feedback.&lt;/P&gt;
&lt;P&gt;Please mark one of&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;'s answers as solution so that this discussion gets closed.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 20 Jun 2020 00:28:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663657#M198158</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2020-06-20T00:28:28Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663720#M198193</link>
      <description>data want;&lt;BR /&gt;set have (in=a1) have2 (in=a2);&lt;BR /&gt;source = ifc(a1=1, 'A', 'B');&lt;BR /&gt;run;</description>
      <pubDate>Sat, 20 Jun 2020 16:21:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/663720#M198193</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2020-06-20T16:21:55Z</dc:date>
    </item>
    <item>
      <title>Re: Merge two datasets with creation of an identifier</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/664708#M198626</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;Thanks!&lt;/P&gt;&lt;P&gt;This is what I was looking for in terms of the short solution.&lt;/P&gt;</description>
      <pubDate>Wed, 24 Jun 2020 16:52:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Merge-two-datasets-with-creation-of-an-identifier/m-p/664708#M198626</guid>
      <dc:creator>_MVB_</dc:creator>
      <dc:date>2020-06-24T16:52:59Z</dc:date>
    </item>
  </channel>
</rss>

