<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Appending Non-Duplicate based on three variables in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365628#M86865</link>
    <description>&lt;P&gt;This is how I'm doing it now. It takes less time and gives me what I wanted:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;proc sort data=A;&lt;BR /&gt;by Var1 Var2 Var3;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sort data=B;&lt;BR /&gt;by Var1 Var2 Var3;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;data pre_want;&lt;BR /&gt;merge A&amp;nbsp;(in=a) B&amp;nbsp;(in=b);&lt;BR /&gt;by Var1 Var2 Var3;&lt;BR /&gt;if not a and b; /* keep only keys found in B but not in A */&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;proc append base=A data=pre_want;&lt;BR /&gt;run;&lt;/P&gt;</description>
    <pubDate>Fri, 09 Jun 2017 11:10:58 GMT</pubDate>
    <dc:creator>atul_desh</dc:creator>
    <dc:date>2017-06-09T11:10:58Z</dc:date>
    <item>
      <title>Appending Non-Duplicate based on three variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365310#M86743</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have two datasets, say A and B. Each has 215 variables, and the data size is 90 GB.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want to append B to A, but without appending observations that duplicate an existing one on three key variables (not on all 215 variables).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I can't use (syncadd=no uniquesave=yes), and I also can't use the code below, because it checks all variables for duplicates.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;proc sql feedback;&lt;BR /&gt;create table lib_w.want as&lt;BR /&gt;select * from lib_A.dsA&lt;BR /&gt;union corresponding&lt;BR /&gt;select * from lib_B.dsB&lt;BR /&gt;;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm thinking of creating a new variable as catx(all three variables) in both A and B and then using "A.catx_var ne B.catx_var".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there another way?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please help. Thanks in advance.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jun 2017 10:46:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365310#M86743</guid>
      <dc:creator>atul_desh</dc:creator>
      <dc:date>2017-06-08T10:46:15Z</dc:date>
    </item>
    <item>
      <title>Re: Appending Non-Duplicate based on three variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365313#M86744</link>
      <description>&lt;P&gt;So, you have 2 datasets which are 90 GB each? &amp;nbsp;That is a really large amount of data. &amp;nbsp;Do you want to drop data from A or B based on being in both? &amp;nbsp;As a suggestion, you could get a distinct list of the 3 variables from the table whose data you want to keep, and then drop the matching rows from the other table before doing a proc append:&lt;/P&gt;
&lt;PRE&gt;proc sort data=a out=distlist nodupkey;
  by var1 var2 var3;
run;

data _null_;
  set distlist end=last;
  if _n_=1 then call execute('data b_proc; set b;');
  call execute('if var1="'||strip(var1)||'" and var2="'||strip(var2)||'" and var3="'||strip(var3)||'" then delete;');
  if last then call execute('run;');
run;

proc append base=a data=b_proc force;
run;&lt;/PRE&gt;
&lt;P&gt;This will create a big datastep with lots of if statements to drop the data from B based on a distinct list of var1-var3 from A. &amp;nbsp;&lt;/P&gt;
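&lt;P&gt;To make that concrete, here is a rough sketch of the step the CALL EXECUTE calls would generate, assuming var1-var3 are character variables; the key values shown are made up for illustration. One IF statement is generated per row of distlist, so the step grows with the number of distinct key combinations in A:&lt;/P&gt;
&lt;PRE&gt;/* Sketch of the generated code; the key values are hypothetical */
data b_proc; set b;
if var1="AAA" and var2="X1" and var3="2017" then delete;
if var1="AAA" and var2="X2" and var3="2017" then delete;
/* ... one IF statement per remaining row of distlist ... */
run;&lt;/PRE&gt;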
</description>
      <pubDate>Thu, 08 Jun 2017 10:56:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365313#M86744</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2017-06-08T10:56:18Z</dc:date>
    </item>
    <item>
      <title>Re: Appending Non-Duplicate based on three variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365316#M86745</link>
      <description>&lt;P&gt;What if I append all the observations and then run a nodupkey sort at the end?&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jun 2017 11:06:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365316#M86745</guid>
      <dc:creator>atul_desh</dc:creator>
      <dc:date>2017-06-08T11:06:16Z</dc:date>
    </item>
    <item>
      <title>Re: Appending Non-Duplicate based on three variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365318#M86747</link>
      <description>&lt;P&gt;There's only one way to know: try it. &amp;nbsp;The proc append should literally just drop the header block and tag the data onto the existing dataset, so that shouldn't matter either way. The proc sort/datastep is where the real work is being done, so anything that minimizes that will help.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Jun 2017 11:16:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365318#M86747</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2017-06-08T11:16:49Z</dc:date>
    </item>
    <item>
      <title>Re: Appending Non-Duplicate based on three variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365327#M86750</link>
      <description>&lt;P&gt;Did you try a hash table? Or try SQL.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
create table only_in_B as
select var1,var2,var3  from B
except
select var1,var2,var3 from A ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 08 Jun 2017 12:24:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365327#M86750</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2017-06-08T12:24:30Z</dc:date>
    </item>
    <item>
      <title>Re: Appending Non-Duplicate based on three variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365383#M86771</link>
      <description>How can I use only_in_B to append B to A?</description>
      <pubDate>Thu, 08 Jun 2017 14:01:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365383#M86771</guid>
      <dc:creator>atul_desh</dc:creator>
      <dc:date>2017-06-08T14:01:59Z</dc:date>
    </item>
    <item>
      <title>Re: Appending Non-Duplicate based on three variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365628#M86865</link>
      <description>&lt;P&gt;This is how I'm doing it now. It takes less time and gives me what I wanted:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;proc sort data=A;&lt;BR /&gt;by Var1 Var2 Var3;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sort data=B;&lt;BR /&gt;by Var1 Var2 Var3;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;data pre_want;&lt;BR /&gt;merge A&amp;nbsp;(in=a) B&amp;nbsp;(in=b);&lt;BR /&gt;by Var1 Var2 Var3;&lt;BR /&gt;if not a and b; /* keep only keys found in B but not in A */&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;proc append base=A data=pre_want;&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jun 2017 11:10:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365628#M86865</guid>
      <dc:creator>atul_desh</dc:creator>
      <dc:date>2017-06-09T11:10:58Z</dc:date>
    </item>
    <item>
      <title>Re: Appending Non-Duplicate based on three variables</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365631#M86868</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/94664"&gt;@atul_desh&lt;/a&gt;: You could try a hash table to check whether a row is already in table A (assuming your key variables are called VAR1, VAR2 and VAR3):&lt;/P&gt;&lt;PRE&gt;Data A; /* we are modifying A in place */
  if 0 then modify A; /* we are not reading any obs. here */
  if _N_=1 then do;
    declare hash h(dataset: 'A(keep=var1 var2 var3)');
    rc=h.definekey('var1','var2','var3');
    h.definedone();
    end;
  set B;
  if h.find() then
    output A; /* if key not found, append to A */
run;&lt;/PRE&gt;</description>
      <pubDate>Fri, 09 Jun 2017 11:27:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Appending-Non-Duplicate-based-on-three-variables/m-p/365631#M86868</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2017-06-09T11:27:07Z</dc:date>
    </item>
  </channel>
</rss>

