<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: MERGE multiple files (based on a particular column) and RECEIVE a conditional MERGED output file in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/MERGE-multiple-files-based-on-a-particular-column-and-RECEIVE-a/m-p/897364#M354600</link>
    <description>&lt;P&gt;if you don't need the result sorted, using a hash object provides you with dataset want in one step:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want; 
   set main;

   if _n_ = 1 then do;
      if 0 then set addon;

      declare hash h(dataset: 'addon');
      h.defineKey('reprisk_id');
      h.defineData('xxx');
      h.defineDone();
   end;

   if h.find() = 0;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;or change the merging step, so that an in-variable is defined for the addon dataset, too:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want_ds;
   merge main(in = inMain) addon(in = inAddon);
   by reprisk_id;

   if inMain and inAddon;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I was not able to sort the data in the order of your want dataset.&lt;/P&gt;</description>
    <pubDate>Thu, 05 Oct 2023 11:41:32 GMT</pubDate>
    <dc:creator>andreas_lds</dc:creator>
    <dc:date>2023-10-05T11:41:32Z</dc:date>
    <item>
      <title>MERGE multiple files (based on a particular column) and RECEIVE a conditional MERGED output file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/MERGE-multiple-files-based-on-a-particular-column-and-RECEIVE-a/m-p/897358#M354595</link>
      <description>&lt;P&gt;Hi there,&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have two files (MAIN) and (ADDON) that I'd like to merge based on the variable (reprisk_ID) that is shared by BOTH FILES&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The MAIN file has two columns (1) reprisk_ID (2) value&lt;/P&gt;&lt;P&gt;The ADDON file has three columns: (1) reprisk_ID, (2) xxx&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Both files are merged to produce WANT_2 (based on my code below) that&lt;/P&gt;&lt;P&gt;1) Merges MAIN &amp;amp; ADDON (based on the variable "reprisk_ID)&lt;/P&gt;&lt;P&gt;2) Keeps the merged file where only xxx=1&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;WANT_2 is that dataset output I'd like to keep, but I'm very much hoping that my SAS syntax can be minimized so that I don't need to use the syntax in RED but can only use the syntax in BLUE with minor adjustments. (Reason is the the MAIN files that I'm dealing with can be 100GB) and this process can be very slow.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The MAIN file is based on this code:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/* ---------------------------------------------------------------------------------------------------------------------------- */&lt;BR /&gt;/* MAIN FILE */&lt;BR /&gt;/* ---------------------------------------------------------------------------------------------------------------------------- */&lt;/P&gt;&lt;P&gt;data MAIN;&lt;BR /&gt;input reprisk_ID $ value;&lt;BR /&gt;Format reprisk_ID $5.;&lt;/P&gt;&lt;P&gt;datalines;&lt;BR /&gt;ABC 1000&lt;BR /&gt;BCF 2000&lt;BR /&gt;ING 3000&lt;BR /&gt;DEF 4000&lt;BR /&gt;GHI 5000&lt;BR /&gt;ING 3500&lt;BR /&gt;BCF 5900&lt;BR /&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The ADDON file is based on this code:&lt;BR /&gt;/* ---------------------------------------------------------------------------------------------------------------------------- */&lt;BR /&gt;/* ADDON FILE */&lt;BR /&gt;/* ---------------------------------------------------------------------------------------------------------------------------- */&lt;/P&gt;&lt;P&gt;data ADDON;&lt;BR /&gt;input reprisk_ID $ xxx;&lt;BR /&gt;format reprisk_ID $5.;&lt;/P&gt;&lt;P&gt;datalines;&lt;BR /&gt;ABC 1&lt;BR /&gt;BCF 1&lt;BR /&gt;ING 1&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'd like to get the following WANT output&lt;/P&gt;&lt;P&gt;/* ---------------------------------------------------------------------------------------------------------------------------- */&lt;BR /&gt;/* WANT FILE (How do I get this want file?) */&lt;BR /&gt;/* ---------------------------------------------------------------------------------------------------------------------------- */&lt;/P&gt;&lt;P&gt;data WANT;&lt;BR /&gt;input reprisk_ID $ value xxx;&lt;/P&gt;&lt;P&gt;format reprisk_ID $5.;&lt;/P&gt;&lt;P&gt;datalines;&lt;BR /&gt;ABC 1000 1&lt;BR /&gt;BCF 5900 1&lt;BR /&gt;BCF 2000 1&lt;BR /&gt;ING 3000 1&lt;BR /&gt;ING 3500 1&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The current SAS syntax I'm using is this, where I get WANT from doing an extra step in getting WANT_1 and then WANT_2 (where WANT_2 is effectively WANT)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;/* ---------------------------------------------------------------------------------------------------------------------------- */&lt;BR /&gt;/* CODE TO MERGE - MAIN &amp;amp; ADDON = WANT */&lt;BR /&gt;/* ---------------------------------------------------------------------------------------------------------------------------- */&lt;/P&gt;&lt;P&gt;&lt;FONT color="#0000FF"&gt;%macro prep_tbl(inds,outview);&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;proc sql;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;create view &amp;amp;outview as&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;select *&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;from &amp;amp;inds&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;/* where xxx =1 */&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;order by reprisk_id &lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;quit;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;%mend;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;%prep_tbl(MAIN,v_1);&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;%prep_tbl(ADDON,v_2);&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#0000FF"&gt;data WANT_1;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;merge v_1 (in=v1) v_2;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;by reprisk_id;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;if v1;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;run;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;/* ---------------------------------------------------------------------------------------------------------------------------- */&lt;/P&gt;&lt;P&gt;&lt;FONT color="#FF0000"&gt;data WANT_2;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;set WANT_1;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;if xxx=. then delete;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF0000"&gt;run;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;U&gt;&lt;STRONG&gt;My question&lt;/STRONG&gt;&lt;/U&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Is there a simpler way to get WANT by not doing the step highlighted in red above, but doing something simpler by manipulating the lines in blue only?&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Reuben&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 05 Oct 2023 11:07:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/MERGE-multiple-files-based-on-a-particular-column-and-RECEIVE-a/m-p/897358#M354595</guid>
      <dc:creator>reubens</dc:creator>
      <dc:date>2023-10-05T11:07:38Z</dc:date>
    </item>
    <item>
      <title>Re: MERGE multiple files (based on a particular column) and RECEIVE a conditional MERGED output file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/MERGE-multiple-files-based-on-a-particular-column-and-RECEIVE-a/m-p/897364#M354600</link>
      <description>&lt;P&gt;if you don't need the result sorted, using a hash object provides you with dataset want in one step:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want; 
   set main;

   if _n_ = 1 then do;
      if 0 then set addon;

      declare hash h(dataset: 'addon');
      h.defineKey('reprisk_id');
      h.defineData('xxx');
      h.defineDone();
   end;

   if h.find() = 0;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;or change the merging step, so that an in-variable is defined for the addon dataset, too:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want_ds;
   merge main(in = inMain) addon(in = inAddon);
   by reprisk_id;

   if inMain and inAddon;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;I was not able to sort the data in the order of your want dataset.&lt;/P&gt;</description>
      <pubDate>Thu, 05 Oct 2023 11:41:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/MERGE-multiple-files-based-on-a-particular-column-and-RECEIVE-a/m-p/897364#M354600</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2023-10-05T11:41:32Z</dc:date>
    </item>
    <item>
      <title>Re: MERGE multiple files (based on a particular column) and RECEIVE a conditional MERGED output file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/MERGE-multiple-files-based-on-a-particular-column-and-RECEIVE-a/m-p/897404#M354627</link>
      <description>&lt;P&gt;You do not say exactly why you are having to use that "code in red".&lt;/P&gt;
&lt;P&gt;If it is because you have some observations in the first set that don't match&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data WANT_1;
merge v_1 (in=v1) v_2&lt;FONT color="#800080"&gt;&lt;STRONG&gt; (in=v2);&lt;/STRONG&gt;&lt;/FONT&gt;
by reprisk_id;
if v1 &lt;FONT color="#800080"&gt;&lt;STRONG&gt;and v2&lt;/STRONG&gt;&lt;/FONT&gt;;
run;&lt;/PRE&gt;
&lt;P&gt;restricts output to results from both data sets.&lt;/P&gt;
&lt;P&gt;Unless you have missing values in your source data which you did not show.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 05 Oct 2023 14:52:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/MERGE-multiple-files-based-on-a-particular-column-and-RECEIVE-a/m-p/897404#M354627</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-10-05T14:52:52Z</dc:date>
    </item>
    <item>
      <title>Re: MERGE multiple files (based on a particular column) and RECEIVE a conditional MERGED output file</title>
      <link>https://communities.sas.com/t5/SAS-Programming/MERGE-multiple-files-based-on-a-particular-column-and-RECEIVE-a/m-p/897433#M354634</link>
      <description>&lt;P&gt;Thanks so much! That's resolves my issue.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Another solution, which just occurred to me is just use a where= statement that let's you choose what you want to collate.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data WANT_1 (where=(xxx=1));&lt;BR /&gt;merge v_1 (in=v1) v_2;&lt;BR /&gt;/* by DSCD year; */&lt;BR /&gt;by reprisk_id;&lt;BR /&gt;if v1;&lt;BR /&gt;run;&lt;/P&gt;</description>
      <pubDate>Thu, 05 Oct 2023 17:17:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/MERGE-multiple-files-based-on-a-particular-column-and-RECEIVE-a/m-p/897433#M354634</guid>
      <dc:creator>reubens</dc:creator>
      <dc:date>2023-10-05T17:17:04Z</dc:date>
    </item>
  </channel>
</rss>

