<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: MERGE statement has more than one data set with repeats of BY values. in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430861#M106515</link>
    <description>&lt;P&gt;Check for which combination(s) of BY keys you have more than one observation in each of the both datasets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can do it by next code:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data dup1;
 set src_temp;
       by BY account_id int_account_id ownership_indicator;
            if not (first.ownership_indicator and last.ownership_indicator);
run;

data dup2;
 set modif_temp;
       by BY account_id int_account_id ownership_indicator;
            if not (first.ownership_indicator and last.ownership_indicator);
run;

proc sql;
        select distinct account_id, int_account_id, ownership_indicator
         from dup1 as a
         inner join dup2 as b
         on 
            a.account_id = b.account_id
            a.int_account_id =b.int_account_id
            a.ownership_indicator = b.ownership_indicator 
; quit;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Thu, 25 Jan 2018 11:38:48 GMT</pubDate>
    <dc:creator>Shmuel</dc:creator>
    <dc:date>2018-01-25T11:38:48Z</dc:date>
    <item>
      <title>MERGE statement has more than one data set with repeats of BY values.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430842#M106507</link>
      <description>&lt;P&gt;LIBNAME TAGETLIB DB2 Datasrc=ABDlive SCHEMA=SASDB&amp;nbsp; AUTHDOMAIN="SASDB_AUTH" ;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data source_rep_data;&lt;BR /&gt;set tagetlib.src_cons_cibil_table_postval;&lt;BR /&gt;/*if datepart(report_date) = '30APR2015'd;*/&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;PROC SORT data=source_rep_data out=src_temp;&lt;BR /&gt;BY account_id int_account_id ownership_indicator;&lt;BR /&gt;RUN;&lt;BR /&gt;PROC SORT data=sascibil.cibil_consumer_name_add out=modif_temp;&lt;BR /&gt;BY account_id int_account_id ownership_indicator;&lt;BR /&gt;RUN;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;DATA whole_dat_after_format1;&lt;BR /&gt;MERGE src_temp (in=a) modif_temp(in=b);&lt;BR /&gt;BY account_id int_account_id ownership_indicator;&lt;BR /&gt;IF a=1 ;&lt;BR /&gt;RUN;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am running above mentioned code and in log it is written below message :-&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;FONT color="#FF6600"&gt;WARNING: Multiple lengths were specified for the BY variable ACCOUNT_ID by input data sets. This may cause unexpected results.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;WARNING: Multiple lengths were specified for the BY variable INT_ACCOUNT_ID by input data sets. This may cause unexpected results.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;NOTE: MERGE statement has more than one data set with repeats of BY values.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;NOTE: There were 1308561 observations read from the data set WORK.SRC_TEMP.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;NOTE: There were 2907358 observations read from the data set WORK.MODIF_TEMP.&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT color="#FF6600"&gt;NOTE: The data set WORK.WHOLE_DAT_AFTER_FORMAT1 has 2907358 observations and 166 variables.&lt;/FONT&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;even though i have written&amp;nbsp; if and in statement why it is showing&amp;nbsp;"&lt;SPAN&gt;NOTE: MERGE statement has more than one data set with repeats of BY values." ?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;and SRC_TEMP has 13 lakh record with above code should return the same but in actual it is returning 29 lakhs record why it is so ?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;it is due to warning ??&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Please help in above code to understand and tell me y it is returning 29 lakhs record not the 13 lakhs records.&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 25 Jan 2018 10:26:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430842#M106507</guid>
      <dc:creator>MG18</dc:creator>
      <dc:date>2018-01-25T10:26:41Z</dc:date>
    </item>
    <item>
      <title>Re: MERGE statement has more than one data set with repeats of BY values.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430844#M106508</link>
      <description>&lt;P&gt;You have a Cartesian join, de dupe your&amp;nbsp;source tables, with nodupkey or a proc summary.&lt;/P&gt;</description>
      <pubDate>Thu, 25 Jan 2018 10:29:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430844#M106508</guid>
      <dc:creator>Rhys</dc:creator>
      <dc:date>2018-01-25T10:29:46Z</dc:date>
    </item>
    <item>
      <title>Re: MERGE statement has more than one data set with repeats of BY values.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430859#M106513</link>
      <description>&lt;P&gt;The fact that the output has the same 2.9 million records that modif_temp has lets me suspect that you have at least one match in src_temp for every observation in modif_temp.&lt;/P&gt;
&lt;P&gt;Very quick example:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data one;
x = 1;
run;

data two;
x = 1;
output;
output;
run;

data result;
merge
  one (in=a)
  two (in=b)
;
by x;
if a;
run;

proc print data=result noobs;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Log of the merge:&lt;/P&gt;
&lt;PRE&gt;34         data result;
35         merge
36           one (in=a)
37           two (in=b)
38         ;
39         by x;
40         if a;
41         run;

NOTE: There were 1 observations read from the data set WORK.ONE.
NOTE: There were 2 observations read from the data set WORK.TWO.
NOTE: The data set WORK.RESULT has 2 observations and 1 variables.
&lt;/PRE&gt;</description>
      <pubDate>Thu, 25 Jan 2018 11:30:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430859#M106513</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-01-25T11:30:07Z</dc:date>
    </item>
    <item>
      <title>Re: MERGE statement has more than one data set with repeats of BY values.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430860#M106514</link>
      <description>&lt;P&gt;And you need to clean up your data, so that these messages vanish:&lt;/P&gt;
&lt;PRE&gt;WARNING: Multiple lengths were specified for the BY variable ACCOUNT_ID by input data sets. This may cause unexpected results.
WARNING: Multiple lengths were specified for the BY variable INT_ACCOUNT_ID by input data sets. This may cause unexpected results.
NOTE: MERGE statement has more than one data set with repeats of BY values.
&lt;/PRE&gt;
&lt;P&gt;All these indicate conditions that can (and usually will) make the merge return incorrect results.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maxim 3: Know your data.&lt;/P&gt;</description>
      <pubDate>Thu, 25 Jan 2018 11:33:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430860#M106514</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2018-01-25T11:33:02Z</dc:date>
    </item>
    <item>
      <title>Re: MERGE statement has more than one data set with repeats of BY values.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430861#M106515</link>
      <description>&lt;P&gt;Check for which combination(s) of BY keys you have more than one observation in each of the both datasets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can do it by next code:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data dup1;
 set src_temp;
       by BY account_id int_account_id ownership_indicator;
            if not (first.ownership_indicator and last.ownership_indicator);
run;

data dup2;
 set modif_temp;
       by BY account_id int_account_id ownership_indicator;
            if not (first.ownership_indicator and last.ownership_indicator);
run;

proc sql;
        select distinct account_id, int_account_id, ownership_indicator
         from dup1 as a
         inner join dup2 as b
         on 
            a.account_id = b.account_id
            a.int_account_id =b.int_account_id
            a.ownership_indicator = b.ownership_indicator 
; quit;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 25 Jan 2018 11:38:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430861#M106515</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2018-01-25T11:38:48Z</dc:date>
    </item>
    <item>
      <title>Re: MERGE statement has more than one data set with repeats of BY values.</title>
      <link>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430879#M106523</link>
      <description>&lt;P&gt;The WARNING can cause the NOTE.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The WARNING indicates that one (or more) of your BY variables is being truncated when executing the DATA step.&amp;nbsp; That can cause the data to appear like there are different match patterns occurring.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The solution is easy.&amp;nbsp; Find the larger length for each variable, and insert a LENGTH statement before the MERGE statement, assigning that maximum length.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If there are other problems with the data, they will need to be fixed.&amp;nbsp; But it is also possible that assigning the proper LENGTH will clear up everything.&lt;/P&gt;</description>
      <pubDate>Thu, 25 Jan 2018 12:48:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/MERGE-statement-has-more-than-one-data-set-with-repeats-of-BY/m-p/430879#M106523</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2018-01-25T12:48:35Z</dc:date>
    </item>
  </channel>
</rss>

