<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: merging problem: ..Repeats of by values(please ignore 1st &amp; 2nd posts) in SAS Studio</title>
    <link>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625634#M8814</link>
    <description>&lt;P&gt;Your post is a little too long to follow clearly.&lt;/P&gt;
&lt;P&gt;The note you are referencing means that in more than one of the datasets you are merging the BY variables (ID in your case) do NOT uniquely identify the observations.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So how is your data structured?&amp;nbsp;&amp;nbsp;Did you expect all four of those datasets to have only one observation for each ID? Did you expect three of them to have that condition?&amp;nbsp; Only in those cases does merging by ID make sense.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If not, then what do you want to do when there are 2 observations for ID=1 in one of the datasets and 3 observations for ID=1 in another?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Is it possible you want to first merge some of the datasets by some combination of variables that will uniquely identify the observations?&lt;/P&gt;</description>
    <pubDate>Tue, 18 Feb 2020 16:40:33 GMT</pubDate>
    <dc:creator>Tom</dc:creator>
    <dc:date>2020-02-18T16:40:33Z</dc:date>
    <item>
      <title>merging problem: ..Repeats of by values(please ignore 1st &amp; 2nd posts)</title>
      <link>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625577#M8812</link>
      <description>&lt;DIV&gt;
&lt;DIV class="sasSource"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;
&lt;PRE&gt;&lt;BR /&gt;Hi,&lt;BR /&gt;There were errors in my previous 2 posts. Please ignore them and consider this one instead.&lt;BR /&gt;Could you please help me resolve this issue? I merged 4 datasets by id (a common variable)&lt;BR /&gt;and got the message "Merge statement has.....repeats of by values". Any help with the SAS code to avoid this&lt;BR /&gt;message? My datasets are huge; the ones below are just subsets of the larger ones.&lt;BR /&gt;My ultimate aim is to count the number of Ca case, Ca cont and Pop cont subjects for each status (S and NS)&lt;BR /&gt;in the merged dataset, Table 1 final (attached pdf).&lt;BR /&gt;Thanks in advance for your expert assistance.&lt;BR /&gt;ak.&lt;BR /&gt;&lt;BR /&gt;/*Pollutants*/&lt;BR /&gt;data d1;&lt;BR /&gt;input id$ 1-5 job 7 id_job$ 9-15 hcl_exp 17 amo_exp 19 bio_exp 21 cla_exp 23;&lt;BR /&gt;datalines;&lt;BR /&gt;OSa03 4 OSa03_4 1 0 0 0&lt;BR /&gt;OSa06 3	OSa06_3 0 1 0 0 &lt;BR /&gt;OSa13 1	OSa13_1 0 1 1 0&lt;BR /&gt;OSa13 3	OSa13_3 0 1 1 1&lt;BR /&gt;OSa29 2	OSa29_2 0 0 0 1&lt;BR /&gt;OSa29 4	OSa29_4 0 1 1 0&lt;BR /&gt;OSa30 4	OSa30_4 0 0 1 0&lt;BR /&gt;OSa30 1	OSa30_1 1 0 0 0&lt;BR /&gt;OSa30 2	OSa30_2 0 1 1 1&lt;BR /&gt;OSa54 3	OSa54_3 0 1 0 0&lt;BR /&gt;OSa64 3	OSa64_3 0 1 0 0&lt;BR /&gt;OSa73 3	OSa73_3 0 0 0 1&lt;BR /&gt;OSa74 3	OSa74_3 1 0 0 0&lt;BR /&gt;OSa78 3	OSa78_3 0 1 0 0&lt;BR /&gt;;&lt;BR /&gt;proc sort data=d1; by id; run;&lt;BR /&gt;&lt;BR /&gt;/*  Cancer subjects*/&lt;BR /&gt;data d2;&lt;BR /&gt;input id$ 1-5 lung$ 7-15;&lt;BR /&gt;datalines;&lt;BR /&gt;OSa01 Pop cont&lt;BR /&gt;OSa06 Ca cont&lt;BR /&gt;OSa11 Pop cont&lt;BR /&gt;OSa13 Ca case&lt;BR /&gt;OSa29 Ca cont&lt;BR /&gt;OSa30 Ca case&lt;BR /&gt;OSa31 Ca cont&lt;BR /&gt;OSa54 Pop cont&lt;BR /&gt;OSa73 Pop cont&lt;BR /&gt;;&lt;BR /&gt;proc sort data=d2; by id; run;&lt;BR /&gt;/* Exposure level*/&lt;BR /&gt;data d3;&lt;BR /&gt;input id$ 1-5 job 7 idchem 9-15 level 16;&lt;BR /&gt;datalines;&lt;BR /&gt;OSa03 4 211701 3&lt;BR /&gt;OSa06 3	210701 3&lt;BR 
/&gt;OSa13 1	210701 3&lt;BR /&gt;OSa13 1	990021 3&lt;BR /&gt;OSa13 3	210701 3&lt;BR /&gt;OSa13 3	990005 3&lt;BR /&gt;OSa13 3	990021 2&lt;BR /&gt;OSa29 2	990005 3&lt;BR /&gt;OSa29 4	210701 3&lt;BR /&gt;OSa30 1 990021 3&lt;BR /&gt;OSa30 2	211701 3&lt;BR /&gt;OSa30 3	210701 3&lt;BR /&gt;OSa30 3	990005 3&lt;BR /&gt;OSa30 3	990021 3&lt;BR /&gt;OSa54 3	990005 3&lt;BR /&gt;OSa64 3	210701 2&lt;BR /&gt;OSa74 1 211701 3&lt;BR /&gt;OSa78 4	210701 3&lt;BR /&gt;OSa78 4	990005 3&lt;BR /&gt;OSa78 4	990021 3&lt;BR /&gt;;&lt;BR /&gt;proc sort data=d3; by id; run;&lt;BR /&gt;&lt;BR /&gt;/* Exposure Duration*/&lt;BR /&gt;data d4;&lt;BR /&gt;input id$ 1-5 idchem 7-12 status$ 14-15 duration 16-18;&lt;BR /&gt;datalines;&lt;BR /&gt;OSa03 211701 S 6&lt;BR /&gt;OSa06 210701 S 9&lt;BR /&gt;OSa13 210701 S 37&lt;BR /&gt;OSa13 990005 S 5&lt;BR /&gt;OSa13 990021 S 37&lt;BR /&gt;OSa29 210701 NS 12&lt;BR /&gt;OSa29 990005 S 2&lt;BR /&gt;OSa30 210701 S 8&lt;BR /&gt;OSa30 211701 NS 8&lt;BR /&gt;OSa30 990005 S  8&lt;BR /&gt;OSa30 990021 S 15&lt;BR /&gt;OSa54 210701 NS 14&lt;BR /&gt;OSa64 210701 S 15&lt;BR /&gt;OSa74 211701 NS 21&lt;BR /&gt;OSa78 210701 NS 20&lt;BR /&gt;OSa78 990005 S 20&lt;BR /&gt;OSa78 990021 S 20&lt;BR /&gt;OSa86 990005 S 14&lt;BR /&gt;OSa93 210701 S 4&lt;BR /&gt;OSa93 990005 S 13&lt;BR /&gt;;&lt;BR /&gt;&lt;BR /&gt;proc sort data=d4; by id; run;&lt;BR /&gt;&lt;BR /&gt;/* Merging d1,d2,d3 and d4*/&lt;BR /&gt;data mg4; &lt;BR /&gt;merge d1 d2 d3 d4; by id;&lt;BR /&gt;run;&lt;BR /&gt;&lt;BR /&gt;proc print data=mg4;&lt;BR /&gt;title "Table 1 final. Merged datasets(d1,d2,d3,d4)"; run;&lt;/PRE&gt;
&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;72&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;73 /*Pollutants*/&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV class="sasSource"&gt;74 data d1;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;75 input id$ 1-5 job 7 id_job$ 9-15 hcl_exp 17 amo_exp 19 bio_exp 21 cla_exp 23;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;76 datalines;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id="sasLogNote1_1582023372372" class="sasNote"&gt;NOTE: The data set WORK.D1 has 14 observations and 7 variables.&lt;/DIV&gt;
&lt;DIV id="sasLogNote2_1582023372372" class="sasNote"&gt;NOTE: DATA statement used (Total process time):&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;real time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;cpu time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;91 ;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;92 proc sort data=d1; by id; run;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id="sasLogNote3_1582023372372" class="sasNote"&gt;NOTE: There were 14 observations read from the data set WORK.D1.&lt;/DIV&gt;
&lt;DIV id="sasLogNote4_1582023372372" class="sasNote"&gt;NOTE: The data set WORK.D1 has 14 observations and 7 variables.&lt;/DIV&gt;
&lt;DIV id="sasLogNote5_1582023372372" class="sasNote"&gt;NOTE: PROCEDURE SORT used (Total process time):&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;real time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;cpu time 0.02 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;93&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;94 /* Cancer subjects*/&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;95 data d2;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;96 input id$ 1-5 lung$ 7-15;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;97 datalines;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id="sasLogNote6_1582023372372" class="sasNote"&gt;NOTE: The data set WORK.D2 has 9 observations and 2 variables.&lt;/DIV&gt;
&lt;DIV id="sasLogNote7_1582023372372" class="sasNote"&gt;NOTE: DATA statement used (Total process time):&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;real time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;cpu time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;107 ;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;108 proc sort data=d2; by id; run;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id="sasLogNote8_1582023372372" class="sasNote"&gt;NOTE: There were 9 observations read from the data set WORK.D2.&lt;/DIV&gt;
&lt;DIV id="sasLogNote9_1582023372372" class="sasNote"&gt;NOTE: The data set WORK.D2 has 9 observations and 2 variables.&lt;/DIV&gt;
&lt;DIV id="sasLogNote10_1582023372372" class="sasNote"&gt;NOTE: PROCEDURE SORT used (Total process time):&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;real time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;cpu time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;109 /* Exposure level*/&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;110 data d3;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;111 input id$ 1-5 job 7 idchem 9-15 level 16;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;112 datalines;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id="sasLogNote11_1582023372372" class="sasNote"&gt;NOTE: The data set WORK.D3 has 20 observations and 4 variables.&lt;/DIV&gt;
&lt;DIV id="sasLogNote12_1582023372372" class="sasNote"&gt;NOTE: DATA statement used (Total process time):&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;real time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;cpu time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;133 ;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;134 proc sort data=d3; by id; run;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id="sasLogNote13_1582023372372" class="sasNote"&gt;NOTE: There were 20 observations read from the data set WORK.D3.&lt;/DIV&gt;
&lt;DIV id="sasLogNote14_1582023372372" class="sasNote"&gt;NOTE: The data set WORK.D3 has 20 observations and 4 variables.&lt;/DIV&gt;
&lt;DIV id="sasLogNote15_1582023372372" class="sasNote"&gt;NOTE: PROCEDURE SORT used (Total process time):&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;real time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;cpu time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;135&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;136 /* Exposure Duration*/&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;137 data d4;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;138 input id$ 1-5 idchem 7-12 status$ 14-15 duration 16-18;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;139 datalines;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id="sasLogNote16_1582023372372" class="sasNote"&gt;NOTE: The data set WORK.D4 has 20 observations and 4 variables.&lt;/DIV&gt;
&lt;DIV id="sasLogNote17_1582023372372" class="sasNote"&gt;NOTE: DATA statement used (Total process time):&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;real time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;cpu time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;160 ;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;161&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;162 proc sort data=d4; by id; run;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id="sasLogNote18_1582023372372" class="sasNote"&gt;NOTE: There were 20 observations read from the data set WORK.D4.&lt;/DIV&gt;
&lt;DIV id="sasLogNote19_1582023372372" class="sasNote"&gt;NOTE: The data set WORK.D4 has 20 observations and 4 variables.&lt;/DIV&gt;
&lt;DIV id="sasLogNote20_1582023372372" class="sasNote"&gt;NOTE: PROCEDURE SORT used (Total process time):&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;real time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;cpu time 0.01 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;163&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;164 /* Merging d1,d2,d3 and d4*/&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;165 data mg4;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;166 merge d1 d2 d3 d4; by id;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;167 run;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV id="sasLogNote21_1582023372372" class="sasNote"&gt;NOTE: MERGE statement has more than one data set with repeats of BY values.&lt;/DIV&gt;
&lt;DIV id="sasLogNote22_1582023372372" class="sasNote"&gt;NOTE: There were 14 observations read from the data set WORK.D1.&lt;/DIV&gt;
&lt;DIV id="sasLogNote23_1582023372372" class="sasNote"&gt;NOTE: There were 9 observations read from the data set WORK.D2.&lt;/DIV&gt;
&lt;DIV id="sasLogNote24_1582023372372" class="sasNote"&gt;NOTE: There were 20 observations read from the data set WORK.D3.&lt;/DIV&gt;
&lt;DIV id="sasLogNote25_1582023372372" class="sasNote"&gt;NOTE: There were 20 observations read from the data set WORK.D4.&lt;/DIV&gt;
&lt;DIV id="sasLogNote26_1582023372372" class="sasNote"&gt;NOTE: The data set WORK.MG4 has 27 observations and 12 variables.&lt;/DIV&gt;
&lt;DIV id="sasLogNote27_1582023372372" class="sasNote"&gt;NOTE: DATA statement used (Total process time):&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;real time 0.04 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;cpu time 0.03 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;168&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;169 proc print data=mg4;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;170 title "Table 1 final. Merged datasets(d1,d2,d3,d4)"; run;&lt;/DIV&gt;
&lt;DIV id="sasLogNote28_1582023372372" class="sasNote"&gt;NOTE: There were 27 observations read from the data set WORK.MG4.&lt;/DIV&gt;
&lt;DIV id="sasLogNote29_1582023372372" class="sasNote"&gt;NOTE: PROCEDURE PRINT used (Total process time):&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;real time 0.51 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;cpu time 0.51 seconds&lt;/DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;
&lt;DIV class="sasNote"&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;171&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;172 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;&lt;/DIV&gt;
&lt;DIV class="sasSource"&gt;184&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 18 Feb 2020 11:09:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625577#M8812</guid>
      <dc:creator>ak2011</dc:creator>
      <dc:date>2020-02-18T11:09:12Z</dc:date>
    </item>
    <item>
      <title>Re: merging problem: ..Repeats of by values(please ignore 1st &amp; 2nd posts)</title>
      <link>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625632#M8813</link>
      <description>&lt;P&gt;You are merging 2 or more datasets via the MERGE statement in combination with the BY statement. Consider the case of merging only 2 datasets, A and B, by variable ID. If, for ID=1, dataset A has 2 obs and dataset B has 4 obs, the result will have 4 obs. Both datasets have repeats of the BY variable. Or it could be that for ID=1 A has 1 obs and B has 2, while for ID=2 it's the opposite (A has 2 obs and B has 1). These cases will generate the note you report. There is nothing wrong with your program. It's just that many times a user expects that only one dataset (or maybe no dataset) has repeats of ID.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;BUT ...&amp;nbsp; whichever dataset has the shorter BY-group will have its final obs duplicated for all the subsequent matches with the dataset having the longer BY-group.&amp;nbsp; So the question is: what do you want to do about these situations? Do you want to propagate observations from the shorter group, or not?&amp;nbsp; There is a simple way to eliminate that behavior if needed.&amp;nbsp;&amp;nbsp; I.e. if for a given ID N(a)=2 and N(b)=4 you could have 4 obs, but the 3rd and 4th obs could have all missing values for the variables from A, which might be your preference.&amp;nbsp;&amp;nbsp; The issue is probably how you want to count status S and&amp;nbsp; NS.&lt;/P&gt;
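&lt;P&gt;A minimal sketch of that behavior and of one way to suppress it. The datasets A and B and the variables X and Y are made up for illustration; they are not from the posted data. The first merge lets the last A observation carry forward; the second clears the non-BY variables after each OUTPUT so that they stay missing once A runs out of observations for that ID.&lt;/P&gt;
&lt;PRE&gt;
/* Hypothetical example data: ID=1 has 2 obs in A and 4 obs in B */
data a;
  input id x;
  datalines;
1 10
1 20
;
data b;
  input id y;
  datalines;
1 1
1 2
1 3
1 4
;

/* Plain merge: the last obs of A is carried onto the 3rd and 4th rows */
data propagated;
  merge a b;
  by id;
run;

/* Variant: write the row, then reset the non-BY variables.        */
/* Variables read by MERGE are not reset within a BY group, so x   */
/* stays missing once A has no more observations for this ID.      */
data not_propagated;
  merge a b;
  by id;
  output;
  call missing(x, y);
run;
&lt;/PRE&gt;
&lt;P&gt;With this sample data, PROPAGATED should show x=20 on rows 2 through 4, while NOT_PROPAGATED should show a missing x on rows 3 and 4. Which of the two you want depends on how you intend to count S and NS.&lt;/P&gt;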
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 18 Feb 2020 16:33:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625632#M8813</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-02-18T16:33:00Z</dc:date>
    </item>
    <item>
      <title>Re: merging problem: ..Repeats of by values(please ignore 1st &amp; 2nd posts)</title>
      <link>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625634#M8814</link>
      <description>&lt;P&gt;Your post is a little too long to follow clearly.&lt;/P&gt;
&lt;P&gt;The note you are referencing means that in more than one of the datasets you are merging the BY variables (ID in your case) do NOT uniquely identify the observations.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So how is your data structured?&amp;nbsp;&amp;nbsp;Did you expect all four of those datasets to have only one observation for each ID? Did you expect three of them to have that condition?&amp;nbsp; Only in those cases does merging by ID make sense.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If not, then what do you want to do when there are 2 observations for ID=1 in one of the datasets and 3 observations for ID=1 in another?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
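&lt;P&gt;One option, if some set of variables identifies observations more precisely than ID alone, is to merge on that set. A minimal sketch using D3 and D4 from the question, assuming (only a guess from the sample rows) that ID and IDCHEM together are unique in at least one of the two; D34 is a made-up name.&lt;/P&gt;
&lt;PRE&gt;
/* Sort both datasets by the candidate composite key */
proc sort data=d3; by id idchem; run;
proc sort data=d4; by id idchem; run;

/* Merge on the composite key instead of ID alone */
data d34;
  merge d3 d4;
  by id idchem;
run;
&lt;/PRE&gt;
&lt;P&gt;If the note still appears after this step, both datasets still have repeats of that key and a different combination is needed.&lt;/P&gt;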
&lt;P&gt;Is it possible you want to first merge some of the datasets by some combination of variables that will uniquely identify the observations?&lt;/P&gt;</description>
      <pubDate>Tue, 18 Feb 2020 16:40:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625634#M8814</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-02-18T16:40:33Z</dc:date>
    </item>
    <item>
      <title>Re: merging problem: ..Repeats of by values(please ignore 1st &amp; 2nd posts)</title>
      <link>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625755#M8816</link>
      <description>Thank you. Yes, I want all four datasets to have only one observation for each id; that is why I merged by id. I thought there might be another approach to handle the situation and avoid the "merge statement.....repeats of by values" note. If you have a clue, please let me know.&lt;BR /&gt;ak</description>
      <pubDate>Wed, 19 Feb 2020 02:12:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625755#M8816</guid>
      <dc:creator>ak2011</dc:creator>
      <dc:date>2020-02-19T02:12:23Z</dc:date>
    </item>
    <item>
      <title>Re: merging problem: ..Repeats of by values(please ignore 1st &amp; 2nd posts)</title>
      <link>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625766#M8817</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/190754"&gt;@ak2011&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Thank you. Yes, I want all four datasets to have only one observation for each id; that is why I merged by id. I thought there might be another approach to handle the situation and avoid the "merge statement.....repeats of by values" note. If you have a clue, please let me know.&lt;BR /&gt;ak&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;If you want only one obs per id in each source dataset, then you have to decide on a rule to determine which observation to keep available for the subsequent merge. &amp;nbsp;If dataset D1 has two records with ID=1, how do you know which record you want?&amp;nbsp; Unless those records are complete duplicates of each other, your choice may depend on the values of the non-ID variables. &amp;nbsp; Or it may be that the variables of interest to your task do not change within any repeated ID, in which case you can just keep, say, the first record for each ID.&amp;nbsp; Regardless, once you've implemented such a rule, you will then be able to do the merge without getting that note.&lt;/P&gt;
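&lt;P&gt;For example, if the rule is simply "keep the first record for each ID", a sketch for D1 could look like the following. D1_FIRST is a made-up name, and whether the first record is the right one to keep is an assumption you would have to check against your data.&lt;/P&gt;
&lt;PRE&gt;
proc sort data=d1; by id; run;

/* Keep only the first observation of each ID */
data d1_first;
  set d1;
  by id;
  if first.id;
run;
&lt;/PRE&gt;
&lt;P&gt;The same pattern would be applied to each dataset that has repeated IDs before running the MERGE.&lt;/P&gt;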
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;BTW, your log note doesn't tell you how many datasets have repeats of BY values.&amp;nbsp; Do you know which datasets have this condition?&lt;/P&gt;
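&lt;P&gt;One way to find out, sketched here for D1 (the column name N_OBS is just an illustrative label):&lt;/P&gt;
&lt;PRE&gt;
/* List the IDs that occur more than once in D1; repeat for D2, D3 and D4 */
proc sql;
  select id, count(*) as n_obs
    from d1
    group by id
    having count(*) gt 1;
quit;
&lt;/PRE&gt;
&lt;P&gt;Any dataset for which this query returns rows has repeats of the BY value.&lt;/P&gt;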
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 19 Feb 2020 02:46:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Studio/merging-problem-Repeats-of-by-values-please-ignore-1st-amp-2nd/m-p/625766#M8817</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-02-19T02:46:09Z</dc:date>
    </item>
  </channel>
</rss>

