<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: question about merge in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636539#M189125</link>
    <description>&lt;P&gt;Also do this for a start:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort
  data=A
  out=_A
  nodupkey
;
by ID form testdate;
run;

proc sort
  data=B
  out=_b
  nodupkey
;
by ID form testdate;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;and inspect the log to see if duplicate values were deleted.&lt;/P&gt;</description>
    <pubDate>Wed, 01 Apr 2020 13:34:25 GMT</pubDate>
    <dc:creator>Kurt_Bremser</dc:creator>
    <dc:date>2020-04-01T13:34:25Z</dc:date>
    <item>
      <title>question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636519#M189116</link>
      <description>&lt;P&gt;I have two datasets, A and B. A has 15353 cases and B has 15234 cases. I used the following code to merge these two datasets:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=A; by ID form testdate; run;
proc sort data=B; by ID form testdate; run;

data ck;
merge A(in=in1) B(in=in2);
by ID form testdate;
if in1=1;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In dataset A, I created a variable inA with the value "Y" to flag all cases in dataset A. In dataset B, I created a variable inB with the value "Y" to flag all cases in dataset B. Theoretically, A is supposed to contain all cases of B, so in the merged dataset ck the variable inB is either "Y" or empty. Then I want to get the cases that are in A but not in B. I used the following code:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data ck1;
set ck;
if inB="";
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It turned out that ck1 has 137 cases. But the difference in the number of cases between A and B should be 15353-15234=119. Where did I go wrong? Please help.&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 12:57:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636519#M189116</guid>
      <dc:creator>superbug</dc:creator>
      <dc:date>2020-04-01T12:57:22Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636532#M189121</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/304931"&gt;@superbug&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Theoretically A is supposed to have all cases of B&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Invalid assumption here. Clearly not correct, as your results show.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 13:26:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636532#M189121</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-04-01T13:26:38Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636538#M189124</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/10892"&gt;@PaigeMiller&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Dataset A is the final official data, which contains all cases. Dataset B contains fewer cases. I want to find the cases that are in A but not in B. Any suggestions on how I can get what I want? Thanks!&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 13:33:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636538#M189124</guid>
      <dc:creator>superbug</dc:creator>
      <dc:date>2020-04-01T13:33:55Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636539#M189125</link>
      <description>&lt;P&gt;Also do this for a start:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort
  data=A
  out=_A
  nodupkey
;
by ID form testdate;
run;

proc sort
  data=B
  out=_b
  nodupkey
;
by ID form testdate;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;and inspect the log to see if duplicate values were deleted.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 13:34:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636539#M189125</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-04-01T13:34:25Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636546#M189127</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/304931"&gt;@superbug&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Results like this are typical for a situation involving &lt;EM&gt;duplicate&lt;/EM&gt;&amp;nbsp;keys (i.e., combinations of BY variables). For example, suppose that B contains more than one observation with the same ID-form-testdate combination, but A contains this (and any other) combination only once. Then the difference "nobs(A) minus nobs(B)" (=119) will underestimate the number of combinations in A which do not occur in B.&lt;/P&gt;
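&lt;P&gt;A minimal sketch of how such duplicate keys could be listed, assuming the dataset and variable names from the question. The output dataset dup_b and the column n_obs are made-up names for illustration only:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* dup_b and n_obs are hypothetical names, not from the original post */
proc sql;
  create table dup_b as
  select ID, form, testdate, count(*) as n_obs
  from B
  group by ID, form, testdate
  having count(*) gt 1; /* keep only keys that occur more than once */
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;An empty dup_b would mean that every ID-form-testdate combination occurs only once in B. The same query can be pointed at A.&lt;/P&gt;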
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Of course, if the assumption "B is a subset of A" (in terms of BY variable combinations) turned out to be wrong, the result could occur even with unique keys on both sides.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 13:39:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636546#M189127</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2020-04-01T13:39:07Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636550#M189128</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562"&gt;@Kurt_Bremser&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;
&lt;P&gt;I used the code you suggested. The log says "0 observations with duplicate key values were deleted",&lt;/P&gt;
&lt;P&gt;which conforms to my understanding of the data. In dataset A, no case should have any duplicates by ID form testdate.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 13:45:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636550#M189128</guid>
      <dc:creator>superbug</dc:creator>
      <dc:date>2020-04-01T13:45:42Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636553#M189129</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/304931"&gt;@superbug&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562"&gt;@Kurt_Bremser&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;
&lt;P&gt;I used the code you suggested. The log says "0 observations with duplicate key values were deleted",&lt;/P&gt;
&lt;P&gt;which conforms to my understanding of the data. In dataset A, no case should have any duplicates by ID form testdate.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;You have not ruled out the other problem, that A does not contain all IDs in B.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 13:48:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636553#M189129</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2020-04-01T13:48:10Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636559#M189130</link>
      <description>&lt;P&gt;Then you have ID-form-testdate combinations in A that are not present in B. And possibly the other way round.&lt;/P&gt;
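&lt;P&gt;A rough sketch of one way to see both sides at once, assuming A and B are already sorted by the BY variables. The dataset names only_a, only_b and both and the flags in_a and in_b are made up for illustration:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Assumes A and B are already sorted by ID form testdate. */
/* only_a, only_b and both are hypothetical output names.  */
data only_a only_b both;
  merge A(in=in_a) B(in=in_b);
  by ID form testdate;
  if in_a and in_b then output both; /* key found in A and in B */
  else if in_a then output only_a;   /* key found in A only */
  else output only_b;                /* key found in B only */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The log then shows how many observations went into each of the three datasets.&lt;/P&gt;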
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maxim 3: Know Your Data&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 14:06:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636559#M189130</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2020-04-01T14:06:41Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636563#M189131</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/32733"&gt;@FreelanceReinh&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you! You are right.&lt;/P&gt;
&lt;P&gt;In dataset A, the same person (ID) took the test on different dates with different forms, so the combinations of ID-form-testdate are unique. The same holds for dataset B, i.e., the combinations of ID-form-testdate are unique.&lt;/P&gt;
&lt;P&gt;So what's the best way to look in detail at what I got (difference=137 versus "nobs(A) minus nobs(B)" =119)?&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 14:03:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636563#M189131</guid>
      <dc:creator>superbug</dc:creator>
      <dc:date>2020-04-01T14:03:21Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636565#M189132</link>
      <description>&lt;P&gt;If the keys are unique in both A and B, then there must be key values in B which do not occur in A:&lt;/P&gt;
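&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* keep only the keys that occur in B but not in A */
data myst;
merge a(in=a) b;
by ID form testdate;
if ~a; /* a=0: this observation has no match in A */
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>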
      <pubDate>Wed, 01 Apr 2020 14:09:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636565#M189132</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2020-04-01T14:09:31Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636571#M189134</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/32733"&gt;@FreelanceReinh&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Using the code you provided, I got 18 cases that are in B but not in A.&lt;/P&gt;
&lt;P&gt;So 119+18=137, which matches what I got: only 15234-18=15216 cases of B also occur in A, and 15353-15216=137.&lt;/P&gt;
&lt;P&gt;Very much appreciate your help!&lt;/P&gt;
&lt;P&gt;Thank you SO SO MUCH!&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 14:19:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636571#M189134</guid>
      <dc:creator>superbug</dc:creator>
      <dc:date>2020-04-01T14:19:53Z</dc:date>
    </item>
    <item>
      <title>Re: question about merge</title>
      <link>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636672#M189173</link>
      <description>&lt;P&gt;Please choose the most helpful answer as the solution. Not your reply.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Apr 2020 20:56:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/question-about-merge/m-p/636672#M189173</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-04-01T20:56:43Z</dc:date>
    </item>
  </channel>
</rss>

