<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Proc Sort data = nodupkey is deleting both records in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800960#M315177</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Thanks for all the quick responses.&amp;nbsp; I figured out my problem.&amp;nbsp; I was checking the result based on a member's last name which she had changed during the time span of the observations.&lt;/P&gt;&lt;P&gt;It's nice to have some new ideas on how to check things.&lt;/P&gt;&lt;P&gt;I appreciate the suggestions.&lt;/P&gt;&lt;P&gt;Diane&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 08 Mar 2022 23:21:32 GMT</pubDate>
    <dc:creator>abqdiane</dc:creator>
    <dc:date>2022-03-08T23:21:32Z</dc:date>
    <item>
      <title>Proc Sort data = nodupkey is deleting both records</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800955#M315172</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have the following code&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;proc sort data= dataset1 out=dataset2 NODUPKEY dupout=duprecords;&lt;BR /&gt;by DOB Gender admitdate memberid Facility_2 ;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There are other fields but I want to delete duplicates only based on those.&amp;nbsp;&lt;/P&gt;&lt;P&gt;dataset1 has two identical records; instead of leaving one of the records in dataset2 and putting the other in duprecords, both records are in the duprecords dataset.&amp;nbsp; I can't figure out why.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm&amp;nbsp; using SAS Enterprise Guide 8.3&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Diane&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 08 Mar 2022 22:42:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800955#M315172</guid>
      <dc:creator>abqdiane</dc:creator>
      <dc:date>2022-03-08T22:42:05Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Sort data = nodupkey is deleting both records</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800956#M315173</link>
      <description>&lt;P&gt;I doubt that it isn't working as designed.&amp;nbsp; Perhaps you are not clear on the design?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You should have unique keys in DATASET2.&amp;nbsp; In DUPRECORDS you will have the observations that did not make it into DATASET2.&amp;nbsp; So if you started with 100 observations you will have 100 observations split between the two datasets.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
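&lt;P&gt;As a minimal sketch of that split (the dataset DEMO, its values, and the shortened key of MEMBERID and ADMITDATE are all made up for illustration), three observations sharing one key should send the first to the OUT= dataset and the other two to the DUPOUT= dataset:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical test data: the first three rows share the same key values */
data demo;
  input memberid admitdate :date9. lastname $;
  format admitdate date9.;
datalines;
101 01JAN2022 Smith
101 01JAN2022 Smith
101 01JAN2022 Jones
102 05JAN2022 Brown
;

proc sort data=demo out=demo_unique nodupkey dupout=demo_dups;
  by memberid admitdate;
run;

/* Expect 2 observations in DEMO_UNIQUE and 2 in DEMO_DUPS, 4 in total */
proc print data=demo_unique; run;
proc print data=demo_dups; run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Note that LASTNAME is not part of the key, so the Jones row counts as a duplicate even though it is not identical to the Smith rows.&lt;/P&gt;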
&lt;P&gt;Try replicating the split yourself using a separate sort and data step:&lt;/P&gt;
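&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Step 1: sort by the full key, keeping every observation */
proc sort data= dataset1 out=dataset1_sorted;
  by DOB Gender admitdate memberid Facility_2 ;
run;

/* Step 2: the first observation of each key goes to dataset2, the rest to duprecords */
data dataset2 duprecords;
  set dataset1_sorted ;
  by DOB Gender admitdate memberid Facility_2 ;
  /* first.facility_2 is 1 at the start of each distinct combination of all five BY variables */
  if first.facility_2 then output dataset2;
  else output duprecords;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>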
      <pubDate>Tue, 08 Mar 2022 23:06:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800956#M315173</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-03-08T23:06:00Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Sort data = nodupkey is deleting both records</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800958#M315175</link>
      <description>&lt;P&gt;You'll have to overcome decades of PROC SORT use for me to accept that NODUPKEY deletes ALL obs with duplicate BY values.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Please show the log of your PROC SORT, and any key values (&lt;SPAN&gt;DOB Gender admitdate memberid Facility_2)&amp;nbsp;&lt;/SPAN&gt;in dataset1 that do not appear in dataset2.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 08 Mar 2022 23:15:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800958#M315175</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2022-03-08T23:15:26Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Sort data = nodupkey is deleting both records</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800959#M315176</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/206302"&gt;@abqdiane&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/206302"&gt;@abqdiane&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;dataset1 has two identical records; instead of leaving one of the records in dataset2 and putting the other in duprecords, both records are in the duprecords dataset.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;This is the expected behavior of PROC SORT, &lt;EM&gt;if&lt;/EM&gt; further up in dataset1 there's a &lt;EM&gt;third&lt;/EM&gt; record with the same values of the BY variables.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Mar 2022 23:17:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800959#M315176</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2022-03-08T23:17:08Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Sort data = nodupkey is deleting both records</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800960#M315177</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Thanks for all the quick responses.&amp;nbsp; I figured out my problem.&amp;nbsp; I was checking the result based on a member's last name which she had changed during the time span of the observations.&lt;/P&gt;&lt;P&gt;It's nice to have some new ideas on how to check things.&lt;/P&gt;&lt;P&gt;I appreciate the suggestions.&lt;/P&gt;&lt;P&gt;Diane&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 08 Mar 2022 23:21:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800960#M315177</guid>
      <dc:creator>abqdiane</dc:creator>
      <dc:date>2022-03-08T23:21:32Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Sort data = nodupkey is deleting both records</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800961#M315178</link>
      <description>&lt;P&gt;You say that you have two duplicate RECORDS, but since you are using NODUPKEY, does another record have the same values for the BY variables but a different value for one or more of the variables not listed? The first record encountered with a given set of BY values is usually the one kept.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Sort your data by those variables without NODUPKEY and without the DUPOUT= dataset.&lt;/P&gt;
&lt;P&gt;Print it.&lt;/P&gt;
&lt;P&gt;Show us the values of all the variables, in that sort order, for the records that match your "duplicate records".&lt;/P&gt;
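&lt;P&gt;A sketch of those steps, reusing the dataset and BY variables from the original post (the output name dataset1_check and the memberid value are placeholders, not taken from the thread):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Plain sort: no NODUPKEY and no DUPOUT=, so every observation is kept */
proc sort data=dataset1 out=dataset1_check;
  by DOB Gender admitdate memberid Facility_2;
run;

/* Print all variables for the member in question. */
/* 12345 is a placeholder; quote the value if memberid is a character variable. */
proc print data=dataset1_check;
  where memberid = 12345;
run;&lt;/CODE&gt;&lt;/PRE&gt;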
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 08 Mar 2022 23:25:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800961#M315178</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2022-03-08T23:25:00Z</dc:date>
    </item>
    <item>
      <title>Re: Proc Sort data = nodupkey is deleting both records</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800978#M315188</link>
      <description>DUPOUT= receives only the observations that NODUPKEY removes, not every member of a duplicate group. For two records with the same BY values, the first encountered stays in dataset2 and the second goes to duprecords. Both records end up in duprecords only when an earlier third record with the same BY values was the one kept, so dataset2 still holds exactly one record per key.</description>
      <pubDate>Wed, 09 Mar 2022 03:02:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Proc-Sort-data-nodupkey-is-deleting-both-records/m-p/800978#M315188</guid>
      <dc:creator>pink_poodle</dc:creator>
      <dc:date>2022-03-09T03:02:30Z</dc:date>
    </item>
  </channel>
</rss>

