<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Remove duplicate records using noduprecs and select distinct in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821988#M324540</link>
    <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15475"&gt;@andreas_lds&lt;/a&gt;&amp;nbsp;I already gone through that article and comments. My Question is how to write the noduprecs in proc sql?&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 07 Jul 2022 07:02:45 GMT</pubDate>
    <dc:creator>David_Billa</dc:creator>
    <dc:date>2022-07-07T07:02:45Z</dc:date>
    <item>
      <title>Remove duplicate records using noduprecs and select distinct</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821984#M324537</link>
      <description>&lt;P&gt;Can someone help me understand the difference between removing duplicates using noduprecs and select distinct? In below code, why I'm getting different record count?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How to write the proc sort with noduprecs in proc sql?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;148        proc sort data=BOMSING out=BOMSING_no_dup_ps dupout=removed_records_sas NODUPRECS;
149        	BY LVL1_MATRL;
150        run;

NOTE: There were 226526 observations read from the data set WORK.BOMSING.
NOTE: 206 duplicate observations were deleted.
NOTE: The data set WORK.BOMSING_NO_DUP_PS has 226320 observations and 109 variables.

NOTE: The data set WORK.REMOVED_RECORDS_SAS has 206 observations and 109 variables.
NOTE: PROCEDURE SORT used (Total process time):
      real time           0.38 seconds
      cpu time            0.39 seconds
      

151        
152        proc sql;
153        	create table work.BOMSING_no_dup_sd as
154        	select distinct * from
155        	BOMSING
156        	order by LVL1_MATRL;
NOTE: Table WORK.BOMSING_NO_DUP_SD created, with 226283 rows and 109 columns.

157        quit;
NOTE: PROCEDURE SQL used (Total process time):
      real time           0.96 seconds
      cpu time            1.47 seconds
      
&lt;/PRE&gt;</description>
      <pubDate>Thu, 07 Jul 2022 06:30:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821984#M324537</guid>
      <dc:creator>David_Billa</dc:creator>
      <dc:date>2022-07-07T06:30:46Z</dc:date>
    </item>
    <item>
      <title>Re: Remove duplicate records using noduprecs and select distinct</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821985#M324538</link>
      <description>&lt;P&gt;Please have a look at &lt;A href="https://communities.sas.com/t5/New-SAS-User/Difference-between-NODUPKEY-and-NODUPRECS/td-p/536246" target="_blank"&gt;https://communities.sas.com/t5/New-SAS-User/Difference-between-NODUPKEY-and-NODUPRECS/td-p/536246&lt;/A&gt; especially at the comment by &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 07 Jul 2022 06:48:19 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821985#M324538</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2022-07-07T06:48:19Z</dc:date>
    </item>
    <item>
      <title>Re: Remove duplicate records using noduprecs and select distinct</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821988#M324540</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15475"&gt;@andreas_lds&lt;/a&gt;&amp;nbsp;I already gone through that article and comments. My Question is how to write the noduprecs in proc sql?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 07 Jul 2022 07:02:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821988#M324540</guid>
      <dc:creator>David_Billa</dc:creator>
      <dc:date>2022-07-07T07:02:45Z</dc:date>
    </item>
    <item>
      <title>Re: Remove duplicate records using noduprecs and select distinct</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821990#M324541</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/292396"&gt;@David_Billa&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15475"&gt;@andreas_lds&lt;/a&gt;&amp;nbsp;I already gone through that article and comments. My Question is how to write the noduprecs in proc sql?&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Sorry, can't help you with this task, i hardly use proc sql, because in my eyes sql code is incredible ugly.&lt;/P&gt;</description>
      <pubDate>Thu, 07 Jul 2022 07:05:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821990#M324541</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2022-07-07T07:05:41Z</dc:date>
    </item>
    <item>
      <title>Re: Remove duplicate records using noduprecs and select distinct</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821991#M324542</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/292396"&gt;@David_Billa&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15475"&gt;@andreas_lds&lt;/a&gt;&amp;nbsp;I already gone through that article and comments. My Question is how to write the noduprecs in proc sql?&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;You can't. You would have to re-engineer what is, in essence, a&amp;nbsp;&lt;EM&gt;bug&lt;/EM&gt;.&lt;/P&gt;</description>
      <pubDate>Thu, 07 Jul 2022 07:09:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821991#M324542</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2022-07-07T07:09:35Z</dc:date>
    </item>
    <item>
      <title>Re: Remove duplicate records using noduprecs and select distinct</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821995#M324544</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/11562"&gt;@Kurt_Bremser&lt;/a&gt;&amp;nbsp;Noduprecs will remove the duplicate records based on all by variables which is similar to select distinct. Is my understanding right?&lt;/P&gt;</description>
      <pubDate>Thu, 07 Jul 2022 07:45:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821995#M324544</guid>
      <dc:creator>David_Billa</dc:creator>
      <dc:date>2022-07-07T07:45:03Z</dc:date>
    </item>
    <item>
      <title>Re: Remove duplicate records using noduprecs and select distinct</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821997#M324546</link>
      <description>&lt;P&gt;Yes, but it does not do so reliably and consistently. That's why it is no longer documented.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To reliably get rid of duplicates, run&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=data nodupkey;
by _all_;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;which is in essence what SQL does when DISTINCT is used.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You positively must read the paper (&lt;A href="https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/037-30.pdf" target="_blank" rel="noopener"&gt;https://support.sas.com/resources/papers/proceedings/proceedings/sugi30/037-30.pdf&lt;/A&gt;) mentioned in the thread linked by&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/15475"&gt;@andreas_lds&lt;/a&gt;.&lt;/P&gt;
&lt;P&gt;Quote from there:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;DIV class="page" title="Page 1"&gt;
&lt;DIV class="section"&gt;
&lt;DIV class="layoutArea"&gt;
&lt;DIV class="column"&gt;
&lt;P&gt;&lt;SPAN&gt;One thing to beware of with both options is that they both compare the previous observation written to the output data set.&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;So, if the observations that you want eliminated are not adjacent in the data set after the sort, they will not be eliminated.&lt;/SPAN&gt;&lt;/P&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Example code that demonstrates the shortcoming of NODUPREC:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input key data;
datalines;
1 1
1 2
1 3
1 1
;

proc sort
  data=have
  out=want1
  noduprec
;
by key;
run;

title "SORT with NODUPREC";

proc print data=want1 noobs;
run;

proc sql;
create table want2 as
  select distinct *
  from have
;
quit;

title "SQL with DISTINCT";

proc print data=want2 noobs;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Result:&lt;/P&gt;
&lt;PRE&gt;SORT with NODUPREC
key	data
1	1
1	2
1	3
1	1
SQL with DISTINCT
key	data
1	1
1	2
1	3
&lt;/PRE&gt;</description>
      <pubDate>Thu, 07 Jul 2022 08:09:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Remove-duplicate-records-using-noduprecs-and-select-distinct/m-p/821997#M324546</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2022-07-07T08:09:30Z</dc:date>
    </item>
  </channel>
</rss>

