<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Removing duplicates in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308562#M61183</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I want to remove the duplicate records (i.e. the extra records where plural&amp;gt;1), but the resulting dataset should contain all unique records, including those where plural=1, not just the ones that had multiple records (plural&amp;gt;1). I hope this makes sense.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
    <pubDate>Tue, 01 Nov 2016 16:34:48 GMT</pubDate>
    <dc:creator>jhs2171</dc:creator>
    <dc:date>2016-11-01T16:34:48Z</dc:date>
    <item>
      <title>Removing duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308552#M61180</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am trying to find duplicates and keep only the unique obs (it doesn't matter whether it is the first or the last record). I used a variable called Plural to identify them. Plural has three categories (1, 2, and 3): Plural=1 means there are no duplicates, Plural=2 means there are two records per person, and Plural=3 means there are three records per person. I tried:&lt;/P&gt;&lt;P&gt;data nodup;&lt;BR /&gt;set dup;&lt;BR /&gt;where plural &amp;gt; 1;&lt;BR /&gt;by time;&lt;BR /&gt;if first.time;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;Obviously, the dataset nodup then contains only the records with Plural&amp;gt;1. Is there any way I can keep every record in the dataset (Plural=1, 2, or 3) but remove the duplicates when Plural&amp;gt;1?&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Tue, 01 Nov 2016 15:48:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308552#M61180</guid>
      <dc:creator>jhs2171</dc:creator>
      <dc:date>2016-11-01T15:48:29Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308557#M61181</link>
      <description>I don't think you can rely on Plural alone. &lt;BR /&gt;If you still have the key that identifies what you consider unique, skip Plural and use PROC SORT NODUPKEY instead.</description>
      <pubDate>Tue, 01 Nov 2016 16:06:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308557#M61181</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2016-11-01T16:06:44Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308561#M61182</link>
      <description>&lt;P&gt;Please be more specific about what you want to remove. "Remove duplicates" implies removing duplicate records, yet you say you want to keep all records. What do you want to remove, if not records?&lt;/P&gt;</description>
      <pubDate>Tue, 01 Nov 2016 16:15:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308561#M61182</guid>
      <dc:creator>Jim_G</dc:creator>
      <dc:date>2016-11-01T16:15:43Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308562#M61183</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I want to remove the duplicate records (i.e. the extra records where plural&amp;gt;1), but the resulting dataset should contain all unique records, including those where plural=1, not just the ones that had multiple records (plural&amp;gt;1). I hope this makes sense.&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Tue, 01 Nov 2016 16:34:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308562#M61183</guid>
      <dc:creator>jhs2171</dc:creator>
      <dc:date>2016-11-01T16:34:48Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308568#M61184</link>
      <description>&lt;P&gt;I think you want to keep the first record of each set of duplicates and delete the second and third.&lt;/P&gt;
&lt;P&gt;Try this:&lt;/P&gt;
&lt;P&gt;data; set; by time;&lt;/P&gt;
&lt;P&gt;if first.time;&lt;/P&gt;
&lt;P&gt;run;&lt;/P&gt;</description>
      <pubDate>Tue, 01 Nov 2016 16:46:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308568#M61184</guid>
      <dc:creator>Jim_G</dc:creator>
      <dc:date>2016-11-01T16:46:53Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308596#M61188</link>
      <description>&lt;P&gt;Hi Jim_G,&lt;/P&gt;&lt;P&gt;Thank you for your comment. I tweaked what you suggested and tried the code below. Basically, I sorted by enough keys to get a unique combination per row, then used a data step to number each row the way I want.&lt;/P&gt;&lt;P&gt;proc sort data=old; by id dob border; run;&lt;/P&gt;&lt;P&gt;data new;&lt;BR /&gt;set old;&lt;BR /&gt;by id dob border;&lt;BR /&gt;retain kid;&lt;BR /&gt;if first.dob then kid = 1;&lt;BR /&gt;else kid = 1 + kid;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;I am not sure what I am missing, but every observation in my dataset has kid=1. In theory, one person's records should be numbered 1, 2, 3, and the first record for the next person should be numbered 1 again. Any suggestions?&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Tue, 01 Nov 2016 18:45:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308596#M61188</guid>
      <dc:creator>jhs2171</dc:creator>
      <dc:date>2016-11-01T18:45:49Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308622#M61190</link>
      <description>&lt;P&gt;It looks like this statement needs to change:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;if first.dob then kid=1;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Probably, it should become (with no other changes required):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;if first.id then kid=1;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
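&lt;P&gt;As a minimal sketch of that change in context, assuming the dataset and variable names from the earlier post (old, new, id, dob, border, kid). The sum statement used here is a swapped-in equivalent of RETAIN plus kid = 1 + kid, not the original code:&lt;/P&gt;
&lt;P&gt;proc sort data=old; by id dob border; run;&lt;/P&gt;
&lt;P&gt;data new;&lt;BR /&gt;set old;&lt;BR /&gt;by id dob border;&lt;BR /&gt;/* restart the counter at the first record of each ID */&lt;BR /&gt;if first.id then kid = 1;&lt;BR /&gt;/* sum statement: kid is retained automatically and incremented by 1 */&lt;BR /&gt;else kid + 1;&lt;BR /&gt;run;&lt;/P&gt;
&lt;P&gt;If the goal is one record per ID, a subsetting "if first.id;" in the same kind of step should keep only the first record of each ID.&lt;/P&gt;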
&lt;P&gt;Basically, you are numbering the records within each ID/DOB combination. Since every observation gets kid=1, there is evidently only one observation in each such group. The change numbers the records within each ID instead. Hopefully, that's what you want.&lt;/P&gt;</description>
      <pubDate>Tue, 01 Nov 2016 20:23:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308622#M61190</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2016-11-01T20:23:54Z</dc:date>
    </item>
    <item>
      <title>Re: Removing duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308705#M61191</link>
      <description>&lt;P&gt;Change the statement&lt;/P&gt;
&lt;P&gt;if first.dob then kid=1;&lt;/P&gt;
&lt;P&gt;to:&lt;/P&gt;
&lt;P&gt;if first.border then kid=1;&lt;/P&gt;</description>
      <pubDate>Wed, 02 Nov 2016 09:53:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Removing-duplicates/m-p/308705#M61191</guid>
      <dc:creator>Jim_G</dc:creator>
      <dc:date>2016-11-02T09:53:16Z</dc:date>
    </item>
  </channel>
</rss>

