<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Listing and Removing duplicates with an indexed dataset in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813157#M34099</link>
    <description>&lt;P&gt;You could create an empty table structure with all the indexes and constraints based on your source table. Some small modification to the code I've shared &lt;A href="https://communities.sas.com/t5/SAS-Enterprise-Guide/how-to-truncate-table/m-p/812617#M40663" target="_self"&gt;here&lt;/A&gt; should do the trick.&lt;/P&gt;
&lt;P&gt;Then do with your source table as you wish. In the end, append the table to your empty table structure, which still has all the indexes and constraints.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can of course also use approaches that change the table in-place ...but if there are a lot of duplicates (=deletes) then you'll end up with a table that contains a lot of rows with logical deletes (=which add to the file size).&lt;/P&gt;</description>
    <pubDate>Fri, 13 May 2022 09:58:56 GMT</pubDate>
    <dc:creator>Patrick</dc:creator>
    <dc:date>2022-05-13T09:58:56Z</dc:date>
    <item>
      <title>Listing and Removing duplicates with an indexed dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813041#M34082</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a dataset that has a sas7bndx associated with it. I couldn't do a proc sort with nodupkey force because it will delete the sas7bndx.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My questions are:&lt;/P&gt;&lt;P&gt;1. If a column is named ID, what's the best way to list out the duplicated IDs?&lt;/P&gt;&lt;P&gt;2. How do I list out the duplicated IDs with another column, for example, ID and Month?&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; 1 &amp;nbsp; Jan&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; 1 &amp;nbsp; Mar&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; 3 &amp;nbsp; May&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; 3 &amp;nbsp; Jun&lt;/P&gt;&lt;P&gt;3. How do I remove the duplicates when there's a sas7bndx associated with it?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Thu, 12 May 2022 17:32:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813041#M34082</guid>
      <dc:creator>cosmid</dc:creator>
      <dc:date>2022-05-12T17:32:48Z</dc:date>
    </item>
    <item>
      <title>Re: Listing and Removing duplicates with an indexed dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813043#M34083</link>
      <description>&lt;P&gt;What is stopping you from just re-creating the index?&lt;/P&gt;
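&lt;P&gt;A minimal sketch of that idea, assuming a dataset named work.have with a simple index on ID (both names are hypothetical, not taken from the original data): write the de-duplicated rows to a new dataset with OUT=, so FORCE is not needed, then create the index again on the result.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical names: work.have, key variable ID */
proc sort data=work.have out=work.have_nodup nodupkey;
  by id;
run;

/* The OUT= dataset has no index yet, so build it again */
proc datasets library=work nolist;
  modify have_nodup;
  index create id;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the original index turns out to be a composite one, the INDEX CREATE statement would take the form index create idmonth=(id month); with your own index and variable names.&lt;/P&gt;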
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 12 May 2022 17:35:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813043#M34083</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-05-12T17:35:58Z</dc:date>
    </item>
    <item>
      <title>Re: Listing and Removing duplicates with an indexed dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813069#M34084</link>
      <description>&lt;P&gt;I don't know what var(s) the original index is based on. Is there a way to find that out? And is there a way to remove the duplicates while not modifying the index file?&lt;/P&gt;</description>
      <pubDate>Thu, 12 May 2022 18:55:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813069#M34084</guid>
      <dc:creator>cosmid</dc:creator>
      <dc:date>2022-05-12T18:55:45Z</dc:date>
    </item>
    <item>
      <title>Re: Listing and Removing duplicates with an indexed dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813071#M34085</link>
      <description>&lt;P&gt;Run a proc contents on the data set to see your indexes.&lt;/P&gt;
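&lt;P&gt;For example, a minimal sketch assuming the dataset is mylib.have (a hypothetical name). The PROC CONTENTS output lists the indexes and their variables; the DICTIONARY.INDEXES query is an alternative that returns the same information as a table.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical dataset name: mylib.have */
proc contents data=mylib.have;
run;

/* Alternative: one row per indexed variable */
/* LIBNAME and MEMNAME values are stored in uppercase */
proc sql;
  select indxname, name, idxusage, unique, nomiss
    from dictionary.indexes
    where libname='MYLIB' and memname='HAVE';
quit;&lt;/CODE&gt;&lt;/PRE&gt;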
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 12 May 2022 19:01:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813071#M34085</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2022-05-12T19:01:09Z</dc:date>
    </item>
    <item>
      <title>Re: Listing and Removing duplicates with an indexed dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813087#M34087</link>
      <description>&lt;P&gt;Try using the MODIFY statement.&lt;/P&gt;
&lt;P&gt;You could use a HASH object to identify the duplicates.&lt;/P&gt;
&lt;P&gt;Let's create an example dataset that has some duplicate data.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data class (index=(age));
  set sashelp.class sashelp.class;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Now let's remove the duplicate observations.&amp;nbsp; In this case we can just use NAME as the set of key variables.&lt;/P&gt;
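&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data class ;
  /* MODIFY updates the dataset in place, so the index is kept */
  modify class;
  if _n_=1 then do;
    /* hash of the key values seen so far */
    declare hash h();
    rc=h.definekey('name');
    rc=h.definedata('name');
    rc=h.definedone();
  end;
  /* ADD returns nonzero when the key already exists: a duplicate */
  if h.add() then remove;
run;&lt;/CODE&gt;&lt;/PRE&gt;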
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Tom_0-1652388178334.png" style="width: 999px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/71407iE1D6AC2B781B15C9/image-size/large?v=v2&amp;amp;px=999" role="button" title="Tom_0-1652388178334.png" alt="Tom_0-1652388178334.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Tom_0-1652388274191.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/71408iF451F2F21B122960/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Tom_0-1652388274191.png" alt="Tom_0-1652388274191.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 12 May 2022 20:44:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813087#M34087</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-05-12T20:44:42Z</dc:date>
    </item>
    <item>
      <title>Re: Listing and Removing duplicates with an indexed dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813103#M34090</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/253026"&gt;@cosmid&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Not sure whether you can actually do something with this, but posting it anyway:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Getting all duplicates within a SAS data set&lt;BR /&gt;Posted 01-29-2015 12:52 PM | by EricCai (91388 views)&lt;BR /&gt;&lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/Getting-all-duplicates-within-a-SAS-data-set/ta-p/223575" target="_blank"&gt;https://communities.sas.com/t5/SAS-Communities-Library/Getting-all-duplicates-within-a-SAS-data-set/ta-p/223575&lt;/A&gt;&lt;/P&gt;
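&lt;P&gt;As a rough inline illustration (not the code from the linked article), here is a minimal PROC SQL sketch for the two listing questions. It assumes a dataset named work.have, which is a hypothetical name, with the columns ID and Month from the question. Both queries only read the data, so the index file is not touched.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
  /* Question 1: each ID that occurs more than once, with its row count */
  select id, count(*) as n_rows
    from work.have
    group by id
    having count(*) gt 1;

  /* Question 2: every row (ID and Month) that belongs to a duplicated ID */
  /* SAS remerges the count with the detail rows and notes this in the log */
  select id, month
    from work.have
    group by id
    having count(*) gt 1
    order by id;
quit;&lt;/CODE&gt;&lt;/PRE&gt;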
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;Koen&lt;/P&gt;</description>
      <pubDate>Thu, 12 May 2022 22:38:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813103#M34090</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-05-12T22:38:50Z</dc:date>
    </item>
    <item>
      <title>Re: Listing and Removing duplicates with an indexed dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813157#M34099</link>
      <description>&lt;P&gt;You could create an empty table structure with all the indexes and constraints based on your source table. Some small modification to the code I've shared &lt;A href="https://communities.sas.com/t5/SAS-Enterprise-Guide/how-to-truncate-table/m-p/812617#M40663" target="_self"&gt;here&lt;/A&gt; should do the trick.&lt;/P&gt;
&lt;P&gt;Then do with your source table as you wish. In the end, append the table to your empty table structure, which still has all the indexes and constraints.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can of course also use approaches that change the table in-place ...but if there are a lot of duplicates (=deletes) then you'll end up with a table that contains a lot of rows with logical deletes (=which add to the file size).&lt;/P&gt;</description>
      <pubDate>Fri, 13 May 2022 09:58:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813157#M34099</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2022-05-13T09:58:56Z</dc:date>
    </item>
    <item>
      <title>Re: Listing and Removing duplicates with an indexed dataset</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813188#M34104</link>
      <description>&lt;P&gt;And ... forgive me for this reply, as it is a little bit off-topic, but I want people reading this thread to know about:&lt;/P&gt;
&lt;H5 id="SAS.cas-table-casouttablebasic-memoryformat" class="xisCas-argument"&gt;memoryFormat&lt;SPAN class="xisCas-equals"&gt;="DVR" (&lt;SPAN&gt;duplicate value reduction memory format)&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/H5&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="xisCas-equals"&gt;&lt;SPAN&gt;SAS Viya System Programming Guide&lt;BR /&gt;Table Action Set: Syntax&lt;BR /&gt;upload Action&lt;BR /&gt;&lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/v_014/caspg/cas-table-upload.htm" target="_blank"&gt;https://go.documentation.sas.com/doc/en/pgmsascdc/v_014/caspg/cas-table-upload.htm&lt;/A&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="xisCas-equals"&gt;&lt;SPAN&gt;You can save a lot of (in-memory) space that way.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="xisCas-equals"&gt;&lt;SPAN&gt;Thanks,&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN class="xisCas-equals"&gt;&lt;SPAN&gt;Koen&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 13 May 2022 12:35:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Listing-and-Removing-duplicates-with-an-indexed-dataset/m-p/813188#M34104</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2022-05-13T12:35:08Z</dc:date>
    </item>
  </channel>
</rss>

