<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Duplicated observations in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/706169#M26330</link>
    <description>&lt;P&gt;You can use the NODUPKEY option of PROC SORT.&lt;/P&gt;&lt;P&gt;Ex:&lt;/P&gt;&lt;P&gt;PROC SORT&lt;/P&gt;&lt;P&gt;DATA = WORK.have NODUPKEY;&lt;/P&gt;&lt;P&gt;BY variable;&lt;/P&gt;&lt;P&gt;RUN;&lt;/P&gt;</description>
    <pubDate>Wed, 16 Dec 2020 00:25:47 GMT</pubDate>
    <dc:creator>morganmetzger</dc:creator>
    <dc:date>2020-12-16T00:25:47Z</dc:date>
    <item>
      <title>Duplicated observations</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/705913#M26285</link>
      <description>&lt;P&gt;How do I remove duplicate observations in a data step?&lt;/P&gt;</description>
      <pubDate>Tue, 15 Dec 2020 03:12:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/705913#M26285</guid>
      <dc:creator>JVargas</dc:creator>
      <dc:date>2020-12-15T03:12:44Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicated observations</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/705915#M26287</link>
      <description>&lt;P&gt;You can use PROC SORT to remove duplicates by a given variable. In a given dataset 'have' we can remove duplicate names by:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Proc sort data=have nodupkey;&lt;/P&gt;&lt;P&gt;by name;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The NODUPKEY option keeps only the first observation for each value of the variable(s) named in the BY statement (in this case 'name'). Note that without an OUT= option, PROC SORT replaces the original dataset 'have' with the sorted, de-duplicated result. If you want to leave 'have' unchanged and write the result to a new data set (example: a temporary set called 'want'), you can add the OUT= option:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Proc sort data=have nodupkey out=want;&lt;/P&gt;&lt;P&gt;by name;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;</description>
      <pubDate>Tue, 15 Dec 2020 03:28:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/705915#M26287</guid>
      <dc:creator>CodingDiSASter</dc:creator>
      <dc:date>2020-12-15T03:28:46Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicated observations</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/705932#M26289</link>
      <description>&lt;P&gt;See Maxims 7 and 14 in &lt;A href="https://communities.sas.com/t5/SAS-Communities-Library/Maxims-of-Maximally-Efficient-SAS-Programmers/ta-p/352068" target="_self"&gt;Maxims of Maximally Efficient SAS Programmers&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;If you have to use a data step for hardly comprehensible reasons, you have two options. Unless the data is already at least grouped by the variables identifying a duplicate, you need to sort it before processing. Alternatively, you could use a hash object, provided the dataset is not too large: it has to fit into the memory available to your SAS session.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data= have out= sorted;
  by by_variables;
run;

data want;
  set sorted;
  by by_variables;
  if first.last_by_variable;
run;&lt;/CODE&gt;&lt;/PRE&gt;
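&lt;P&gt;The hash-object alternative mentioned above could look roughly like the following. This is only a sketch: it assumes a single key variable called name (a placeholder, as is the hash name seen), keeps the first observation for each key in input order without any sorting, and needs all distinct key values to fit into memory.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set have;
  /* build an empty lookup table once, keyed on the (assumed) variable name */
  if _n_ = 1 then do;
    declare hash seen();
    seen.defineKey('name');
    seen.defineDone();
  end;
  /* check() returns 0 when the current key is already stored */
  if seen.check() ne 0 then do;
    seen.add();   /* remember the key */
    output;       /* first occurrence: keep the observation */
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;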
&lt;P&gt;If the data you have is grouped by the by-variables, then you can skip sorting and add "notsorted" to the by-statement in the data step.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Dec 2020 05:47:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/705932#M26289</guid>
      <dc:creator>andreas_lds</dc:creator>
      <dc:date>2020-12-15T05:47:50Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicated observations</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/706004#M26316</link>
      <description>proc sort data= have out= sorted;&lt;BR /&gt;  by by_variables;&lt;BR /&gt;run;&lt;BR /&gt;&lt;BR /&gt;data want_duplicated ;&lt;BR /&gt;  set sorted;&lt;BR /&gt;  by by_variables;&lt;BR /&gt;  if not (first.last_by_variable  and last.last_by_variable )   ;&lt;BR /&gt;run;</description>
      <pubDate>Tue, 15 Dec 2020 12:30:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/706004#M26316</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2020-12-15T12:30:20Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicated observations</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/706160#M26329</link>
      <description>&lt;P&gt;You could do this in a PROC SORT step using NODUPKEY, but if you specifically want to do it in a data step, you could run code similar to:&lt;/P&gt;&lt;PRE&gt;/* WORK.have is a placeholder name; it must already be sorted by var */
DATA WORK.want;
	SET WORK.have;
	BY var;
	IF FIRST.var; /* keep only the first observation for each value of var */
RUN;&lt;/PRE&gt;&lt;P&gt;That should get rid of the duplicates.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Dec 2020 23:10:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/706160#M26329</guid>
      <dc:creator>hswdl01</dc:creator>
      <dc:date>2020-12-15T23:10:25Z</dc:date>
    </item>
    <item>
      <title>Re: Duplicated observations</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/706169#M26330</link>
      <description>&lt;P&gt;You can use the NODUPKEY option of PROC SORT.&lt;/P&gt;&lt;P&gt;Ex:&lt;/P&gt;&lt;P&gt;PROC SORT&lt;/P&gt;&lt;P&gt;DATA = WORK.have NODUPKEY;&lt;/P&gt;&lt;P&gt;BY variable;&lt;/P&gt;&lt;P&gt;RUN;&lt;/P&gt;</description>
      <pubDate>Wed, 16 Dec 2020 00:25:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Duplicated-observations/m-p/706169#M26330</guid>
      <dc:creator>morganmetzger</dc:creator>
      <dc:date>2020-12-16T00:25:47Z</dc:date>
    </item>
  </channel>
</rss>

