<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: remove samples that are repeated in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489340#M127754</link>
    <description>&lt;P&gt;Are you looking to &lt;EM&gt;&lt;STRONG&gt;"remove"&lt;/STRONG&gt;&lt;/EM&gt;&amp;nbsp;or assign missing values to the group that has more than one record?&lt;/P&gt;</description>
    <pubDate>Thu, 23 Aug 2018 17:32:39 GMT</pubDate>
    <dc:creator>novinosrin</dc:creator>
    <dc:date>2018-08-23T17:32:39Z</dc:date>
    <item>
      <title>remove samples that are repeated</title>
      <link>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489338#M127752</link>
      <description>&lt;P&gt;I want to remove the samples that are repeated. For example, if I have data like:&lt;/P&gt;&lt;P&gt;Sample&amp;nbsp; &amp;nbsp; v1&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;A&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1&lt;/P&gt;&lt;P&gt;B&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1&lt;/P&gt;&lt;P&gt;B&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 2&lt;/P&gt;&lt;P&gt;C&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I want the output like:&lt;/P&gt;&lt;P&gt;Sample&amp;nbsp; &amp;nbsp; v1&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;A&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;C&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So not just remove the duplicate (I know how to do that).&lt;/P&gt;</description>
      <pubDate>Thu, 23 Aug 2018 17:24:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489338#M127752</guid>
      <dc:creator>y_fu</dc:creator>
      <dc:date>2018-08-23T17:24:54Z</dc:date>
    </item>
    <item>
      <title>Re: remove samples that are repeated</title>
      <link>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489339#M127753</link>
      <description>&lt;P&gt;So you still want to keep the records but have their values erased?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You can set them to missing using CALL MISSING() and the same technique you would use to remove duplicates.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
set have;
by sample;

if not (first.sample and last.sample) then call missing(sample, v1);

run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 23 Aug 2018 17:30:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489339#M127753</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2018-08-23T17:30:02Z</dc:date>
    </item>
    <item>
      <title>Re: remove samples that are repeated</title>
      <link>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489340#M127754</link>
      <description>&lt;P&gt;Are you looking to &lt;EM&gt;&lt;STRONG&gt;"remove"&lt;/STRONG&gt;&lt;/EM&gt;&amp;nbsp;or assign missing values to the group that has more than one record?&lt;/P&gt;</description>
      <pubDate>Thu, 23 Aug 2018 17:32:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489340#M127754</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-08-23T17:32:39Z</dc:date>
    </item>
    <item>
      <title>Re: remove samples that are repeated</title>
      <link>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489344#M127757</link>
      <description>&lt;P&gt;Thank you for the help. I want to remove those records, not set them to missing.&lt;/P&gt;&lt;P&gt;Based on your code, I can just change&amp;nbsp;&lt;/P&gt;&lt;PRE class=" language-sas"&gt;&lt;CODE class="  language-sas"&gt;&lt;SPAN class="token keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="token operator"&gt;not&lt;/SPAN&gt; &lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token function"&gt;first&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;sample and last&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;sample&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt; &lt;SPAN class="token keyword"&gt;then&lt;/SPAN&gt; call &lt;SPAN class="token function"&gt;missing&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;sample&lt;SPAN class="token punctuation"&gt;,&lt;/SPAN&gt; v1&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;to&lt;/P&gt;&lt;PRE class=" language-sas"&gt;&lt;CODE class="  language-sas"&gt;&lt;SPAN class="token keyword"&gt;if&lt;/SPAN&gt; &lt;SPAN class="token operator"&gt;not&lt;/SPAN&gt; &lt;SPAN class="token punctuation"&gt;(&lt;/SPAN&gt;&lt;SPAN class="token function"&gt;first&lt;/SPAN&gt;&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;sample and last&lt;SPAN class="token punctuation"&gt;.&lt;/SPAN&gt;sample&lt;SPAN class="token punctuation"&gt;)&lt;/SPAN&gt; &lt;SPAN class="token keyword"&gt;then&lt;/SPAN&gt; delete&lt;SPAN class="token punctuation"&gt;;&lt;/SPAN&gt;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Works perfectly, thanks again.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Aug 2018 17:41:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489344#M127757</guid>
      <dc:creator>y_fu</dc:creator>
      <dc:date>2018-08-23T17:41:48Z</dc:date>
    </item>
    <item>
      <title>Re: remove samples that are repeated</title>
      <link>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489346#M127758</link>
      <description>&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input Sample:$ v1;
cards;  
A   1
B   1
B   2
C   1
;

proc sql;
create table want as select sample,sum(v1) as sum, v1 from have group by sample having sum=1;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 23 Aug 2018 17:45:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489346#M127758</guid>
      <dc:creator>Jagadishkatam</dc:creator>
      <dc:date>2018-08-23T17:45:58Z</dc:date>
    </item>
    <item>
      <title>Re: remove samples that are repeated</title>
      <link>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489352#M127761</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/12151"&gt;@Jagadishkatam&lt;/a&gt;&amp;nbsp;I do like the SQL approach, which could even be extended so that the sort is performed in-database. However, I'm afraid the logic of your code will produce erroneous results once just one more group, D, is added as illustrated below: D's values (0 and 1) also sum to 1, so the repeated group D is not filtered out. On a sorted dataset, the&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13879"&gt;@Reeza&lt;/a&gt;&amp;nbsp;solution tweaked by the OP&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/222061"&gt;@y_fu&lt;/a&gt;&amp;nbsp;is the simplest.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input Sample:$ v1;
cards;  
A   1
B   1
B   2
C   1
D   0
D   1
;
dm log 'clear';
proc sql;
create table want_yours as 
select sample,sum(v1) as sum, v1 
from have 
group by sample 
having sum=1;
quit;
data want_reeza_and_OP;
set have;
by sample;
if not (first.sample and last.sample) then delete;
run;

proc sql;
create table want_mine as 
select * 
from have 
group by sample 
having count(sample)=1;
quit;


&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;My 2 cents for what it's worth &amp;amp; Regards!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 23 Aug 2018 18:04:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489352#M127761</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-08-23T18:04:13Z</dc:date>
    </item>
    <item>
      <title>Re: remove samples that are repeated</title>
      <link>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489356#M127764</link>
      <description>&lt;P&gt;One more note: if Sample is already sorted, the data step approach should essentially beat the SQL &lt;U&gt;in my opinion&lt;/U&gt;, because remerging is an overhead (in other words, an extra pass). I believe the first. and last. pointers are quicker than the &lt;U&gt;&lt;EM&gt;group, count, remerge and filter steps in SQL.&lt;/EM&gt;&lt;/U&gt;&lt;/P&gt;
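&lt;P&gt;To make the remerge point concrete, here is a minimal sketch of a PROC SQL variant that filters through a subquery rather than relying on the remerge. It assumes the same HAVE dataset as in the earlier posts; the table name want_subquery is made up for illustration. It still reads HAVE twice, and no timings have been measured, so whether it is actually faster than the remerging query or the data step remains an open question.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
/* want_subquery is a hypothetical name */
/* the subquery lists the samples that occur exactly once; */
/* the outer query keeps only the rows for those samples   */
create table want_subquery as
select *
from have
where sample in
  (select sample from have group by sample having count(*)=1);
quit;&lt;/CODE&gt;&lt;/PRE&gt;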
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 23 Aug 2018 18:12:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489356#M127764</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-08-23T18:12:47Z</dc:date>
    </item>
    <item>
      <title>Re: remove samples that are repeated</title>
      <link>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489366#M127766</link>
      <description>&lt;P&gt;And if your dataset isn't sorted, a hash object is handy.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Fun:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input Sample:$ v1;
cards;  
A   1
B   2
C   1
D   0
B   1
D   1
;
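data _null_;
if _n_=1 then do;
 /* Hash keyed by sample; _N_ is stored as a data item and used as a marker */
 dcl hash h(ordered:'y');
 h.definekey  ("sample") ;
 h.definedata ("sample","v1",'_N_') ;
 h.definedone () ;
end;
set have end=lr;
/* First time a sample is seen: store it with the marker _N_=1 */
if h.check() ne 0 then do; _N_=1; h.replace(); end;
/* Sample seen before: _N_ is the iteration counter (at least 2 here), so _N_+1 is never 1 */
else do; _N_=_N_+1; h.replace(); end;
/* After the last record, write out only the samples still marked with 1 */
if lr then h.output(dataset:'want(where=(_N_=1))');
run;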


&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
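&lt;P&gt;Another option for unsorted data, sketched here under the assumption that your SAS release supports the NOUNIQUEKEY and UNIQUEOUT= options of PROC SORT (SAS 9.3 or later, as far as I know): the sort itself can separate the single-record samples from the repeated ones. The dataset names repeated and want_sort are made up for illustration. With the sample data above, want_sort should end up holding only A and C.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* repeated and want_sort are hypothetical names */
/* NOUNIQUEKEY keeps only observations whose BY value is repeated in OUT=; */
/* observations with a unique BY value are written to UNIQUEOUT=           */
proc sort data=have out=repeated nouniquekey uniqueout=want_sort;
by sample;
run;&lt;/CODE&gt;&lt;/PRE&gt;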
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 23 Aug 2018 18:45:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/remove-samples-that-are-repeated/m-p/489366#M127766</guid>
      <dc:creator>novinosrin</dc:creator>
      <dc:date>2018-08-23T18:45:52Z</dc:date>
    </item>
  </channel>
</rss>

