<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Dropping duplicates based on a condition? in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Dropping-duplicates-based-on-a-condition/m-p/844664#M82288</link>
    <description>&lt;P&gt;I have a data set with many duplicates for each observation. Some duplicates have the same test result, while others have different test results.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Person1.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Positive&lt;/P&gt;&lt;P&gt;Person1.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Positive&lt;/P&gt;&lt;P&gt;Person2.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Unknown&lt;/P&gt;&lt;P&gt;Person2.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Positive&lt;/P&gt;&lt;P&gt;Person3.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Unknown&lt;/P&gt;&lt;P&gt;Person3.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Missing&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For Person2, I want to keep the duplicate with the positive test result over the unknown one. For Person1, I want to keep just one of the results, since they are the same. For Person3, I'd like to keep the unknown duplicate over the missing one. I've already ranked the test results: positive = 1, unknown = 2, missing = 3.&lt;/P&gt;&lt;P&gt;How can I code it to drop certain duplicates based on the test result status?&lt;/P&gt;</description>
    <pubDate>Wed, 16 Nov 2022 17:22:02 GMT</pubDate>
    <dc:creator>publichealth11</dc:creator>
    <dc:date>2022-11-16T17:22:02Z</dc:date>
    <item>
      <title>Dropping duplicates based on a condition?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Dropping-duplicates-based-on-a-condition/m-p/844664#M82288</link>
      <description>&lt;P&gt;I have a data set with many duplicates for each observation. Some duplicates have the same test result, while others have different test results.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For example:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Person1.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Positive&lt;/P&gt;&lt;P&gt;Person1.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Positive&lt;/P&gt;&lt;P&gt;Person2.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Unknown&lt;/P&gt;&lt;P&gt;Person2.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Positive&lt;/P&gt;&lt;P&gt;Person3.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Unknown&lt;/P&gt;&lt;P&gt;Person3.&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; Missing&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For Person2, I want to keep the duplicate with the positive test result over the unknown one. For Person1, I want to keep just one of the results, since they are the same. For Person3, I'd like to keep the unknown duplicate over the missing one. I've already ranked the test results: positive = 1, unknown = 2, missing = 3.&lt;/P&gt;&lt;P&gt;How can I code it to drop certain duplicates based on the test result status?&lt;/P&gt;</description>
      <pubDate>Wed, 16 Nov 2022 17:22:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Dropping-duplicates-based-on-a-condition/m-p/844664#M82288</guid>
      <dc:creator>publichealth11</dc:creator>
      <dc:date>2022-11-16T17:22:02Z</dc:date>
    </item>
    <item>
      <title>Re: Dropping duplicates based on a condition?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Dropping-duplicates-based-on-a-condition/m-p/844670#M82289</link>
      <description>&lt;P&gt;As long as you first sort the data so that the record you want to keep comes first within each person, you can use &lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/proc/p02bhn81rn4u64n1b6l00ftdnxge.htm#n1vj0k9e0ir1o7n1vg3s4kcwiswg" target="_self"&gt;PROC SORT NODUPKEY&lt;/A&gt; to keep only that first record per person.&lt;BR /&gt;&lt;BR /&gt;Here's an example:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have ;
	infile cards ;
	input 
		person $
		result $ ;
	if result="Positive" then 
		sortOrder="1" ;
	else if result="Unknown" then 
		sortOrder="2" ;
	else if result="Negative" then 
		sortOrder="3" ;

	output have ;
cards ;
Person1 Negative
Person1 Positive
Person1 Unknown
Person2 Negative
Person2 Unknown
Person3 Negative
run ;


/* First sort the records into the correct order */
proc sort 
	data=have 
	out=sort1 ;
	by person sortOrder;
run ;

/* Now remove the duplicates */
proc sort nodupkey 	
	data=sort1 
	out=want ;
	by person ;
run ;
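
/* Alternative to the second PROC SORT (a sketch added for illustration,
   not part of the original answer): a DATA step that keeps the first
   record per person. It assumes sort1 is already sorted by person and
   sortOrder, as above. The output name want_first is made up here. */
data want_first ;
	set sort1 ;
	by person ;
	if first.person ;
run ;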
	&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 16 Nov 2022 17:51:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Dropping-duplicates-based-on-a-condition/m-p/844670#M82289</guid>
      <dc:creator>AMSAS</dc:creator>
      <dc:date>2022-11-16T17:51:12Z</dc:date>
    </item>
    <item>
      <title>Re: Dropping duplicates based on a condition?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Dropping-duplicates-based-on-a-condition/m-p/844676#M82290</link>
      <description>&lt;P&gt;My 2 cents&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input person $ result $;
datalines;
Person1 Positive 
Person1 Positive 
Person2 Unknown  
Person2 Positive 
Person3 Unknown  
Person3 Missing  
;

proc sql;
   create table want as
   select distinct * 
   from have
   group by person 
   having whichc(result, 'Positive', 'Unknown', 'Missing')
    = min(whichc(result, 'Positive', 'Unknown', 'Missing'))
   ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;U&gt;&lt;STRONG&gt;Results:&lt;/STRONG&gt;&lt;/U&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;person   result
Person1  Positive
Person2  Positive
Person3  Unknown&lt;/PRE&gt;
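&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A side note for readers new to WHICHC (a sketch added for illustration, not part of the reply above): the function returns the position of its first argument within the list of values that follows, and 0 when nothing matches. The query therefore keeps, for each person, the rows whose result has the lowest position. The step below only prints those positions to the log; the variable name rank and the test value 'Other' are made up for the example.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Print the position WHICHC assigns to each test value */
data _null_;
  length result $8;
  do result = 'Positive', 'Unknown', 'Missing', 'Other';
    /* 1, 2 or 3 for a listed value, 0 for anything else */
    rank = whichc(result, 'Positive', 'Unknown', 'Missing');
    put result= rank=;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Because an unmatched value gets position 0, it would win over 'Positive' in the HAVING clause, so it is probably worth checking the data for unexpected values first.&lt;/P&gt;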
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 16 Nov 2022 18:15:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Dropping-duplicates-based-on-a-condition/m-p/844676#M82290</guid>
      <dc:creator>PeterClemmensen</dc:creator>
      <dc:date>2022-11-16T18:15:05Z</dc:date>
    </item>
    <item>
      <title>Re: Dropping duplicates based on a condition?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Dropping-duplicates-based-on-a-condition/m-p/845954#M82315</link>
      <description>&lt;P&gt;I think it may be a good idea to check for unexpected values/errors.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Given data like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input person $ result $;
datalines;
Person1 Positive 
Person1 Positive 
Person2 Unknown  
Person2 Positive 
Person3 Unknown  
Person3 Missing  
Person4 Gylle
;run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;(Note that I added an unexpected value, "Gylle", in the last row.)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;One way to go about it could be this:&lt;/P&gt;
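&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  array values(3) $8 _temporary_ ('Positive','Unknown','Missing');
  /* DOW loop: read one whole person per DATA step iteration (HAVE must be sorted by person) */
  do until(last.person);
    set have;
    by person;
    /* position of the current row's value: 1-3 if known, 0 otherwise */
    _cur=whichc(result,of values(*));
    if _cur=0 then
      error 'Unexpected result value: ' result;
    /* MIN ignores the initial missing value, so _idx ends up as the lowest position seen */
    _idx=min(_idx,_cur);
  end;
  /* any unexpected value leaves _idx=0, and that person is dropped */
  if _idx&amp;gt;0 then
    result=values(_idx);
  else
    delete;
  drop _idx _cur;
run;&lt;/CODE&gt;&lt;/PRE&gt;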
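&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Before deduplicating, a frequency table is a quick way to see which result values actually occur. This is only a sketch of one possible check, assuming the HAVE data set from above.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* List every distinct value of result, counting blanks as a level too */
proc freq data=have;
  tables result / missing;
run;&lt;/CODE&gt;&lt;/PRE&gt;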
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 23 Nov 2022 14:45:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Dropping-duplicates-based-on-a-condition/m-p/845954#M82315</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2022-11-23T14:45:43Z</dc:date>
    </item>
  </channel>
</rss>

