<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: De-duplicate value from column based on highest value from another column in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799022#M314139</link>
    <description>&lt;P&gt;Thanks for your reply. I've just realised that I've got another dataset that has a similar problem to what I posted, only that it has about an extra 100 columns along with the two (quarter and rate). If I still wanted to do the same thing (keep one record of quarter based on the highest value of rate, how would the code change (i.e. I want to have all of the c.100 columns in my dataset, with the only change being that only one unique quarter (based on highest rate) is shown with all of the information from the other c.100 columns for this row)?&lt;/P&gt;</description>
    <pubDate>Sun, 27 Feb 2022 22:22:38 GMT</pubDate>
    <dc:creator>Justin9</dc:creator>
    <dc:date>2022-02-27T22:22:38Z</dc:date>
    <item>
      <title>De-duplicate value from column based on highest value from another column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799016#M314134</link>
      <description>&lt;P&gt;Hi, I've got a dataset (called 'RC.DATA_CHECK') that has two columns: Quarter and Rate. Please can someone provide me with the required code that would only keep &lt;U&gt;one version of the Quarter&lt;/U&gt; value, based on the &lt;U&gt;highest value of Rate&lt;/U&gt;? Note that 2021Q1 has three records, two of which are the same, but I would just like to keep one of these rows for this quarter. I have included a screenshot below of what I would like my dataset to look like after the dataset has been de-duplicated.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Justin9_1-1645999282081.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/68993i3C3D234821621017/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Justin9_1-1645999282081.png" alt="Justin9_1-1645999282081.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 22:04:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799016#M314134</guid>
      <dc:creator>Justin9</dc:creator>
      <dc:date>2022-02-27T22:04:56Z</dc:date>
    </item>
    <item>
      <title>Re: De-duplicate value from column based on highest value from another column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799017#M314135</link>
      <description>&lt;P&gt;Find the maximum value of rate, by quarter:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=have nway;
    class quarter;
    var rate;
    output out=want max=;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 28 Feb 2022 11:16:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799017#M314135</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2022-02-28T11:16:33Z</dc:date>
    </item>
    <item>
      <title>Re: De-duplicate value from column based on highest value from another column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799022#M314139</link>
      <description>&lt;P&gt;Thanks for your reply. I've just realised that I've got another dataset that has a similar problem to what I posted, only that it has about an extra 100 columns along with the two (quarter and rate). If I still wanted to do the same thing (keep one record of quarter based on the highest value of rate, how would the code change (i.e. I want to have all of the c.100 columns in my dataset, with the only change being that only one unique quarter (based on highest rate) is shown with all of the information from the other c.100 columns for this row)?&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 22:22:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799022#M314139</guid>
      <dc:creator>Justin9</dc:creator>
      <dc:date>2022-02-27T22:22:38Z</dc:date>
    </item>
    <item>
      <title>Re: De-duplicate value from column based on highest value from another column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799025#M314141</link>
      <description>&lt;P&gt;Your data appear to be sorted by quarter.&amp;nbsp; If so, then:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (drop=max_rate);
  do until (last.quarter);
    set have;
    by quarter;
    max_rate=max(max_rate,rate);
  end;
  do until (last.quarter);
    set have;
    by quarter;
    if rate=max_rate then output;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;As long as no quarter has multiple observations with the maximum rate, this will produce what you want.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;It works by reading all obs for each quarter twice.&amp;nbsp; The first time to determine the maximum rate.&amp;nbsp; The second time to re-read and output the obs with that maximum.&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 22:41:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799025#M314141</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2022-02-27T22:41:03Z</dc:date>
    </item>
    <item>
      <title>Re: De-duplicate value from column based on highest value from another column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799026#M314142</link>
      <description>&lt;P&gt;Thanks for your post. My example had 2021Q1 with two observations with the maximum rate (see original post), so does that impact on the code that you've just posted?&lt;/P&gt;</description>
      <pubDate>Sun, 27 Feb 2022 22:46:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799026#M314142</guid>
      <dc:creator>Justin9</dc:creator>
      <dc:date>2022-02-27T22:46:06Z</dc:date>
    </item>
    <item>
      <title>Re: De-duplicate value from column based on highest value from another column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799043#M314153</link>
      <description>&lt;P&gt;Hi, can someone please tell me how the code would change if I wanted the same criteria (keep only one quarter based on the highest value of rate), but there are 100 variables/columns in the dataset (rather than just the two)?&lt;/P&gt;</description>
      <pubDate>Mon, 28 Feb 2022 01:47:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799043#M314153</guid>
      <dc:creator>Justin9</dc:creator>
      <dc:date>2022-02-28T01:47:06Z</dc:date>
    </item>
    <item>
      <title>Re: De-duplicate value from column based on highest value from another column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799047#M314154</link>
      <description>&lt;P&gt;Sort the data and use FIRST. processing.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort data=have out=sorted;
  by quarter descending rate;
run;

data want;
  set sorted;
  by quarter;
  if first.quarter;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If there are variables you can use to help decide which one of the ties to pick add those to the BY statement in the proc sort step.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 28 Feb 2022 02:12:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799047#M314154</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-02-28T02:12:34Z</dc:date>
    </item>
    <item>
      <title>Re: De-duplicate value from column based on highest value from another column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799050#M314157</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/321018"&gt;@Justin9&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Thanks for your post. My example had 2021Q1 with two observations with the maximum rate (see original post), so does that impact on the code that you've just posted?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Then which record with the tied rates do you want?&lt;/P&gt;</description>
      <pubDate>Mon, 28 Feb 2022 02:43:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/De-duplicate-value-from-column-based-on-highest-value-from/m-p/799050#M314157</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2022-02-28T02:43:46Z</dc:date>
    </item>
  </channel>
</rss>

