<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Getting one record from near duplicate records in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Getting-one-record-from-near-duplicate-records/m-p/57052#M15926</link>
    <description>Hi All&lt;BR /&gt;
&lt;BR /&gt;
I have a dataset that has near duplicate records - not exactly duplicates - take for example&lt;BR /&gt;
&lt;BR /&gt;
1234   PHay   ABSS   EFGH  $20524  20Jun2008&lt;BR /&gt;
1234   PHay   ABCD  EFGH  $11000  25Jun2008&lt;BR /&gt;
&lt;BR /&gt;
Now, what I am trying to do is take only the latest record by date (in this case I want to retreive only the second field) - how can I acheive this?&lt;BR /&gt;
&lt;BR /&gt;
I tried the Max() SQL Function, but then I need to use a function on all of the columns, as I need to get all the columns.&lt;BR /&gt;
&lt;BR /&gt;
If I used MAX(Date), MAX(Salary), etc  I would get back the first records' $20524 value, which is not correct.&lt;BR /&gt;
&lt;BR /&gt;
Any help much appreciated.&lt;BR /&gt;
&lt;BR /&gt;
Thanks!!

Message was edited by: sdcruz</description>
    <pubDate>Mon, 27 Oct 2008 09:12:11 GMT</pubDate>
    <dc:creator>deleted_user</dc:creator>
    <dc:date>2008-10-27T09:12:11Z</dc:date>
    <item>
      <title>Getting one record from near duplicate records</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Getting-one-record-from-near-duplicate-records/m-p/57052#M15926</link>
      <description>Hi All&lt;BR /&gt;
&lt;BR /&gt;
I have a dataset that has near duplicate records - not exactly duplicates - take for example&lt;BR /&gt;
&lt;BR /&gt;
1234   PHay   ABSS   EFGH  $20524  20Jun2008&lt;BR /&gt;
1234   PHay   ABCD  EFGH  $11000  25Jun2008&lt;BR /&gt;
&lt;BR /&gt;
Now, what I am trying to do is take only the latest record by date (in this case I want to retreive only the second field) - how can I acheive this?&lt;BR /&gt;
&lt;BR /&gt;
I tried the Max() SQL Function, but then I need to use a function on all of the columns, as I need to get all the columns.&lt;BR /&gt;
&lt;BR /&gt;
If I used MAX(Date), MAX(Salary), etc  I would get back the first records' $20524 value, which is not correct.&lt;BR /&gt;
&lt;BR /&gt;
Any help much appreciated.&lt;BR /&gt;
&lt;BR /&gt;
Thanks!!

Message was edited by: sdcruz</description>
      <pubDate>Mon, 27 Oct 2008 09:12:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Getting-one-record-from-near-duplicate-records/m-p/57052#M15926</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-10-27T09:12:11Z</dc:date>
    </item>
    <item>
      <title>Re: Getting one record from near duplicate records</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Getting-one-record-from-near-duplicate-records/m-p/57053#M15927</link>
      <description>If you want to use SQL, here's on solution:&lt;BR /&gt;
&lt;BR /&gt;
26         proc sql;&lt;BR /&gt;
27         select id, name, str1, str2, salary, date&lt;BR /&gt;
28         from salary&lt;BR /&gt;
29         group by id&lt;BR /&gt;
30         having date = max(date);&lt;BR /&gt;
NOTE: The query requires remerging summary statistics back with the original data.&lt;BR /&gt;
31         quit;&lt;BR /&gt;
&lt;BR /&gt;
As you see, this creates one summary query that is merged with original query, so this might be the best method if your data is huge. Other methods could involve two subsequent PROC SORT, the second one using NODUPKEY on your id column.&lt;BR /&gt;
&lt;BR /&gt;
/Linus</description>
      <pubDate>Mon, 27 Oct 2008 11:10:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Getting-one-record-from-near-duplicate-records/m-p/57053#M15927</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2008-10-27T11:10:32Z</dc:date>
    </item>
    <item>
      <title>Re: Getting one record from near duplicate records</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Getting-one-record-from-near-duplicate-records/m-p/57054#M15928</link>
      <description>That should do the job:&lt;BR /&gt;
&lt;BR /&gt;
proc sort data=yourdata;&lt;BR /&gt;
  by id date;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
data want;&lt;BR /&gt;
  set yourdata;&lt;BR /&gt;
  by id date;&lt;BR /&gt;
  if last.id then output;&lt;BR /&gt;
run;</description>
      <pubDate>Mon, 27 Oct 2008 11:45:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Getting-one-record-from-near-duplicate-records/m-p/57054#M15928</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2008-10-27T11:45:59Z</dc:date>
    </item>
    <item>
      <title>Re: Getting one record from near duplicate records</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Getting-one-record-from-near-duplicate-records/m-p/57055#M15929</link>
      <description>Hi Guys&lt;BR /&gt;
&lt;BR /&gt;
Thanks alot for your help - yes I resolved this by using the PROC SQL procedure as outlined by Linus.&lt;BR /&gt;
&lt;BR /&gt;
Thanks again - much appreciated!&lt;BR /&gt;
&lt;BR /&gt;
Cheers&lt;BR /&gt;
Shelton.</description>
      <pubDate>Tue, 28 Oct 2008 05:21:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Getting-one-record-from-near-duplicate-records/m-p/57055#M15929</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2008-10-28T05:21:20Z</dc:date>
    </item>
  </channel>
</rss>

