<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Delete Almost Duplicate Rows in SAS Enterprise Guide</title>
    <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37802#M4625</link>
    <description>Not sure how to do this using the EG wizards.&lt;BR /&gt;
&lt;BR /&gt;
Below a code example. I believe you've done the first 2 steps already (just sort by payroll_id and mod_date - both ascending).&lt;BR /&gt;
&lt;BR /&gt;
The last step then picks the last observation withing  a payroll_id (the last modified entry). Hope this is what you're after.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
/* create some sample data */&lt;BR /&gt;
data have;&lt;BR /&gt;
  format payroll_id 8.;&lt;BR /&gt;
  format mod_date date9.;&lt;BR /&gt;
  do payroll_id=1 to 10;&lt;BR /&gt;
    do mod_date='01jan2010'd to '31mar2010'd;&lt;BR /&gt;
      example_changing_var+1;&lt;BR /&gt;
      if ceil(ranuni(1)*100) le 5 then output;&lt;BR /&gt;
    end;&lt;BR /&gt;
  end;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* sort by payroll_id and mod_date */&lt;BR /&gt;
proc sort data=have out=haveSorted;&lt;BR /&gt;
  by payroll_id mod_date;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* pick the last observation per payroll_id */&lt;BR /&gt;
data want;&lt;BR /&gt;
  set haveSorted;&lt;BR /&gt;
  by payroll_id;&lt;BR /&gt;
  if last.payroll_id then output;&lt;BR /&gt;
run;

Message was edited by: Patrick</description>
    <pubDate>Sat, 19 Jun 2010 18:37:21 GMT</pubDate>
    <dc:creator>Patrick</dc:creator>
    <dc:date>2010-06-19T18:37:21Z</dc:date>
    <item>
      <title>Delete Almost Duplicate Rows</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37800#M4623</link>
      <description>I need help from the experts!&lt;BR /&gt;
&lt;BR /&gt;
What I have done, is create a query for a full enterprise employee list within EG.  I do not believe there are any primary keys in hte table.  The problem is that if anything changes with the employee, it creates a new entry.  Therefore I have a tons of entries for each employee.&lt;BR /&gt;
&lt;BR /&gt;
I have created two sorts.  One to sort the payroll_id # by ascending and one to sort the mod_date by decending so the last date modified is on top.&lt;BR /&gt;
&lt;BR /&gt;
I am brand spanking new to SAS and EG.  I am not sure how to get the duplicate entries out from here.  I have spent hours now on this and am totally confused.  Can anyone out there help?&lt;BR /&gt;
&lt;BR /&gt;
Dean</description>
      <pubDate>Fri, 18 Jun 2010 20:39:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37800#M4623</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2010-06-18T20:39:23Z</dc:date>
    </item>
    <item>
      <title>Re: Delete Almost Duplicate Rows</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37801#M4624</link>
      <description>Forgot to mention that I am running EG 4.1...</description>
      <pubDate>Fri, 18 Jun 2010 23:57:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37801#M4624</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2010-06-18T23:57:31Z</dc:date>
    </item>
    <item>
      <title>Re: Delete Almost Duplicate Rows</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37802#M4625</link>
      <description>Not sure how to do this using the EG wizards.&lt;BR /&gt;
&lt;BR /&gt;
Below a code example. I believe you've done the first 2 steps already (just sort by payroll_id and mod_date - both ascending).&lt;BR /&gt;
&lt;BR /&gt;
The last step then picks the last observation withing  a payroll_id (the last modified entry). Hope this is what you're after.&lt;BR /&gt;
&lt;BR /&gt;
&lt;BR /&gt;
/* create some sample data */&lt;BR /&gt;
data have;&lt;BR /&gt;
  format payroll_id 8.;&lt;BR /&gt;
  format mod_date date9.;&lt;BR /&gt;
  do payroll_id=1 to 10;&lt;BR /&gt;
    do mod_date='01jan2010'd to '31mar2010'd;&lt;BR /&gt;
      example_changing_var+1;&lt;BR /&gt;
      if ceil(ranuni(1)*100) le 5 then output;&lt;BR /&gt;
    end;&lt;BR /&gt;
  end;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* sort by payroll_id and mod_date */&lt;BR /&gt;
proc sort data=have out=haveSorted;&lt;BR /&gt;
  by payroll_id mod_date;&lt;BR /&gt;
run;&lt;BR /&gt;
&lt;BR /&gt;
/* pick the last observation per payroll_id */&lt;BR /&gt;
data want;&lt;BR /&gt;
  set haveSorted;&lt;BR /&gt;
  by payroll_id;&lt;BR /&gt;
  if last.payroll_id then output;&lt;BR /&gt;
run;

Message was edited by: Patrick</description>
      <pubDate>Sat, 19 Jun 2010 18:37:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37802#M4625</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2010-06-19T18:37:21Z</dc:date>
    </item>
    <item>
      <title>Re: Delete Almost Duplicate Rows</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37803#M4626</link>
      <description>Okay... i did use your code... and it did produce something... but not quite sure what is produced... i tried a couple of different things to modify to try to get it to work for me... but it wouldn't take out the duplicate entries... &lt;BR /&gt;
&lt;BR /&gt;
I will post my Query PROC SQL code that came from Enterprise Guide... mabye you can make more sense of what i need to do by seeing the code that is already generated.&lt;BR /&gt;
&lt;BR /&gt;
%_eg_conditional_dropds(WORK.Query_for_agent_dgas_final);&lt;BR /&gt;
&lt;BR /&gt;
PROC SQL;&lt;BR /&gt;
 CREATE TABLE WORK.Query_for_agent_dgas_final AS SELECT DISTINCT agent_dgas_final.mod_date,&lt;BR /&gt;
	 agent_dgas_final.EMPLOYEE_NUMBER,&lt;BR /&gt;
	 agent_dgas_final.First_Name,&lt;BR /&gt;
	 agent_dgas_final.Last_Name,&lt;BR /&gt;
	 agent_dgas_final.Department,&lt;BR /&gt;
	 agent_dgas_final.Agent_Job_Class,&lt;BR /&gt;
	 agent_dgas_final.Office_ID,&lt;BR /&gt;
	 agent_dgas_final.Supr_ID,&lt;BR /&gt;
	 agent_dgas_final.Agent_login,&lt;BR /&gt;
	 agent_dgas_final.EMPLOYEE_TYPE,&lt;BR /&gt;
	 agent_dgas_final.HIRE_DATE,&lt;BR /&gt;
	 agent_dgas_final.TERMINATION_DATE,&lt;BR /&gt;
	 agent_dgas_final.CRTD_CMSID &lt;BR /&gt;
 FROM EC100020.AGENT_DGAS_FINAL AS agent_dgas_final&lt;BR /&gt;
 WHERE agent_dgas_final.EMPLOYEE_NUMBER NOT IS NULL  AND agent_dgas_final.EMPLOYEE_NUMBER &amp;gt; "000002"&lt;BR /&gt;
 GROUP BY agent_dgas_final.EMPLOYEE_NUMBER&lt;BR /&gt;
 ORDER BY agent_dgas_final.EMPLOYEE_NUMBER, agent_dgas_final.mod_date;&lt;BR /&gt;
QUIT;&lt;BR /&gt;
&lt;BR /&gt;
That gives me the list that i need... just gives me about 70-200 entries per agent... versus the one i am looking for.  I swear I will learn this backwards and forwards... the language syntax is so much different from access... and its really confusing to me..&lt;BR /&gt;
&lt;BR /&gt;
Dean&lt;BR /&gt;
&lt;BR /&gt;
Message was edited by: Dean_Z

Message was edited by: Dean_Z</description>
      <pubDate>Sat, 19 Jun 2010 22:00:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37803#M4626</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2010-06-19T22:00:52Z</dc:date>
    </item>
    <item>
      <title>Re: Delete Almost Duplicate Rows</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37804#M4627</link>
      <description>WOW...I finally figured it out... and this is what i did...&lt;BR /&gt;
&lt;BR /&gt;
Once i had my query completed...  i right clicked on results and did a sort data... from there i put in the Employee_ID into the sort area... then clicked options on the left and chose to only display the first line.  In my sort on the original query, I changed the date back to decending so the first line was the last modified line in the data... &lt;BR /&gt;
&lt;BR /&gt;
Then I ran the query and walla, the data magically appeared exactly the way i wanted.  Thanks for your help Patrick... after looking over your code a thousand times, i finally figured out what to do... thanks!!!!!!!&lt;BR /&gt;
&lt;BR /&gt;
Dean</description>
      <pubDate>Sun, 20 Jun 2010 02:27:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/Delete-Almost-Duplicate-Rows/m-p/37804#M4627</guid>
      <dc:creator>deleted_user</dc:creator>
      <dc:date>2010-06-20T02:27:52Z</dc:date>
    </item>
  </channel>
</rss>

