<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to deduplicate data based on various conditions in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/814778#M321611</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;How about working on deleting the obs using 2 steps. Meaning removing or retaining the second obs depending on the first step coding and then removing or retaining the third obs depending on a second step coding.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 23 May 2022 19:31:04 GMT</pubDate>
    <dc:creator>mayasak</dc:creator>
    <dc:date>2022-05-23T19:31:04Z</dc:date>
    <item>
      <title>How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/812488#M320581</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;In the following dataset I want to remove duplicates based on several conditions:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;For "disease" = Auris, keep only first observation if "id" and "DOB" are identical.&lt;/LI&gt;&lt;LI&gt;For "disease" = Acino keep first and last obs&amp;nbsp;if "id" and "DOB" are identical.&lt;/LI&gt;&lt;LI&gt;For "disease" = CRE, if "id" and "DOB" are identical, keep all obs that have a date difference of more than 12 months, else keep the first obs and delete obs with &amp;lt; 12 months difference.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;data have;&lt;BR /&gt;input Id Disease $ DOB :mmddyy10. Date :mmddyy10.;&lt;BR /&gt;format DOB mmddyy10. date mmddyy10.;&lt;BR /&gt;datalines;&lt;BR /&gt;123 Auris 01/08/1961 01/01/2018&lt;BR /&gt;123 CRE 01/08/1961 09/02/2020&lt;BR /&gt;344 CRE 02/12/1956 08/06/2019&lt;BR /&gt;344 CRE 02/12/1956 03/06/2020&lt;BR /&gt;344 CRE 02/12/1956 03/03/2022&lt;BR /&gt;323 CRE 07/01/1993 01/06/2019&lt;BR /&gt;323 CRE 07/01/1993 09/06/2020&lt;BR /&gt;323 CRE 07/01/1993 09/31/2020&lt;BR /&gt;167 Acino 12/09/2001 03/06/2019&lt;BR /&gt;167 Acino 12/09/2001 04/31/2020&lt;BR /&gt;167 Acino 12/09/2001 09/03/2021&lt;/P&gt;&lt;P&gt;912 CRE 03/01/2012 03/03/2018&lt;BR /&gt;912 CRE 03/01/2012 05/06/2019&lt;BR /&gt;912 CRE 03/01/2012 09/06/2020&lt;BR /&gt;256 Auris 05/27/1983 08/05/2020&lt;/P&gt;&lt;P&gt;256 Auris 05/27/1983 12/07/2020&lt;/P&gt;&lt;P&gt;256 Auris 05/27/1983 10/07/2021&lt;BR /&gt;256 Auris 05/27/1983 02/07/2022&lt;BR /&gt;317 Acino 07/17/1985 12/07/2018&lt;BR /&gt;317 Acino 07/17/1985 01/03/2018&lt;/P&gt;&lt;P&gt;409 CRE 08/07/1987 03/03/2018&lt;BR /&gt;409 CRE 08/07/1987 05/06/2019&lt;BR /&gt;409 CRE 08/07/1987 09/06/2019&lt;/P&gt;&lt;P&gt;409 CRE 08/07/1987 10/06/2021&lt;BR /&gt;;;;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The result should be as this:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;123 Auris 01/08/1961 01/01/2018&lt;BR /&gt;123 CRE 01/08/1961 
09/02/2020&lt;/P&gt;&lt;P&gt;344 CRE 02/12/1956 08/06/2019&lt;BR /&gt;344 CRE 02/12/1956 03/03/2022&lt;BR /&gt;323 CRE 07/01/1993 01/06/2019&lt;BR /&gt;323 CRE 07/01/1993 09/31/2020&lt;BR /&gt;167 Acino 12/09/2001 03/06/2019&lt;BR /&gt;167 Acino 12/09/2001 09/03/2021&lt;/P&gt;&lt;P&gt;912 CRE 03/01/2012 03/03/2018&lt;BR /&gt;912 CRE 03/01/2012 05/06/2019&lt;BR /&gt;912 CRE 03/01/2012 09/06/2020&lt;/P&gt;&lt;P&gt;256 Auris 05/27/1983 08/05/2020&lt;/P&gt;&lt;P&gt;317 Acino 07/17/1985 12/07/2018&lt;BR /&gt;317 Acino 07/17/1985 01/03/2018&lt;/P&gt;&lt;P&gt;409 CRE 08/07/1987 03/03/2018&lt;BR /&gt;409 CRE 08/07/1987 05/06/2019&lt;BR /&gt;409 CRE 08/07/1987 10/06/2021&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please advise,&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 10 May 2022 20:19:16 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/812488#M320581</guid>
      <dc:creator>mayasak</dc:creator>
      <dc:date>2022-05-10T20:19:16Z</dc:date>
    </item>
    <item>
      <title>Re: How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/812516#M320598</link>
      <description>&lt;P&gt;Thanks for writing a DATA step to generate dataset HAVE.&amp;nbsp; But did you run it and read your log?&amp;nbsp; The data has two invalid dates: 9/31/2020 and 4/31/2020.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And to clarify "&lt;SPAN&gt;keep all obs that have a date difference of more than 12 months".&amp;nbsp; Do you mean gaps of more than 12 months between successive records for a CRE/DOB group?&amp;nbsp; Or do you mean more than 12 months from the DATE of the first record?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 11 May 2022 01:15:26 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/812516#M320598</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2022-05-11T01:15:26Z</dc:date>
    </item>
    <item>
      <title>Re: How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/812519#M320600</link>
      <description>Thank you for your reply. Yes, I noticed the dates. It should be 04/30/2020 and 09/30/2020, sorry about that.&lt;BR /&gt;What I meant is obs with "date" var. which is the testing date. So if the first test was done on 01/02/2019 it will be the first event. Any other test/s for CRE before 01/02/2020 should be removed and the first test will be kept only. If this same person had more tests after 01/02/2020, such as 02/07/2021 obs will be kept. If he had another one on 04/08/2021, we should keep this last test and keep 02/07/2021. Hope this clarifies &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;</description>
      <pubDate>Wed, 11 May 2022 01:37:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/812519#M320600</guid>
      <dc:creator>mayasak</dc:creator>
      <dc:date>2022-05-11T01:37:46Z</dc:date>
    </item>
    <item>
      <title>Re: How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/812523#M320602</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;What I meant is obs with "date" var. which is the testing date. So if the first test was done on 01/02/2019 it will be the first event. Any other test/s for CRE before 01/02/2020 should be removed and the first test will be kept only. If this same person had more tests after 01/02/2020, such as 02/07/2021 obs will be kept. &lt;EM&gt;&lt;STRONG&gt;If he had another one on 04/08/2021, we should keep this last test and keep 02/07/2021.&lt;/STRONG&gt;&lt;/EM&gt;&amp;nbsp;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Regarding your comment above that I put in &lt;EM&gt;&lt;STRONG&gt;bold italics&lt;/STRONG&gt;&lt;/EM&gt; - for ID 323 disease CRE your data has&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;323 CRE 07/01/1993 01/06/2019&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;323 CRE 07/01/1993 09/06/2020&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;323 CRE 07/01/1993 09/30/2020,&amp;nbsp; corrected from 09/31/2020&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;which based on your description suggests to me that you would keep the first date (01/06/2019) and the second and third dates, both of which fall after the one year mark (01/06/2020).&lt;/SPAN&gt;&lt;/P&gt;
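&lt;P&gt;As a quick check of that arithmetic, here is a minimal sketch (the variable names are only illustrative, not from your data) that computes the one-year mark with INTNX and compares the two later dates against it. The log should show a cutoff of 06JAN2020 and both later dates flagged as falling after it:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  /* first CRE test date for ID 323 */
  first_date = '06jan2019'd;
  /* same calendar day one year later */
  cutoff = intnx('year', first_date, 1, 'sameday');
  put cutoff= date9.;
  /* the two later test dates (09/30/2020 is the corrected value) */
  do test_date = '06sep2020'd, '30sep2020'd;
    after_cutoff = (test_date &amp;gt; cutoff);  /* 1 if later than the one-year mark */
    put test_date= date9. after_cutoff=;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;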
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;But your example expected result only has 01/06/2019 and 09/30/2020 (corrected from 09/31/2020).&amp;nbsp; Missing is 09/06/2020.&amp;nbsp; So is the expected result wrong, or is my understanding of your selection rule in error?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 11 May 2022 02:28:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/812523#M320602</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2022-05-11T02:28:17Z</dc:date>
    </item>
    <item>
      <title>Re: How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/812530#M320606</link>
      <description>&lt;P&gt;Assuming in the case of disease='CRE' you want the first observation and all observations more than a year later, then:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want  (where=(disease='CRE')  drop=_:);
  set have;
  by id disease dob notsorted;
  retain _cutoff_date;
  if first.dob then do;
     if disease='CRE' then _cutoff_date=intnx('year',date,1,'sameday');
     else _cutoff_date='31dec9999'd;
  end;

  if (disease='Auris' and first.dob=1)
  or (disease='Acino' and (first.dob=1 or last.dob=1))
  or (disease='CRE' and (first.dob=1 or date&amp;gt;_cutoff_date));
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;This assumes that data for a given ID/DISEASE/DOB are grouped into consecutive observations.&lt;/P&gt;
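&lt;P&gt;If that grouping might not hold in your real data, one possible way to guarantee it, and to put each group in DATE order at the same time, is to sort first. This is only a sketch: HAVE_SORTED is a made-up name, and the DATA step above would then need to read it in place of HAVE. Note that sorting changes the row order within a group, so for ID 317 the 01/03/2018 record would come before 12/07/2018.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Group by ID/DISEASE/DOB and order each group by test date */
proc sort data=have out=have_sorted;
  by id disease dob date;
run;&lt;/CODE&gt;&lt;/PRE&gt;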
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Edit note: It also assumes that within a given ID/DISEASE/DOB group, the data are sorted by DATE.&lt;/P&gt;</description>
      <pubDate>Wed, 11 May 2022 13:53:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/812530#M320606</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2022-05-11T13:53:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/813124#M320859</link>
      <description>&lt;P&gt;Hi mkeintz,&lt;/P&gt;&lt;P&gt;Thank you for your reply. I ran the code but it did not work as intended. The result deleted all obs for diseases (Acino and Auris) other than CRE. Also for CRE, there are some obs that should've been deleted but they're still in the result (bold italic).&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;&lt;P&gt;This is what I got as a result:&lt;/P&gt;&lt;DIV class=""&gt;&lt;DIV align="center"&gt;&lt;TABLE cellspacing="0" cellpadding="5"&gt;&lt;THEAD&gt;&lt;TR&gt;&lt;TH&gt;Obs&lt;/TH&gt;&lt;TH&gt;Id&lt;/TH&gt;&lt;TH&gt;Disease&lt;/TH&gt;&lt;TH&gt;DOB&lt;/TH&gt;&lt;TH&gt;Date&lt;/TH&gt;&lt;/TR&gt;&lt;/THEAD&gt;&lt;TBODY&gt;
&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;123&lt;/TD&gt;&lt;TD&gt;CRE&lt;/TD&gt;&lt;TD&gt;01/08/1961&lt;/TD&gt;&lt;TD&gt;09/02/2020&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;344&lt;/TD&gt;&lt;TD&gt;CRE&lt;/TD&gt;&lt;TD&gt;02/12/1956&lt;/TD&gt;&lt;TD&gt;08/06/2019&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;344&lt;/TD&gt;&lt;TD&gt;CRE&lt;/TD&gt;&lt;TD&gt;02/12/1956&lt;/TD&gt;&lt;TD&gt;03/03/2022&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;4&lt;/TD&gt;&lt;TD&gt;323&lt;/TD&gt;&lt;TD&gt;CRE&lt;/TD&gt;&lt;TD&gt;07/01/1993&lt;/TD&gt;&lt;TD&gt;01/06/2019&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;STRONG&gt;5&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;323&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;CRE&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;07/01/1993&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;09/06/2020&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;STRONG&gt;6&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;323&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;CRE&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;07/01/1993&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;09/30/2020&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;7&lt;/TD&gt;&lt;TD&gt;912&lt;/TD&gt;&lt;TD&gt;CRE&lt;/TD&gt;&lt;TD&gt;03/01/2012&lt;/TD&gt;&lt;TD&gt;03/03/2018&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;8&lt;/TD&gt;&lt;TD&gt;912&lt;/TD&gt;&lt;TD&gt;CRE&lt;/TD&gt;&lt;TD&gt;03/01/2012&lt;/TD&gt;&lt;TD&gt;05/06/2019&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;9&lt;/TD&gt;&lt;TD&gt;912&lt;/TD&gt;&lt;TD&gt;CRE&lt;/TD&gt;&lt;TD&gt;03/01/2012&lt;/TD&gt;&lt;TD&gt;09/06/2020&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;10&lt;/TD&gt;&lt;TD&gt;409&lt;/TD&gt;&lt;TD&gt;CRE&lt;/TD&gt;&lt;TD&gt;08/07/1987&lt;/TD&gt;&lt;TD&gt;03/03/2018&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;11&lt;/TD&gt;&lt;TD&gt;409&lt;/TD&gt;&lt;TD&gt;CRE&lt;/TD&gt;&lt;TD&gt;08/07/1987&lt;/TD&gt;&lt;TD&gt;05/06/2019&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;&lt;STRONG&gt;12&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;409&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;CRE&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;08/07/1987&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;TD&gt;&lt;EM&gt;&lt;STRONG&gt;09/06/2019&lt;/STRONG&gt;&lt;/EM&gt;&lt;/TD&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;13&lt;/TD&gt;&lt;TD&gt;409&lt;/TD&gt;&lt;TD&gt;CRE&lt;/TD&gt;&lt;TD&gt;08/07/1987&lt;/TD&gt;&lt;TD&gt;10/06/2021&lt;/TD&gt;&lt;/TR&gt;
&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 13 May 2022 04:26:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/813124#M320859</guid>
      <dc:creator>mayasak</dc:creator>
      <dc:date>2022-05-13T04:26:40Z</dc:date>
    </item>
    <item>
      <title>Re: How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/813304#M320956</link>
      <description>&lt;P&gt;Please take another look at my response.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I said&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;Assuming in the case of disease='CRE' you want the first observation and all observations more than a year later, then:&lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&lt;SPAN&gt;So my code was specifically referring to CRE&amp;nbsp; (&lt;EM&gt;&lt;STRONG&gt;"in the case of disease='CRE'"&lt;/STRONG&gt;&lt;/EM&gt;).&amp;nbsp; Notice in the first statement (the DATA statement), there is a WHERE= filter to keep CRE data only.&amp;nbsp; &amp;nbsp;Just remove that filter.&lt;/SPAN&gt;&lt;/P&gt;
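&lt;P&gt;For reference, here is a sketch of the same step with only the WHERE= option dropped from the DATA statement. Everything else is unchanged, so the CRE rule is still "first observation plus everything more than a year after it", and it still assumes each ID/DISEASE/DOB group is contiguous and in DATE order:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want (drop=_:);   /* no WHERE= filter, so Auris and Acino rows are output too */
  set have;
  by id disease dob notsorted;
  retain _cutoff_date;
  if first.dob then do;
     /* one-year mark counted from the first record of the group */
     if disease='CRE' then _cutoff_date=intnx('year',date,1,'sameday');
     else _cutoff_date='31dec9999'd;
  end;

  /* subsetting IF: keep the row only when its disease rule is met */
  if (disease='Auris' and first.dob=1)
  or (disease='Acino' and (first.dob=1 or last.dob=1))
  or (disease='CRE' and (first.dob=1 or date&amp;gt;_cutoff_date));
run;&lt;/CODE&gt;&lt;/PRE&gt;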
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Second, I stated, based on your previous comments, that for CRE "you want the first observation &lt;EM&gt;&lt;STRONG&gt;and all observations more than a year later&lt;/STRONG&gt;&lt;/EM&gt;".&amp;nbsp; If that was wrong, then please restate your criterion for CRE cases, which I clearly do not understand.&amp;nbsp; &amp;nbsp;Pointing out the observations that you don't want doesn't help me, since they fit perfectly my stated understanding.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 13 May 2022 20:45:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/813304#M320956</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2022-05-13T20:45:34Z</dc:date>
    </item>
    <item>
      <title>Re: How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/813401#M321030</link>
      <description>&lt;P&gt;Hi mkeintz,&lt;/P&gt;&lt;P&gt;Thank you so much for the help. I got what you said about the CRE filtering. Sorry for not making the criterion clear. The criteria are as follows:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;If we have 2 obs: we keep the first obs and remove the second if it's &amp;lt; 12 months from the first. Otherwise (&amp;gt; 12 months difference) we keep it.&lt;/LI&gt;&lt;LI&gt;If we have 3 obs: we keep the first and remove the second and last if both are &amp;lt; 12 months from the first (e.g. 1/1/2018, 1/3/2018, 3/9/2018). Otherwise, we keep the first and last if the last is &amp;gt; 12 months from the first and the second is &amp;lt; 12 months from either the first or last, such as #344. Otherwise, we keep all three if the second is &amp;gt; 12 months from the first and &amp;gt; 12 months from the last, such as #912.&amp;nbsp;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Hope this is clear now.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Sun, 15 May 2022 23:57:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/813401#M321030</guid>
      <dc:creator>mayasak</dc:creator>
      <dc:date>2022-05-15T23:57:13Z</dc:date>
    </item>
    <item>
      <title>Re: How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/813407#M321033</link>
      <description>&lt;P&gt;Are you looking to count distinct incidences of disease?&amp;nbsp; Is the one-year window from the start of the incident? Or from the previous observation?&amp;nbsp; For example, if you had one record per month for 2 years, is that one incident or two?&lt;/P&gt;</description>
      <pubDate>Mon, 16 May 2022 00:39:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/813407#M321033</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-05-16T00:39:13Z</dc:date>
    </item>
    <item>
      <title>Re: How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/813642#M321148</link>
      <description>&lt;P&gt;Hi Tom,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The one-year window is from the previous obs (test). For example,&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;If the person had an obs on 1/5/2019, it is the first and is always kept.&lt;/LI&gt;&lt;LI&gt;All other obs that happened before 1/5/2020 should be removed.&lt;/LI&gt;&lt;LI&gt;If the second obs was on 6/8/2020, it should be kept if no other obs happened after it or if the third one is also 1 year apart from the second, such as on 7/9/2022.&amp;nbsp; In this case, we keep the three.&lt;/LI&gt;&lt;LI&gt;If the third one was on 7/9/2020, only the first (1/5/2019) and the last (7/9/2020) should be kept and the second obs (6/8/2020) should be removed.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Tue, 17 May 2022 00:14:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/813642#M321148</guid>
      <dc:creator>mayasak</dc:creator>
      <dc:date>2022-05-17T00:14:39Z</dc:date>
    </item>
    <item>
      <title>Re: How to deduplicate data based on various conditions</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/814778#M321611</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;How about working on deleting the obs using 2 steps. Meaning removing or retaining the second obs depending on the first step coding and then removing or retaining the third obs depending on a second step coding.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 23 May 2022 19:31:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-deduplicate-data-based-on-various-conditions/m-p/814778#M321611</guid>
      <dc:creator>mayasak</dc:creator>
      <dc:date>2022-05-23T19:31:04Z</dc:date>
    </item>
  </channel>
</rss>

