<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Dedup with Condition in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885265#M349798</link>
    <description>&lt;P&gt;Stealing the example data set from &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data Have;
  input cust_id $ date datetime.;
  format date datetime21. month monyy7.;
  month = intnx('MONTH', datepart(date), 0, 'B');

datalines;
12345 21APR2023:15:48:30
12345 23APR2023:15:50:30
12345 23APR2023:17:48:30
12345 25APR2023:11:48:30
12345 21Jun2023:15:35:30
;
run;

proc summary data=have nway;
   class cust_id date;
   format date dtmonyy7. ;
   output out=want (drop=_:);
run;

proc print data=want;
   format date datetime18.;
run;&lt;/PRE&gt;
&lt;P&gt;When you apply a suitable format, such as DTMONYY, to a Class variable, Proc Summary groups by the formatted value and returns the lowest unformatted value in each group as the value of that variable. Here that is the earliest datetime in each month for each cust_id. Proc Summary also adds the automatic variables _TYPE_ and _FREQ_ (the observation count), which drop=_: removes because they are not wanted for this exercise.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note: Class may not be appropriate for large data sets with many combinations of the Class variables.&lt;/P&gt;</description>
    <pubDate>Tue, 18 Jul 2023 13:10:45 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2023-07-18T13:10:45Z</dc:date>
    <item>
      <title>Dedup with Condition</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885189#M349751</link>
      <description>&lt;P&gt;I have a dataset with many duplicates for each customer ID, what I want to achieve is keeping only one record in each month for one customer. If multiple records in a month, select the record with the earliest date. How could I achieve this? I have tried first. , last. but it only gives me one cust_ID one record.&lt;/P&gt;
&lt;P&gt;for example:&lt;/P&gt;
&lt;P&gt;cust_id date month&lt;/P&gt;
&lt;P&gt;12345 21APR23 15:48:30&lt;/P&gt;
&lt;P&gt;12345&amp;nbsp;23APR23 15:50:30&lt;/P&gt;
&lt;P&gt;12345&amp;nbsp;23APR23 17:48:30&lt;/P&gt;
&lt;P&gt;12345&amp;nbsp;25APR23 11:48:30&lt;/P&gt;
&lt;P&gt;12345&amp;nbsp;21Jun23 15:35:30&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;want:&lt;/P&gt;
&lt;P&gt;12345 21APR23 15:48:30&lt;/P&gt;
&lt;P&gt;12345&amp;nbsp;21Jun23 15:35:30&lt;/P&gt;</description>
      <pubDate>Tue, 18 Jul 2023 02:32:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885189#M349751</guid>
      <dc:creator>zhige50</dc:creator>
      <dc:date>2023-07-18T02:32:32Z</dc:date>
    </item>
    <item>
      <title>Re: Dedup with Condition</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885192#M349752</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data Have;
  input cust_id $ date datetime.;
  format date datetime21. month monyy7.;
  month = intnx('MONTH', datepart(date), 0, 'B');

datalines;
12345 21APR2023:15:48:30
12345 23APR2023:15:50:30
12345 23APR2023:17:48:30
12345 25APR2023:11:48:30
12345 21Jun2023:15:35:30
;
run;

proc sort data = Have;
  by cust_id month date;
run;

data Want;
  set Have;
  by  cust_id month;
  if first.month;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 18 Jul 2023 03:09:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885192#M349752</guid>
      <dc:creator>SASKiwi</dc:creator>
      <dc:date>2023-07-18T03:09:22Z</dc:date>
    </item>
    <item>
      <title>Re: Dedup with Condition</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885199#M349759</link>
      <description>&lt;P&gt;If your dataset has a DATE variable but doesn't already have a MONTH variable, you don't need to create MONTH:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
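&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  set have;   /* assumes have is sorted by cust_id and date */
  by cust_id ;
  /* call LAG on every record so its queue stays in step with the data */
  prev_date = lag(date);
  /* use 'dtmonth' instead of 'month' if DATE holds datetime values, as in the sample data */
  if first.cust_id=1   or   intck('month',prev_date,date)^=0;
  drop prev_date;
run;&lt;/CODE&gt;&lt;/PRE&gt;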
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 18 Jul 2023 04:13:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885199#M349759</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2023-07-18T04:13:31Z</dc:date>
    </item>
    <item>
      <title>Re: Dedup with Condition</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885265#M349798</link>
      <description>&lt;P&gt;Stealing the example data set from &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13976"&gt;@SASKiwi&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data Have;
  input cust_id $ date datetime.;
  format date datetime21. month monyy7.;
  month = intnx('MONTH', datepart(date), 0, 'B');

datalines;
12345 21APR2023:15:48:30
12345 23APR2023:15:50:30
12345 23APR2023:17:48:30
12345 25APR2023:11:48:30
12345 21Jun2023:15:35:30
;
run;

proc summary data=have nway;
   class cust_id date;
   format date dtmonyy7. ;
   output out=want (drop=_:);
run;

proc print data=want;
   format date datetime18.;
run;&lt;/PRE&gt;
&lt;P&gt;When you apply a suitable format, such as DTMONYY, to a Class variable, Proc Summary groups by the formatted value and returns the lowest unformatted value in each group as the value of that variable. Here that is the earliest datetime in each month for each cust_id. Proc Summary also adds the automatic variables _TYPE_ and _FREQ_ (the observation count), which drop=_: removes because they are not wanted for this exercise.&lt;/P&gt;
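&lt;P&gt;For comparison, here is a minimal Proc SQL sketch of the same idea: group on the formatted month and take the minimum datetime in each group. It assumes the Have data set above. The names want_sql, month_label and first_date are made up for illustration and are not part of the original example.&lt;/P&gt;
&lt;PRE&gt;proc sql;
   create table want_sql as
   select cust_id,
          /* formatted month, e.g. APR2023, used only as the grouping key */
          put(date, dtmonyy7.) as month_label,
          /* earliest datetime within each customer and month */
          min(date) as first_date format=datetime21.
   from have
   group by cust_id, calculated month_label;
quit;&lt;/PRE&gt;
&lt;P&gt;Like the Proc Summary version, this keeps only the grouping columns and the earliest datetime, not the other variables from the original record.&lt;/P&gt;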
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note: Class may not be appropriate for large data sets with many combinations of the Class variables.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Jul 2023 13:10:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885265#M349798</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-07-18T13:10:45Z</dc:date>
    </item>
    <item>
      <title>Re: Dedup with Condition</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885275#M349803</link>
      <description>&lt;P&gt;Maybe give a try with a month variable and proc sort nodupkey:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;data snacks;
	set SASHELP.snacks;
	month=compress(year(Date)||month(date));
run;

proc sort data=snacks nodupkey;by product month;run;&lt;/PRE&gt;</description>
      <pubDate>Tue, 18 Jul 2023 14:21:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Dedup-with-Condition/m-p/885275#M349803</guid>
      <dc:creator>framon</dc:creator>
      <dc:date>2023-07-18T14:21:46Z</dc:date>
    </item>
  </channel>
</rss>

