Hi,
I would like to delete a duplicate if a customer has purchased something, and then within 30 days purchases again. (Doesn't matter what kind of purchase.)
I have a table with non-duplicate and duplicate customer numbers (CustNo). In one column I have CustNo, in another Date.
The table is sorted by CustNo then Date. I want to make a new table where I have removed duplicates with these conditions:
1. CustNo=CustNo
2. The date is in within 30 days before the next date. If there is no next date for that CustNo then keep.
E.g:
CustNo Date
123 01/11/18
123 15/11/18
123 25/11/18
123 05/12/18
123 10/01/19
456 20/11/18
Want outcome:
CustNo Date
123 05/12/18
123 10/01/19
456 20/11/18
I'm on version 7.1
Thanks 🙂
Or i am assuming you are perhaps after this
data want;
set have;
by custno;
retain d;
if first.custno and last.custno then output;
else if first.custno then d=date;
else if intck('days',d,date)>30 then do; d=date;output;end;
drop d;
run;
Just to make sure I am understanding this
05/12/18 is not removed because it is more than 30 days from the start date of 01/11/18. It doesn't matter that it is not more than 30 days since the previous purchase on 25/11/18.
data have;
input CustNo Date :ddmmyy8.;
format date ddmmyy8.;
cards;
123 01/11/18
123 15/11/18
123 25/11/18
123 05/12/18
123 10/01/19
456 20/11/18
;
data want;
set have;
by custno;
retain d;
if first.custno then do; d=date;output;end;
else if intck('days',d,date)>30 then do; d=date;output;end;
drop d;
run;
I am not sure why the 1st purchase date of custno 123 is ignored in your expected output
Or i am assuming you are perhaps after this
data want;
set have;
by custno;
retain d;
if first.custno and last.custno then output;
else if first.custno then d=date;
else if intck('days',d,date)>30 then do; d=date;output;end;
drop d;
run;
I think the example data provided is weak and doesn't cover realistic possible examples, and so I ask for clarification from @Julia5, what do you want to happen if these are the dates for one customer?
01JAN18
09JAN18
14JAN18
01MAR18
05MAR18
Should the output include 01MAR18 and 05MAR18, because they are more than 30 days from the start? Or should the output include 01MAR18 but not 05MAR18 because 05MAR18 is less than 30 days from the previous purchase while 01MAR18 is more than 30 days from any previous purchase?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.