Hi,
Participants have varying number of visits/observations in this dataset. Var is binary (0/1).
I am trying to delete observations as of the first time each participant has consecutive 0s for var for >24 months (as seen in the image)
Thanks very much
HAVE | WANT | |||||
ID | var | Date | ID | var | Date | |
1 | 1 | 28/10/2014 | 1 | 1 | 28/10/2014 | |
1 | 0 | 26/01/2015 | 1 | 0 | 26/01/2015 | |
1 | 1 | 23/04/2015 | 1 | 1 | 23/04/2015 | |
1 | 0 | 21/07/2015 | 1 | 0 | 21/07/2015 | |
1 | 0 | 21/10/2015 | 1 | 0 | 21/10/2015 | |
1 | 0 | 20/01/2016 | 1 | 0 | 20/01/2016 | |
1 | 0 | 06/05/2016 | 1 | 0 | 06/05/2016 | |
1 | 0 | 05/08/2016 | 1 | 0 | 05/08/2016 | |
1 | 0 | 08/11/2016 | 1 | 0 | 08/11/2016 | |
1 | 0 | 08/02/2017 | 1 | 0 | 08/02/2017 | |
1 | 0 | 08/05/2017 | 1 | 0 | 08/05/2017 | |
1 | 0 | 21/08/2017 | 2 | 1 | 05/03/2012 | |
1 | 0 | 17/01/2018 | 2 | 0 | 17/07/2012 | |
2 | 1 | 05/03/2012 | 2 | 1 | 29/07/2013 | |
2 | 0 | 17/07/2012 | 2 | 1 | 28/10/2014 | |
2 | 1 | 29/07/2013 | ||||
2 | 1 | 28/10/2014 |
This minor change should suffice. Let me know at your earliest convenience.
data want;
set have;
k=lag(date);
by id no notsorted;
k1=intck('month',k,date);
if first.id then flag=.;
if first.id or first.no and no=0 then do;
k=0;
k1=0;
k2=0;
end;
retain flag;
if no=0 then k2+k1;
if k2>=24 then flag=1;
if not flag;
drop k: flag;
run;
data HAVE;
input ID var Date : ddmmyy10.;
format date ddmmyy10.;
cards;
1 1 28/10/2014 1 1 28/10/2014
1 0 26/01/2015 1 0 26/01/2015
1 1 23/04/2015 1 1 23/04/2015
1 0 21/07/2015 1 0 21/07/2015
1 0 21/10/2015 1 0 21/10/2015
1 0 20/01/2016 1 0 20/01/2016
1 0 06/05/2016 1 0 06/05/2016
1 0 05/08/2016 1 0 05/08/2016
1 0 08/11/2016 1 0 08/11/2016
1 0 08/02/2017 1 0 08/02/2017
1 0 08/05/2017 1 0 08/05/2017
1 0 21/08/2017 2 1 05/03/2012
1 0 17/01/2018 2 0 17/07/2012
2 1 05/03/2012 2 1 29/07/2013
2 0 17/07/2012 2 1 28/10/2014
2 1 29/07/2013
2 1 28/10/2014
;
data want;
set have;
k=lag(date);
by id var notsorted;
k1=intck('month',k,date);
if first.var then do;k=0;k1=0;k2=0; end;
k2+k1;
if k2>=24 and var=0 then delete;
drop k:;
run;
works great, thank you!
As a follow-up question, if you don't mind:
What if I would like to delete all following observations after the condition is met?
Ex: Assume ID#1 has an additional visit at the end where var=1. With the code above, this obs remains.
How would I be able to remove this and any other values that follow?
I tried to delete the condition "and var=0" from line 8 of your code and it doesn't seem to help with this.
Thanks again!
Try
data want;
set have;
k=lag(date);
by id var notsorted;
k1=intck('month',k,date);
if first.id or first.var and var=0 then do;
k=0;
k1=0;
k2=0;
end;
if var=0 then k2+k1;
if k2>=24 then delete;
drop k:;
run;
Thanks
It doesn't delete the 0s however (only the 1).
I attached an example of what I get for an ID (ex: visits 12-15 should not be included).
Original | Output | |||||
Date | Visit no. | Var | Date | Visit no. | Var | |
17-08-11 | 1 | 1 | 17-08-11 | 1 | 1 | |
20-11-13 | 2 | 0 | 20-11-13 | 2 | 0 | |
25-02-14 | 3 | 0 | 25-02-14 | 3 | 0 | |
27-05-14 | 4 | 0 | 27-05-14 | 4 | 0 | |
04-06-15 | 5 | 0 | 04-06-15 | 5 | 0 | |
21-06-16 | 6 | 0 | 23-03-18 | 12 | 0 | |
24-11-16 | 7 | 0 | 22-06-18 | 13 | 0 | |
23-02-17 | 8 | 0 | 25-09-18 | 14 | 0 | |
19-06-17 | 9 | 0 | 28-01-19 | 15 | 0 | |
25-09-17 | 10 | 0 | ||||
22-12-17 | 11 | 1 | ||||
23-03-18 | 12 | 0 | ||||
22-06-18 | 13 | 0 | ||||
25-09-18 | 14 | 0 | ||||
28-01-19 | 15 | 0 | ||||
Hi @blackandwhite Please correct my understanding:
1. If a set of consecutive zeros is found to have with cumulative difference in months >=24 then delete all the forthcoming records beyond that for that id?
So basically the first 24 is the flag?
Please unmark the answer so that the thread is not solved
That's right thanks
Basically the first 24 months after consecutive 0s
This minor change should suffice. Let me know at your earliest convenience.
data want;
set have;
k=lag(date);
by id no notsorted;
k1=intck('month',k,date);
if first.id then flag=.;
if first.id or first.no and no=0 then do;
k=0;
k1=0;
k2=0;
end;
retain flag;
if no=0 then k2+k1;
if k2>=24 then flag=1;
if not flag;
drop k: flag;
run;
Thanks so much!
Will confirm within 24h max
works great, thank you!!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.