Hi,
Participants have varying number of visits/observations in this dataset. Var is binary (0/1).
I am trying to delete observations as of the first time each participant has consecutive 0s for var for >24 months (as seen in the image)
Thanks very much
HAVE | WANT | |||||
ID | var | Date | ID | var | Date | |
1 | 1 | 28/10/2014 | 1 | 1 | 28/10/2014 | |
1 | 0 | 26/01/2015 | 1 | 0 | 26/01/2015 | |
1 | 1 | 23/04/2015 | 1 | 1 | 23/04/2015 | |
1 | 0 | 21/07/2015 | 1 | 0 | 21/07/2015 | |
1 | 0 | 21/10/2015 | 1 | 0 | 21/10/2015 | |
1 | 0 | 20/01/2016 | 1 | 0 | 20/01/2016 | |
1 | 0 | 06/05/2016 | 1 | 0 | 06/05/2016 | |
1 | 0 | 05/08/2016 | 1 | 0 | 05/08/2016 | |
1 | 0 | 08/11/2016 | 1 | 0 | 08/11/2016 | |
1 | 0 | 08/02/2017 | 1 | 0 | 08/02/2017 | |
1 | 0 | 08/05/2017 | 1 | 0 | 08/05/2017 | |
1 | 0 | 21/08/2017 | 2 | 1 | 05/03/2012 | |
1 | 0 | 17/01/2018 | 2 | 0 | 17/07/2012 | |
2 | 1 | 05/03/2012 | 2 | 1 | 29/07/2013 | |
2 | 0 | 17/07/2012 | 2 | 1 | 28/10/2014 | |
2 | 1 | 29/07/2013 | ||||
2 | 1 | 28/10/2014 |
This minor change should suffice. Let me know at your earliest convenience.
data want;
set have;
k=lag(date);
by id no notsorted;
k1=intck('month',k,date);
if first.id then flag=.;
if first.id or first.no and no=0 then do;
k=0;
k1=0;
k2=0;
end;
retain flag;
if no=0 then k2+k1;
if k2>=24 then flag=1;
if not flag;
drop k: flag;
run;
data HAVE;
input ID var Date : ddmmyy10.;
format date ddmmyy10.;
cards;
1 1 28/10/2014 1 1 28/10/2014
1 0 26/01/2015 1 0 26/01/2015
1 1 23/04/2015 1 1 23/04/2015
1 0 21/07/2015 1 0 21/07/2015
1 0 21/10/2015 1 0 21/10/2015
1 0 20/01/2016 1 0 20/01/2016
1 0 06/05/2016 1 0 06/05/2016
1 0 05/08/2016 1 0 05/08/2016
1 0 08/11/2016 1 0 08/11/2016
1 0 08/02/2017 1 0 08/02/2017
1 0 08/05/2017 1 0 08/05/2017
1 0 21/08/2017 2 1 05/03/2012
1 0 17/01/2018 2 0 17/07/2012
2 1 05/03/2012 2 1 29/07/2013
2 0 17/07/2012 2 1 28/10/2014
2 1 29/07/2013
2 1 28/10/2014
;
data want;
set have;
k=lag(date);
by id var notsorted;
k1=intck('month',k,date);
if first.var then do;k=0;k1=0;k2=0; end;
k2+k1;
if k2>=24 and var=0 then delete;
drop k:;
run;
works great, thank you!
As a follow-up question, if you don't mind:
What if I would like to delete all following observations after the condition is met?
Ex: Assume ID#1 has an additional visit at the end where var=1. With the code above, this obs remains.
How would I be able to remove this and any other values that follow?
I tried to delete the condition "and var=0" from line 8 of your code and it doesn't seem to help with this.
Thanks again!
Try
data want;
set have;
k=lag(date);
by id var notsorted;
k1=intck('month',k,date);
if first.id or first.var and var=0 then do;
k=0;
k1=0;
k2=0;
end;
if var=0 then k2+k1;
if k2>=24 then delete;
drop k:;
run;
Thanks
It doesn't delete the 0s however (only the 1).
I attached an example of what I get for an ID (ex: visits 12-15 should not be included).
Original | Output | |||||
Date | Visit no. | Var | Date | Visit no. | Var | |
17-08-11 | 1 | 1 | 17-08-11 | 1 | 1 | |
20-11-13 | 2 | 0 | 20-11-13 | 2 | 0 | |
25-02-14 | 3 | 0 | 25-02-14 | 3 | 0 | |
27-05-14 | 4 | 0 | 27-05-14 | 4 | 0 | |
04-06-15 | 5 | 0 | 04-06-15 | 5 | 0 | |
21-06-16 | 6 | 0 | 23-03-18 | 12 | 0 | |
24-11-16 | 7 | 0 | 22-06-18 | 13 | 0 | |
23-02-17 | 8 | 0 | 25-09-18 | 14 | 0 | |
19-06-17 | 9 | 0 | 28-01-19 | 15 | 0 | |
25-09-17 | 10 | 0 | ||||
22-12-17 | 11 | 1 | ||||
23-03-18 | 12 | 0 | ||||
22-06-18 | 13 | 0 | ||||
25-09-18 | 14 | 0 | ||||
28-01-19 | 15 | 0 | ||||
Hi @blackandwhite Please correct my understanding:
1. If a set of consecutive zeros is found to have with cumulative difference in months >=24 then delete all the forthcoming records beyond that for that id?
So basically the first 24 is the flag?
Please unmark the answer so that the thread is not solved
That's right thanks
Basically the first 24 months after consecutive 0s
This minor change should suffice. Let me know at your earliest convenience.
data want;
set have;
k=lag(date);
by id no notsorted;
k1=intck('month',k,date);
if first.id then flag=.;
if first.id or first.no and no=0 then do;
k=0;
k1=0;
k2=0;
end;
retain flag;
if no=0 then k2+k1;
if k2>=24 then flag=1;
if not flag;
drop k: flag;
run;
Thanks so much!
Will confirm within 24h max
works great, thank you!!
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.