- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi again,
I have 1 rule: If Z occurs within 30 days before A or B on any var1,2,3 then remove all rows with that id. In this example persons 1 and 4 will remain, and 2 and 3 are removed. I'm having trouble with combining syntax for date comparisons and syntax for multiple conditions.
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
DATA want;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
Thanks!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try this
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
data want(drop = d1 - d3 c);
do until (last.id);
set have;
by id;
array v var:;
array d d1 - d3;
do over v;
if v = 'Z' then d = date;
if v in ('A', 'B') and c = . then c = range(date, d);
end;
end;
do until (last.id);
set have;
by id;
if c = 0 | c > 30 then output;
end;
run;
Result:
id date var1 var2 var3 1 09/21/2016 B D D 1 09/28/2016 B D D 1 10/11/2016 Z 4 09/01/2017 A 4 09/07/2017 Z 4 10/10/2017 A
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Try this
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
data want(drop = d1 - d3 c);
do until (last.id);
set have;
by id;
array v var:;
array d d1 - d3;
do over v;
if v = 'Z' then d = date;
if v in ('A', 'B') and c = . then c = range(date, d);
end;
end;
do until (last.id);
set have;
by id;
if c = 0 | c > 30 then output;
end;
run;
Result:
id date var1 var2 var3 1 09/21/2016 B D D 1 09/28/2016 B D D 1 10/11/2016 Z 4 09/01/2017 A 4 09/07/2017 Z 4 10/10/2017 A
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Anytime 🙂
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
array d d1 - d3;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@Manhort , I don't understand the question?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
data temp;
set have;
by id;
dif=dif(date);
if first.id then call missing(dif);
run;
data id;
set temp;
if .<dif<30 and (
(var1 in ('A' 'B') and lag(var1)='Z') or
(var2 in ('A' 'B') and lag(var2)='Z') or
(var3 in ('A' 'B') and lag(var3)='Z') ) ;
run;
proc sql;
create table want as
select * from have where id not in (select id from id);
quit;