Hi again,
I have 1 rule: If Z occurs within 30 days before A or B on any var1,2,3 then remove all rows with that id. In this example persons 1 and 4 will remain, and 2 and 3 are removed. I'm having trouble with combining syntax for date comparisons and syntax for multiple conditions.
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
DATA want;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
Thanks!
Try this
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
data want(drop = d1 - d3 c);
do until (last.id);
set have;
by id;
array v var:;
array d d1 - d3;
do over v;
if v = 'Z' then d = date;
if v in ('A', 'B') and c = . then c = range(date, d);
end;
end;
do until (last.id);
set have;
by id;
if c = 0 | c > 30 then output;
end;
run;
Result:
id date var1 var2 var3 1 09/21/2016 B D D 1 09/28/2016 B D D 1 10/11/2016 Z 4 09/01/2017 A 4 09/07/2017 Z 4 10/10/2017 A
Try this
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
data want(drop = d1 - d3 c);
do until (last.id);
set have;
by id;
array v var:;
array d d1 - d3;
do over v;
if v = 'Z' then d = date;
if v in ('A', 'B') and c = . then c = range(date, d);
end;
end;
do until (last.id);
set have;
by id;
if c = 0 | c > 30 then output;
end;
run;
Result:
id date var1 var2 var3 1 09/21/2016 B D D 1 09/28/2016 B D D 1 10/11/2016 Z 4 09/01/2017 A 4 09/07/2017 Z 4 10/10/2017 A
Anytime 🙂
@Manhort , I don't understand the question?
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
data temp;
set have;
by id;
dif=dif(date);
if first.id then call missing(dif);
run;
data id;
set temp;
if .<dif<30 and (
(var1 in ('A' 'B') and lag(var1)='Z') or
(var2 in ('A' 'B') and lag(var2)='Z') or
(var3 in ('A' 'B') and lag(var3)='Z') ) ;
run;
proc sql;
create table want as
select * from have where id not in (select id from id);
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.