Hi again,
I have 1 rule: If Z occurs within 30 days before A or B on any var1,2,3 then remove all rows with that id. In this example persons 1 and 4 will remain, and 2 and 3 are removed. I'm having trouble with combining syntax for date comparisons and syntax for multiple conditions.
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
DATA want;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
Thanks!
Try this
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
data want(drop = d1 - d3 c);
do until (last.id);
set have;
by id;
array v var:;
array d d1 - d3;
do over v;
if v = 'Z' then d = date;
if v in ('A', 'B') and c = . then c = range(date, d);
end;
end;
do until (last.id);
set have;
by id;
if c = 0 | c > 30 then output;
end;
run;
Result:
id date var1 var2 var3 1 09/21/2016 B D D 1 09/28/2016 B D D 1 10/11/2016 Z 4 09/01/2017 A 4 09/07/2017 Z 4 10/10/2017 A
Try this
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
data want(drop = d1 - d3 c);
do until (last.id);
set have;
by id;
array v var:;
array d d1 - d3;
do over v;
if v = 'Z' then d = date;
if v in ('A', 'B') and c = . then c = range(date, d);
end;
end;
do until (last.id);
set have;
by id;
if c = 0 | c > 30 then output;
end;
run;
Result:
id date var1 var2 var3 1 09/21/2016 B D D 1 09/28/2016 B D D 1 10/11/2016 Z 4 09/01/2017 A 4 09/07/2017 Z 4 10/10/2017 A
Anytime 🙂
@Manhort , I don't understand the question?
DATA have;
input id date mmddyy10. var1 $ var2 $ var3 $ ;
format date mmddyy10.;
CARDS;
1 09/21/2016 B D D
1 09/28/2016 B D D
1 10/11/2016 Z . .
2 01/14/2016 D Z .
2 01/15/2016 Z D D
2 02/09/2016 A . .
3 09/20/2019 Z D .
3 10/02/2019 A . .
3 10/24/2019 A . .
4 09/01/2017 A . .
4 09/07/2017 Z . .
4 10/10/2017 A . .
;
run;
data temp;
set have;
by id;
dif=dif(date);
if first.id then call missing(dif);
run;
data id;
set temp;
if .<dif<30 and (
(var1 in ('A' 'B') and lag(var1)='Z') or
(var2 in ('A' 'B') and lag(var2)='Z') or
(var3 in ('A' 'B') and lag(var3)='Z') ) ;
run;
proc sql;
create table want as
select * from have where id not in (select id from id);
quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and save with the early bird rate—just $795!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.