I have dataset with 4 variables
Id, trt, dosmod, days…..I want to query in such a way that for a particular id and trt(group by)….when dosmod=’ INTERRUPTION OR DELAY’ then the next immediate record for that id and trt should be within 7 days of dosmod=’ INTERRUPTION OR DELAY’ or next immediate record should have dosmod=’ DRUG WITHDRAWN’….I SHOULD OUTPUT THOSE HIGHLIGHTED IN YELLOW
100030002 | xxxxxxxx | NONE | 1 |
100030002 | xxxxxxxx | 2 | |
100030002 | xxxxxxxx | NONE | 8 |
100030002 | xxxxxxxx | NONE | 15 |
100030002 | xxxxxxxx | INTERRUPTION OR DELAY | 24 |
100030002 | xxxxxxxx | NONE | 31 |
100030002 | xxxxxxxx | INTERRUPTION OR DELAY | 33 |
100030002 | xxxxxxxx | NONE | 38 |
100030002 | xxxxxxxx | INTERRUPTION OR DELAY | 45 |
100030002 | xxxxxxxx | NONE | 52 |
100030002 | xxxxxxxx | NONE | 59 |
100030002 | xxxxxxxx | INTERRUPTION OR DELAY | 66 |
100030002 | xxxxxxxx | INTERRUPTION OR DELAY | 73 |
100030002 | xxxxxxxx | NONE | 80 |
100030002 | xxxxxxxx | NONE | 87 |
100030002 | xxxxxxxx | NONE | 94 |
100030002 | xxxxxxxx | INTERRUPTION OR DELAY | 101 |
100030002 | xxxxxxxx | NONE | 107 |
100030002 | xxxxxxxx | NONE | 113 |
100030002 | yyyyyyyyy | NONE | 1 |
100030002 | yyyyyyyyy | NONE | 24 |
100030002 | yyyyyyyyy | NONE | 45 |
100030002 | yyyyyyyyy | INTERRUPTION OR DELAY | 66 |
100030002 | yyyyyyyyy | DRUG WITHDRAWN | 73 |
100030002 | yyyyyyyyy | INTERRUPTION OR DELAY | 94 |
100030002 | yyyyyyyyy | NONE | 99 |
100030002 | yyyyyyyyy | NONE | 121 |
100030002 | zzzzzzzz | NONE | 1 |
100030002 | zzzzzzzz | NONE | 2 |
100030002 | zzzzzzzz | NONE | 3 |
100030002 | zzzzzzzz | NONE | 24 |
100030002 | zzzzzzzz | NONE | 25 |
100030002 | zzzzzzzz | NONE | 26 |
100030002 | zzzzzzzz | NONE | 45 |
100030002 | zzzzzzzz | NONE | 46 |
100030002 | zzzzzzzz | NONE | 47 |
100030002 | zzzzzzzz | INTERRUPTION OR DELAY | 66 |
100030002 | zzzzzzzz | DRUG WITHDRAWN | 73 |
100030002 | zzzzzzzz | NONE | 74 |
100030002 | zzzzzzzz | NONE | 75 |
100030002 | zzzzzzzz | INTERRUPTION OR DELAY | 94 |
100030002 | zzzzzzzz | NONE | 99 |
100030002 | zzzzzzzz | NONE | 100 |
100030002 | zzzzzzzz | NONE | 101 |
100030002 | zzzzzzzz | NONE | 121 |
100030002 | zzzzzzzz | NONE | 122 |
100030002 | zzzzzzzz | NONE | 123 |
proc sort data=have;
by id trt days;
run;
data want (keep=id trt idosmod iday rename=(idosmod=dosmod iday=days));
set have;
by id trt;
idosmod = lag(dosmod);
iday = lag(days);
if not first.trt
then do;
if
idosmod = 'INTERRUPTION OR DELAY' and
dosmod ne 'DRUG WITHDRAWN' and
dosmod ne 'INTERRUPTION OR DELAY' and
days - iday >= 6
then do;
output;
idosmod = dosmod;
iday = days;
output;
end;
end;
run;
proc sort data=have;
by id trt days;
run;
data want (keep=id trt idosmod iday rename=(idosmod=dosmod iday=days));
set have;
by id trt;
idosmod = lag(dosmod);
iday = lag(days);
if not first.trt
then do;
if
idosmod = 'INTERRUPTION OR DELAY' and
dosmod ne 'DRUG WITHDRAWN' and
dosmod ne 'INTERRUPTION OR DELAY' and
days - iday >= 6
then do;
output;
idosmod = dosmod;
iday = days;
output;
end;
end;
run;
Hi, Thanks for the reply......the solution works near perfect for me....only additional thing which i would require for representation purpose is that...... I am adding STDTC(date variable) to the dataset...when I included that variable results are coming as expected except for the dates..the days difference should reflect in the dates part(STDTC). can you help me on this .....i was using below code...
proc sort data=have;
by id trt stdtc days;
run;
data want (keep=id trt idosmod stdtc iday rename=(idosmod=dosmod iday=days));
set have;
by id trt;
idosmod = lag(dosmod);
iday = lag(days);
if not first.trt
then do;
if
idosmod = 'INTERRUPTION OR DELAY' and
dosmod ne 'DRUG WITHDRAWN' and
dosmod ne 'INTERRUPTION OR DELAY' and
days - iday >= 6
then do;
output;
idosmod = dosmod;
iday = days;
output;
end;
end;
run;
You need to do the same with stdtc that I did with dosmod and days. Create a temporary variable that holds the value from the lag() function and rename it in the output dataset.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.