Here is the data I have
data have;
input id $ treatment $ linenumber date yymmdd10. ;
format date yymmdd10.;
cards;
001 A 1 2014-08-15
001 A 1 2015-06-20
001 A 2 2015-08-29
001 A 2 2018-02-28
001 A 2 2018-05-07
001 B 3 2018-08-01
001 B 4 2019-02-01
001 C 5 2019-03-03
;
run;
I want to calculate the gap between the dates, and then create two flags:
within the same ID/Treatment/Linenumber, if gap >90 days then GT90_within_line =1;
within the same ID/Treatment, when linenumber changes, if the gap < 90 days, then LT90_btw_line_within_trt = 1
The expected output data will be like this:
Can anyone help? Many thanks in advance!
I have a solution but wonder if there is simpler methods
data have;
input id $ treatment $ linenumber date yymmdd10. ;
format date yymmdd10.;
cards;
001 A 1 2014-08-15
001 A 1 2015-06-20
001 A 2 2015-08-29
001 A 2 2018-02-28
001 A 2 2018-05-07
001 B 3 2018-08-01
001 B 4 2019-02-01
001 C 5 2019-03-03
002 A 1 2015-06-27
002 A 1 2015-09-02
002 A 2 2016-01-01
;
run;
proc sort data=have;
by id treatment linenumber date;
run;
data want;
set have;
by id treatment linenumber date;
gap=dif(date);
if first.id then
gap=.;
run;
proc sql;
create table step1 as
select *, case when count(date)>=2 and gap > 90 then 1 else . end as GT90_within_line
from want
group by id, treatment, linenumber
order by id, treatment, linenumber, date;
quit;
data step2;
set step1;
by id treatment linenumber date;
if not first.treatment and first.linenumber and . < gap <90 then
LT90_btw_line_within_trt =1;
run;
I have a solution but wonder if there is simpler methods
data have;
input id $ treatment $ linenumber date yymmdd10. ;
format date yymmdd10.;
cards;
001 A 1 2014-08-15
001 A 1 2015-06-20
001 A 2 2015-08-29
001 A 2 2018-02-28
001 A 2 2018-05-07
001 B 3 2018-08-01
001 B 4 2019-02-01
001 C 5 2019-03-03
002 A 1 2015-06-27
002 A 1 2015-09-02
002 A 2 2016-01-01
;
run;
proc sort data=have;
by id treatment linenumber date;
run;
data want;
set have;
by id treatment linenumber date;
gap=dif(date);
if first.id then
gap=.;
run;
proc sql;
create table step1 as
select *, case when count(date)>=2 and gap > 90 then 1 else . end as GT90_within_line
from want
group by id, treatment, linenumber
order by id, treatment, linenumber, date;
quit;
data step2;
set step1;
by id treatment linenumber date;
if not first.treatment and first.linenumber and . < gap <90 then
LT90_btw_line_within_trt =1;
run;
Hi @fengyuwuzu
It can be done in one data step using lag function, if the input data set is sorted on ID Treatment Linenumber Date.
data want (drop=oID oTreatment oDate);
set have; by ID Treatment Linenumber;
oID = lag(ID);
oTreatment = lag(Treatment);
oLinenumber = lag(Linenumber);
oDate = lag(Date);
if not first.ID then Gap = Date-oDate;
if not first.Linenumber then do;
if Gap > 90 then GT90_within_line = 1;
end;
else if not first.Treatment then if Gap < 90 then LT90_btw_line_within_trt = 1;
run;
Thank you!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.