My data has ID and order_time, and I want to find out the order_time and the last order_time, the difference between which is > 30days.
My best code is as below; I wonder if there is a simpler way to do it?
data want;
set have ;
by ID order_time;
time1=lag(order_time);
dif_time=dif(order_time)/3600/24; /* convert seconds to days */
time2=order_time;
if first.ID then do;
time1=.;
dif_time=.;
time2=.;
end;
format time1 datetime.;
format time2 datetime.;
if dif_time >30; /* choose gaps >30 days */
run;
Simpler code with DO until()
data want;
do until(last.id);
set have ; by ID;
if first.id then first_order_time = order_time;
id last.id then do;
last_order_time = order_time;
if intck("DTDAY", first_order_time, last_order_time, "CONTINUOUS") > 30 then output;
end;
format first_order_time last_order_time datetime.;
keep ID first_order_time last_order_time;
run;
(untested)
Simpler code with DO until()
data want;
do until(last.id);
set have ; by ID;
if first.id then first_order_time = order_time;
id last.id then do;
last_order_time = order_time;
if intck("DTDAY", first_order_time, last_order_time, "CONTINUOUS") > 30 then output;
end;
format first_order_time last_order_time datetime.;
keep ID first_order_time last_order_time;
run;
(untested)
Thank you. This works (one "if" is mis-spelled as "id", and one "end" is missing).
I just realized that I actually want the first and last order_time within each time frame without gap, say,
t1 .... t2 -- over 30day gap -- t3 ... t4 -- over 30day gap -- t5 ... t6 -- over 30day gap -- t7...t8 ........
I want the pairs of (t1, t2), (t3, t4), (t5, t6). The above codes are to get the pairs (t2, t3) (t4, t5), etc.
This seems more difficult. Can any one help?
I think it can be done in two steps:
data temp;
set have; by id;
prevTime = lag(order_time);
if first.id then frame + 1;
else frame + intck("DTDAY", prevTime, order_time, "CONTINUOUS") > 30;
drop prevTime;
run;
data want;
do until(last.frame);
set temp; by ID frame;
if first.frame then first_order_time = order_time;
if last.frame then last_order_time = order_time;
end;
if first_order_time < last_order_time;
format first_order_time last_order_time datetime.;
keep ID first_order_time last_order_time;
run;
(untested)
Thank you, PGSTATS.
It works with minor revsision: The following line needs to be changed
if first.id then frame + 1;
if first.id then frame =0;
otherwise frame sum accross all IDs; by reset to 0 for each new ID, it only sum within the same ID.
Thank you very much for all your helsps.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.