My data has ID and order_time, and I want to find out the order_time and the last order_time, the difference between which is > 30days.
My best code is as below; I wonder if there is a simpler way to do it?
data want;
set have ;
by ID order_time;
time1=lag(order_time);
dif_time=dif(order_time)/3600/24; /* convert seconds to days */
time2=order_time;
if first.ID then do;
time1=.;
dif_time=.;
time2=.;
end;
format time1 datetime.;
format time2 datetime.;
if dif_time >30; /* choose gaps >30 days */
run;
Simpler code with DO until()
data want;
do until(last.id);
set have ; by ID;
if first.id then first_order_time = order_time;
id last.id then do;
last_order_time = order_time;
if intck("DTDAY", first_order_time, last_order_time, "CONTINUOUS") > 30 then output;
end;
format first_order_time last_order_time datetime.;
keep ID first_order_time last_order_time;
run;
(untested)
Simpler code with DO until()
data want;
do until(last.id);
set have ; by ID;
if first.id then first_order_time = order_time;
id last.id then do;
last_order_time = order_time;
if intck("DTDAY", first_order_time, last_order_time, "CONTINUOUS") > 30 then output;
end;
format first_order_time last_order_time datetime.;
keep ID first_order_time last_order_time;
run;
(untested)
Thank you. This works (one "if" is mis-spelled as "id", and one "end" is missing).
I just realized that I actually want the first and last order_time within each time frame without gap, say,
t1 .... t2 -- over 30day gap -- t3 ... t4 -- over 30day gap -- t5 ... t6 -- over 30day gap -- t7...t8 ........
I want the pairs of (t1, t2), (t3, t4), (t5, t6). The above codes are to get the pairs (t2, t3) (t4, t5), etc.
This seems more difficult. Can any one help?
I think it can be done in two steps:
data temp;
set have; by id;
prevTime = lag(order_time);
if first.id then frame + 1;
else frame + intck("DTDAY", prevTime, order_time, "CONTINUOUS") > 30;
drop prevTime;
run;
data want;
do until(last.frame);
set temp; by ID frame;
if first.frame then first_order_time = order_time;
if last.frame then last_order_time = order_time;
end;
if first_order_time < last_order_time;
format first_order_time last_order_time datetime.;
keep ID first_order_time last_order_time;
run;
(untested)
Thank you, PGSTATS.
It works with minor revsision: The following line needs to be changed
if first.id then frame + 1;
if first.id then frame =0;
otherwise frame sum accross all IDs; by reset to 0 for each new ID, it only sum within the same ID.
Thank you very much for all your helsps.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.