Re: How to insert missing datetime records

Jaap_K · Posted 06-19-2024 05:51 AM

Hi all,

I've a dataset with start / end datetimes. Sometimes there is a gap between two records. I want to insert a record where this new record fills the gap between both records. In the example below, a record must be added with start date_time 02FEB2023:22:03:00 and end date_time 03FEB2023:13:23:00.

Who can help me with this code?

Thanks!

yabwon · Posted 06-19-2024 05:55 AM

You wrote: "must be added with start date_time 02FEB2023:22:03:00 and end date_time 03FEB2023:13:23:00."

Are those dates calculated somehow? (e.g. average of previous and next) or they are fixed (always the same)?

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation

Jaap_K · Posted 06-19-2024 06:39 AM

Hi Bart,

The dataset is sorted and is fixed. So you must look at the row above.

yabwon · Posted 06-19-2024 06:50 AM

data have;
input start_date end_date;
format start_date end_date datetime21.;
cards;
1 2
3 4
. 6
7 .
;
run;
proc print;
run;

data want;
 set have;
 prev_s=lag(start_date);
 prev_e=lag(end_date);

 start_date = coalesce(start_date,prev_s); 
 end_date = coalesce(end_date,prev_e);

 drop prev_:;
run;
proc print;
run;

Truy this.

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation

Jaap_K · Posted 06-19-2024 07:13 AM

Hi,

That's not exactly what I mean. See Excelfile. Sometimes there is a time difference between two records. In the 'want' example you see new records that 'resolve' the timegap. I'm looking for the code who create these records.

Tom · Posted 06-19-2024 12:44 PM

Why bother going to the trouble to make an Excel file and then attach the file to your post? It is much easier for you (and much much easier for someone to help you) if you just post the SAS code to make the dataset(s) directly.

data have;
  input (s e) (:datetime.);
  format s e datetime19.;
cards;
02FEB2023:20:44:00 02FEB2023:20:48:00 
02FEB2023:20:48:00 02FEB2023:22:03:00 
03FEB2023:13:23:00 03FEB2023:15:21:00 
03FEB2023:15:21:00 03FEB2023:17:21:00 
03FEB2023:17:25:00 03FEB2023:17:26:00 
03FEB2023:17:26:00 03FEB2023:18:10:00 
05FEB2023:01:18:00 05FEB2023:02:46:00
;


data expect;
  infile cards truncover ;
  input (s e) (:datetime.) (2*flag) (:$8.);
  format s e datetime19.;
cards;
02FEB2023:20:44:00 02FEB2023:20:48:00
02FEB2023:20:48:00 02FEB2023:22:03:00
02FEB2023:22:03:00 03FEB2023:13:23:00 --> New
03FEB2023:13:23:00 03FEB2023:15:21:00
03FEB2023:15:21:00 03FEB2023:17:21:00
03FEB2023:17:21:00 03FEB2023:17:25:00 --> New
03FEB2023:17:25:00 03FEB2023:17:26:00
03FEB2023:17:26:00 03FEB2023:18:10:00
03FEB2023:18:10:00 05FEB2023:01:18:00 --> New
05FEB2023:01:18:00 05FEB2023:02:46:00
;

Just remember the previous end datetime value so that when there is a gap you can add another observation. It is simplest if you sort the result later.

data want;
  set have ;
  output;
  lag_e=lag(e);
  if lag_e < s and _n_>1 then do;
    e=s;
    s=lag_e;
    output;
  end;
  drop lag_e;
run;

proc sort;
  by s e;
run;

But you could get a little tricky and write the new observation before the old. For example by adding an extra SET to re-read the current observation to get the original dates back.

data want;
  set have ;
  lag_e=lag(e);
  if lag_e < s and _n_>1 then do;
    e=s;
    s=lag_e;
    output;
    set have point=_n_;
  end;
  output;
  drop lag_e;
run;

PaigeMiller · Posted 06-19-2024 08:08 AM

@Jaap_K wrote:

The dataset is sorted and is fixed. So you must look at the row above.

and then what, once I looked at the row above? What is the calculation or algorithm?

--
Paige Miller

mkeintz · Posted 06-19-2024 09:15 PM

Use merge with a "FIRSTOBS=2" option to look ahead at the next start value:

data have;
  startdate_time='02feb2023:19:08:00'dt; enddate_time='02feb2023:20:44:00'dt; output;
  startdate_time='02feb2023:20:44:00'dt; enddate_time='02feb2023:20:48:00'dt; output;
  startdate_time='02feb2023:20:48:00'dt; enddate_time='02feb2023:22:03:00'dt; output;
  startdate_time='03feb2023:13:23:00'dt; enddate_time='03feb2023:15:21:00'dt; output;
  format startdate_time enddate_time datetime18.0;
run;

data want (drop=nxt_:);
  merge have  have (firstobs=2 keep=startdate_time rename=(startdate_time=nxt_s));
  output;
  if enddate_time<nxt_s;
  startdate_time=enddate_time;
  enddate_time=nxt_s;
  output;
run;

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Registration is open

SAS Training: Just a Click Away