Solved: How to remove first rows in a specific period of time?

aminkarimid · Posted 11-29-2017 01:08 PM

Hello everybody,

I have a table which has name, date and time variables. I want to do operation which is mentioned below:

If 9:00:01 <= time <= 9:05:59 then delete the first row in the interval. If a time value of next rows is the same with first row in this interval then delete all of them for that name and date variables.

Here is an example of my data:

data WORK.TABLEA;
infile datalines dsd truncover;
input name:$3. date:DATE9. time:TIME8.;
format date DATE9. time TIME8.;
label name="name" date="date" time="time";
datalines4;
A, 7-May-08, 09:01:41
A, 7-May-08, 09:01:41
A, 7-May-08, 09:04:41
A, 7-May-08, 11:32:41
A, 8-May-08, 09:06:00
A, 8-May-08, 09:06:01
A, 8-May-08, 12:32:41
B, 7-May-08, 09:00:01
B, 7-May-08, 09:00:01
B, 7-May-08, 11:33:41
B, 9-May-08, 09:05:59
B, 9-May-08, 11:35:41
B, 9-May-08, 11:36:41
B, 9-May-08, 11:37:41
B, 12-May-08, 11:27:41
B, 12-May-08, 11:27:41
;;;;
run;

So, the result is:

name	date	time
A	7-May-08	9:04:41
A	7-May-08	11:32:41
A	8-May-08	9:06:00
A	8-May-08	9:06:01
A	8-May-08	12:32:41
B	7-May-08	11:33:41
B	9-May-08	11:35:41
B	9-May-08	11:36:41
B	9-May-08	11:37:41
B	12-May-08	11:27:41
B	12-May-08	11:27:41

How can I do that?

Thanks in advance.

aminkarimid · Posted 11-30-2017 04:01 PM

Based on Petr's code, I think the better solution is:

proc sql;
create table want as select *
from have
group by name, date
having min(time) not between '09:00:01't and '09:05:59't
    or time ne min(time)
;quit;

View solution in original post

PaigeMiller · Posted 11-29-2017 01:16 PM

You have pretty much written the code to do this in your text when you said

9:00:01 <= time <= 9:05:59

All you have to do is turn this into a proper IF statement.

--
Paige Miller

aminkarimid · Posted 11-29-2017 02:54 PM

I don't know how to write the code which deletes next rows with the same time value with the first row in this interval. because "FIRST" statement just deletes the first observation.

Reeza · Posted 11-29-2017 01:20 PM

@aminkarimid wrote:

Hello everybody,

I want to do operation which is mentioned below:

If 9:00:01 <= time <= 9:05:59 then delete the first row in the interval. If a time value of next rows is the same with first row in this interval then delete all of them.

What are you defining as an 'interval'? There's no interval variable or clear definition.

aminkarimid · Posted 11-29-2017 02:45 PM

The interval is:
9:00:01 <= time <= 9:05:59

Reeza · Posted 11-29-2017 02:51 PM

So that's a 6 minute range, is it for only that range OR for multiple intervals of 6 minutes or just the last record each hour/6 minutes.

It sounds like you want to group that interval and then take the last value, where it's not duplicate? Or just remove the first value? If the data was like this what would happen?

data WORK.TABLEA;
infile datalines dsd truncover;
input name:$3. date:DATE9. time:TIME8.;
format date DATE9. time TIME8.;
label name="name" date="date" time="time";
datalines4;
A, 7-May-08, 09:01:41
A, 7-May-08, 09:01:41
A, 7-May-08, 09:03:20
A, 7-May-08, 09:04:41
A, 7-May-08, 11:32:41
A, 8-May-08, 09:06:00
A, 8-May-08, 09:06:01
A, 8-May-08, 12:32:41
B, 7-May-08, 09:00:01
B, 7-May-08, 09:00:01
B, 7-May-08, 11:33:41
B, 9-May-08, 09:05:59
B, 9-May-08, 11:35:41
B, 9-May-08, 11:36:41
B, 9-May-08, 11:37:41
B, 12-May-08, 11:27:41
B, 12-May-08, 11:27:41
;;;;
run;

aminkarimid · Posted 11-29-2017 03:06 PM

1- Yes, just 6 minutes (consider time interval which is shown above [9:00:01, 9:05:59] ).
2- No, If I group that interval then I want to delete first one. Because I'm working with high frequency data and there is more data into this interval with different time value.
3- So, the result will be what is shown in the question which the first group in this interval for any name and date is deleted.

Reeza · Posted 11-29-2017 03:13 PM

It's not the same data - I added a record.

aminkarimid · Posted 11-29-2017 03:25 PM

Yes, I just found out.
So, based on what you added in data, for "A" and "7-May-08" date value, "09:01:41" and "09:01:41" should be deleted and observations, which is shown below, are retained.
A, 7-May-08, 09:03:20
A, 7-May-08, 09:04:41
A, 7-May-08, 11:32:41

Reeza · Posted 11-30-2017 12:19 PM

This works fine for me. Not sure if it will scale though, I'll leave the efficiency operation up to you.

data _temp;
set tableA;
by name date time;
where time between hms(9, 0, 1) and hms(9, 5, 59);
if first.date then counter=0;
if first.time then counter+1;
run;

proc sql;
create table tableB as
select * from tableA
except all  (select name, date, time from _temp where counter=1);
quit;

aminkarimid · Posted 11-30-2017 04:01 PM

Based on Petr's code, I think the better solution is:

proc sql;
create table want as select *
from have
group by name, date
having min(time) not between '09:00:01't and '09:05:59't
    or time ne min(time)
;quit;

How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Re: How to remove first rows in a specific period of time?

Catch up on SAS Innovate 2026

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away