BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aminkarimid
Lapis Lazuli | Level 10

Hello everybody,

 

I have a table which has name, date and time variables. I want to do operation which is mentioned below:

 

If  9:00:01 <= time <= 9:05:59 then delete the first row in the interval. If a time value of next rows is the same with first row in this interval then delete all of them for that name and date variables.

 

Here is an example of my data:  

 

data WORK.TABLEA;
infile datalines dsd truncover;
input name:$3. date:DATE9. time:TIME8.;
format date DATE9. time TIME8.;
label name="name" date="date" time="time";
datalines4;
A, 7-May-08, 09:01:41
A, 7-May-08, 09:01:41
A, 7-May-08, 09:04:41
A, 7-May-08, 11:32:41
A, 8-May-08, 09:06:00
A, 8-May-08, 09:06:01
A, 8-May-08, 12:32:41
B, 7-May-08, 09:00:01
B, 7-May-08, 09:00:01
B, 7-May-08, 11:33:41
B, 9-May-08, 09:05:59
B, 9-May-08, 11:35:41
B, 9-May-08, 11:36:41
B, 9-May-08, 11:37:41
B, 12-May-08, 11:27:41
B, 12-May-08, 11:27:41
;;;;
run;

 

 

So, the result is:

 

namedatetime
A7-May-089:04:41
A7-May-0811:32:41
A8-May-089:06:00
A8-May-089:06:01
A8-May-0812:32:41
B7-May-0811:33:41
B9-May-0811:35:41
B9-May-0811:36:41
B9-May-0811:37:41
B12-May-0811:27:41
B12-May-0811:27:41

 

How can I do that?

 

Thanks in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
aminkarimid
Lapis Lazuli | Level 10

Based on Petr's code, I think the better solution is:

 

proc sql;
create table want as select *
from have
group by name, date
having min(time) not between '09:00:01't and '09:05:59't
    or time ne min(time)
;quit;

View solution in original post

10 REPLIES 10
PaigeMiller
Diamond | Level 26

You have pretty much written the code to do this in your text when you said

 

9:00:01 <= time <= 9:05:59

 

All you have to do is turn this into a proper IF statement.

--
Paige Miller
aminkarimid
Lapis Lazuli | Level 10
I don't know how to write the code which deletes next rows with the same time value with the first row in this interval. because "FIRST" statement just deletes the first observation.
Reeza
Super User

@aminkarimid wrote:

Hello everybody,

I want to do operation which is mentioned below:

If  9:00:01 <= time <= 9:05:59 then delete the first row in the interval. If a time value of next rows is the same with first row in this interval then delete all of them.

 

 

What are you defining as an 'interval'? There's no interval variable or clear definition.

aminkarimid
Lapis Lazuli | Level 10
The interval is:
9:00:01 <= time <= 9:05:59
Reeza
Super User

So that's a 6 minute range, is it for only that range OR for multiple intervals of 6 minutes or just the last record each hour/6 minutes. 

 

It sounds like you want to group that interval and then take the last value, where it's not duplicate?  Or just remove the first value? If the data was like this what would happen?

data WORK.TABLEA;
infile datalines dsd truncover;
input name:$3. date:DATE9. time:TIME8.;
format date DATE9. time TIME8.;
label name="name" date="date" time="time";
datalines4;
A, 7-May-08, 09:01:41
A, 7-May-08, 09:01:41
A, 7-May-08, 09:03:20
A, 7-May-08, 09:04:41
A, 7-May-08, 11:32:41
A, 8-May-08, 09:06:00
A, 8-May-08, 09:06:01
A, 8-May-08, 12:32:41
B, 7-May-08, 09:00:01
B, 7-May-08, 09:00:01
B, 7-May-08, 11:33:41
B, 9-May-08, 09:05:59
B, 9-May-08, 11:35:41
B, 9-May-08, 11:36:41
B, 9-May-08, 11:37:41
B, 12-May-08, 11:27:41
B, 12-May-08, 11:27:41
;;;;
run;

 

 

aminkarimid
Lapis Lazuli | Level 10

1- Yes, just 6 minutes (consider time interval which is shown above [9:00:01, 9:05:59] ).
2- No, If I group that interval then I want to delete first one. Because I'm working with high frequency data and there is more data into this interval with different time value.
3- So, the result will be what is shown in the question which the first group in this interval for any name and date is deleted.

Reeza
Super User

 It's not the same data - I added a record.

aminkarimid
Lapis Lazuli | Level 10

Yes, I just found out.
So, based on what you added in data, for "A" and "7-May-08" date value, "09:01:41" and "09:01:41" should be deleted and observations, which is shown below, are retained.
A, 7-May-08, 09:03:20
A, 7-May-08, 09:04:41
A, 7-May-08, 11:32:41

Reeza
Super User

This works fine for me. Not sure if it will scale though, I'll leave the efficiency operation up to you.

 

data _temp;
set tableA;
by name date time;
where time between hms(9, 0, 1) and hms(9, 5, 59);
if first.date then counter=0;
if first.time then counter+1;
run;

proc sql;
create table tableB as
select * from tableA
except all  (select name, date, time from _temp where counter=1);
quit;
aminkarimid
Lapis Lazuli | Level 10

Based on Petr's code, I think the better solution is:

 

proc sql;
create table want as select *
from have
group by name, date
having min(time) not between '09:00:01't and '09:05:59't
    or time ne min(time)
;quit;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 2273 views
  • 0 likes
  • 3 in conversation