SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Deleting days with observations only up to 17:00

Accepted Solution Solved
Reply
Contributor
Posts: 22
Accepted Solution

Deleting days with observations only up to 17:00

Dear sas community,

 

I have a huge dataset containing forex tick data, which is organised as follows:

 

Time                                      Bid                        Ask .....
01.01.2015:17:12:12.445   xxxxxxxxxx              xxxxxxxxx
01.01.2015:17:13:32.565   xxxxxxxxxx              xxxxxxxxx
01.01.2015:17:13:40.685   xxxxxxxxxx              xxxxxxxxx
01.01.2015:17:14:59.895   xxxxxxxxxx              xxxxxxxxx
01.01.2015:17:15:22.354    xxxxxxxxxx              xxxxxxxxx
 
My Problem is that some days e.g Sundays only have observations from 17:00 until midnight and other days only have observations from 00:00 until 17:00. To be able to work with the data, these days have to be removed. I tried the folowing:
 
 
data forex;
set forex;
seconds_time  = floor(Time);
days = datepart(Time);
run;

* removing days with observations only past 17:00

data forex2;
  set forex;
  by days seconds_time;
  retain del_flag;
  if first.days and timepart(seconds_time) ge '17:00't then del_flag=1;
  else if first.days then del_flag=0;
  if del_flag=1 then delete;
  drop del_flag;
run;

* removing days with observations only before 17:00

data forex2;
  set forex2;
  by days seconds_time;
  retain del_flag;
  if last.days and timepart(seconds_time) le '17:00't then del_flag=1;
  else if last.days then del_flag=0;
  if del_flag=1 then delete;
  drop del_flag;
run;
 
 
This does not seem to work, however.
 
Any help is greatly appreciated!
 
Kind regards!
 

Accepted Solutions
Solution
‎04-24-2018 02:55 PM
PROC Star
Posts: 1,584

Re: Deleting days with observations only up to 17:00

[ Edited ]
Posted in reply to NewSASuser2018
data temp;
set have;
t=datepart(time);
t1=timepart(time);
format t date9. t1 time.;
run;

data want;
call missing(flag);
do until(last.t);
set temp;
by t t1;
if first.t and t1>='17:00't then flag=1;
else if last.t and t1<='17:00't  then flag=1;
end;
do until(last.t);
set temp;
by t t1;
if not flag  then output;
end;
drop t:;
run;

View solution in original post


All Replies
Super User
Posts: 13,321

Re: Deleting days with observations only up to 17:00

Posted in reply to NewSASuser2018

You might look at Hour(time) < 17 or hour(time) ge 17 instead of pulling the variable apart

 

You could also consider weeday(datepart(time)) to consider which day of the week to process.

 

Doesn't work is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the {i} to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of a dataset, the actual results and the expected results. Data should be in the form of a data step. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

Since the TIME varial you show is a non-standard datetime format though the first thing is likely to ensure your time is actually a SAS datetime value.

Contributor
Posts: 22

Re: Deleting days with observations only up to 17:00

[ Edited ]

Thanks for the quick response! sample data is now attached!

 

There are no errors in the log. SaS does, however seem to flag random observations. So basically it does not flag a day where the last observation is at 16:59:30.3332 but might flag a day where the last observation is well past 17:00.

 

 

P.S. this is my import statement:

proc import 
datafile=df01 out=forex DBMS=csv replace; 
delimiter=',';
run;quit;

data forex;
set forex;
format Time datetime24.3;
run;

 

 

Contributor
Posts: 22

Re: Deleting days with observations only up to 17:00

Although your hint regarding me possibly having selected an unsuitable time format, if I apply the hour function as you suggested sas extracts the hours correctly.
Super User
Posts: 6,632

Re: Deleting days with observations only up to 17:00

Posted in reply to NewSASuser2018

Here's an approach that should work:

 

  • For each observation, separate out DAY and TIME.
  • Create a summary data set:  for each DAY, get the minimum and maximum TIME.
  • Use your rules to determine whether that DAY should be kept or deleted.
  • MERGE your summary data set back into the original data set, and apply your deletion decisions.
Contributor
Posts: 22

Re: Deleting days with observations only up to 17:00

Posted in reply to Astounding
This seems like a good idea! I am new to sas, however, or at least have very little experience and struggle with the syntax. Could you kindly provide a more detailed describtion?
Super User
Posts: 6,632

Re: Deleting days with observations only up to 17:00

[ Edited ]
Posted in reply to NewSASuser2018

It seems you have seen the DATEPART function.  If that's working properly, there is also TIMEPART:

 

data temp;

set have;

just_day = datepart(time);

just_time = timepart(time);

run;

 

Summarizing data is a must-learn topic:

 

proc summary data=temp nway;

class day;

var time;

output out=daily_stats (keep=day min_time max_time) min=min_time max=max_time;

run;

 

You will need to process this:

 

data daily_stats2;

set daily_stats;

*** Examine MIN_TIME and MAX_TIME to set a deletion flag;

keep day deletion_flag;

run;

 

data want;

merge daily_stats2 temp;

by day;

if deletion_flag=1 then delete;

run;

 

That's the idea of it, with most of the code in place.  You still need to examine the values in the MIN_TIME and MAX_TIME variables, and come up with logic that determines which days get deleted.

 

The last step assumes that your original data will be in order by date.  If that's not the case, you will need to sort it before merging.

Solution
‎04-24-2018 02:55 PM
PROC Star
Posts: 1,584

Re: Deleting days with observations only up to 17:00

[ Edited ]
Posted in reply to NewSASuser2018
data temp;
set have;
t=datepart(time);
t1=timepart(time);
format t date9. t1 time.;
run;

data want;
call missing(flag);
do until(last.t);
set temp;
by t t1;
if first.t and t1>='17:00't then flag=1;
else if last.t and t1<='17:00't  then flag=1;
end;
do until(last.t);
set temp;
by t t1;
if not flag  then output;
end;
drop t:;
run;
PROC Star
Posts: 1,584

Re: Deleting days with observations only up to 17:00

Posted in reply to NewSASuser2018

More fun:

 

proc sql;
create table want as
select *
from have
group by datepart(time)
having not(min(timepart(time))>='17:00't) and  not(max(timepart(time))<='17:00't);
quit;
Contributor
Posts: 22

Re: Deleting days with observations only up to 17:00

Posted in reply to novinosrin
Thanks for all the great and very helpful answers! I appreciate it.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 266 views
  • 5 likes
  • 4 in conversation