Hi guys,
I am currently working on a sample where the data is categorized by date (YYYYMMDD) and time (seconds past midnight). However, there isn't necessarily an observation for every second of each day. Is there any code I can use to create a table from the sample which would bring up the closest observation following a specific TIME selection for each day? For example, say I need the first observation immediately following 10:30am for each day. Any ideas?
Thanks in advance, your help is greatly appreciated.
If order is important, and your data are already in the desired order, then I would suggest just using a datastep. e.g.:
data want;
set have (where=(time gt 37800));
by date;
if first.date;
run;
Proc expand can probably do it, but you need to be a bit more specific. Do you want records for each day, hour, minute, second, or even finer?
I'm looking at trade data from the stock market. The time of each trade that takes place is recorded in terms of seconds after midnight. So basically each DATE (YYYYMMDD) is going to have thousands of trades at different times. I simply need to extract from this huge data sample the first trade that happened immediately after 10:30am (which is 37800 in terms of seconds after midnight) for each day.
My columns are as follows:
DATE TIME (Secs past midnight) Price
-------- ------------------------------------- ---------
So I need to create a table from the data that brings up 1 observation for each DATE. This observation has to be the price of the first trade that happened immediately after 10:30am (37800).
Totally different than what I was expecting. Would something like the following suffice?:
data have;
informat date date9.;
format date date9.;
input date time price;
cards;
6nov2012 37790 1
6nov2012 37800 2
6nov2012 37801 3
6nov2012 37802 4
7nov2012 37790 1
7nov2012 37800 2
7nov2012 37801 3
7nov2012 37802 4
8nov2012 37790 1
8nov2012 37800 2
8nov2012 37801 3
8nov2012 37802 4
;
proc sql;
create table want as
select *
from (select *
from have
where time gt 37800)
group by date
having time eq min(time)
;
quit;
Thank you so much Arthur. The code does work; however, there is still one slight setback. Since I'm looking at trades, there are some instances where a specific TIME has multiple trades. For example 10 trades may have occured at 37801. The code above works, but it brings up every trade that happened at that time. I want it to bring up the first observation that appears in the data set at that time (the first trade that appears at 37801 in this case) regardless of how many trades occured at 37801.
If order is important, and your data are already in the desired order, then I would suggest just using a datastep. e.g.:
data want;
set have (where=(time gt 37800));
by date;
if first.date;
run;
That's the one. Thank you, greatly appreciated Arthur.
You only want one of them ,not all ?
data have; informat date date9.; format date date9.; input date time price; cards; 6nov2012 37790 1 6nov2012 37800 2 6nov2012 37801 3 6nov2012 37802 4 7nov2012 37790 1 7nov2012 37800 2 7nov2012 37801 3 7nov2012 37802 4 8nov2012 37790 1 8nov2012 37800 2 8nov2012 37801 3 8nov2012 37802 4 ; run; proc sort data=have;by date time;run; data want; set have; by date; retain found 0; if first.date then found=0; if not found and time gt 37800 then do;output;found=1;end; run;
Ksharp
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.