I am trying to find the return over a 5 minute period by using proc expand in SAS. I am working on financial time series data. The time data is in hr:min:sec. Sample data is given below:
date Co time price vol
01JAN2012 ABC 9:55:59 166.00 100
01JAN2012 ABC 9:56:02 165.40 50
01JAN2012 ABC 9:56:02 165.40 25
01JAN2012 ABC 9:56:03 166.45 700
01JAN2012 ABC 9:56:03 166.80 20
01JAN2012 ABC 9:56:03 167.00 280
01JAN2012 ABC 9:56:06 166.45 1
01JAN2012 ABC 9:56:06 166.05 24
01JAN2012 ABC 9:56:15 166.50 500
01JAN2012 ABC 9:56:16 166.50 25
01JAN2012 ABC 9:56:17 166.50 475
01JAN2012 ABC 9:56:22 166.25 10
01JAN2012 ABC 9:56:26 166.05 50
01JAN2012 ABC 9:56:30 166.25 20
01JAN2012 ABC 9:56:38 166.05 126
01JAN2012 ABC 9:56:40 166.00 100
01JAN2012 ABC 9:56:40 166.00 1
01JAN2012 ABC 9:56:40 166.00 1
01JAN2012 ABC 9:56:43 166.00 1
01JAN2012 ABC 9:56:46 166.00 2
But the output throws up 'time' as 0,1,2,3.... when I run the following code:
data temp;set alldata_sas7bdat; proc expand data=temp out=lagtst method=none; convert time=time_lag1 / transform=(lag 1); convert price=price_lag1 / transform=(lag 1); by date; run; proc print data=lagtst (obs=5000); run;
proc delete data=temp; data lagtst_2;set lagtst; timegap = time-time_lag1; if timegap > 0 and timegap < 5 then return=log(price/price_lag1); else return=.; run; While I have tried to calculate return over a '5 minute' period, the data I have posted here is not enough. I can provide more data but I wonder if it would be too cumbersome here. So please suggest a method for finding out return of a '5 second' period. I will then make changes to find return over a 5-minute period. Thanks.
You can also achieve it by using sql. Here is for 5-seconds period .
data have; input date :date9. Co $ time : time9. price vol ; format date date9. time time.; cards; 01JAN2012 ABC 9:55:59 166.00 100 01JAN2012 ABC 9:56:02 165.40 50 01JAN2012 ABC 9:56:02 165.40 25 01JAN2012 ABC 9:56:03 166.45 700 01JAN2012 ABC 9:56:03 166.80 20 01JAN2012 ABC 9:56:03 167.00 280 01JAN2012 ABC 9:56:06 166.45 1 01JAN2012 ABC 9:56:06 166.05 24 01JAN2012 ABC 9:56:15 166.50 500 01JAN2012 ABC 9:56:16 166.50 25 01JAN2012 ABC 9:56:17 166.50 475 01JAN2012 ABC 9:56:22 166.25 10 01JAN2012 ABC 9:56:26 166.05 50 01JAN2012 ABC 9:56:30 166.25 20 01JAN2012 ABC 9:56:38 166.05 126 01JAN2012 ABC 9:56:40 166.00 100 01JAN2012 ABC 9:56:40 166.00 1 01JAN2012 ABC 9:56:40 166.00 1 01JAN2012 ABC 9:56:43 166.00 1 01JAN2012 ABC 9:56:46 166.00 2 ; run; proc sql; create table want as select *,log(price/(select price from have where date=a.date and time+5=a.time)) as return from have as a; quit;
Xia Keshan
Thank you very much..
However, the program gives a blank 'return'.
date stk time price vol return
01JAN2003 ABC 9:55:59 166.00 100 .
I have used the program giving the path to the datafiles.
libname in 'c:\Users\VLM\Stk data';
data temp;set in.ABC_sas7bdat;
run;
proc sql;
create table want as
select *,log(price/(select price from temp where date=temp.date and time+5=temp.time)) as return
from temp;
quit;
That is because there is no price at that time . haikuo bian has already pointed out it .
I make a TEMP dataset to hold every second price value.
data have; input date :date9. Co $ time : time9. price vol ; format date date9. time time.; cards; 01JAN2012 ABC 9:55:59 166.00 100 01JAN2012 ABC 9:56:02 165.40 50 01JAN2012 ABC 9:56:02 165.40 25 01JAN2012 ABC 9:56:03 166.45 700 01JAN2012 ABC 9:56:03 166.80 20 01JAN2012 ABC 9:56:03 167.00 280 01JAN2012 ABC 9:56:06 166.45 1 01JAN2012 ABC 9:56:06 166.05 24 01JAN2012 ABC 9:56:15 166.50 500 01JAN2012 ABC 9:56:16 166.50 25 01JAN2012 ABC 9:56:17 166.50 475 01JAN2012 ABC 9:56:22 166.25 10 01JAN2012 ABC 9:56:26 166.05 50 01JAN2012 ABC 9:56:30 166.25 20 01JAN2012 ABC 9:56:38 166.05 126 01JAN2012 ABC 9:56:40 166.00 100 01JAN2012 ABC 9:56:40 166.00 1 01JAN2012 ABC 9:56:40 166.00 1 01JAN2012 ABC 9:56:43 166.00 1 01JAN2012 ABC 9:56:46 166.00 2 ; run; data temp; merge have have(firstobs=2 keep=date time rename=(date=_date time=_time)); output; if date eq _date then do; do i=time+1 to _time-1; time=i;output; end; end; run; proc sql; create table want as select *,log(price/(select price from temp where date=a.date and time+5=a.time)) as return from have as a; quit;
Xia Keshan
's code assumes that you will have one unique data entry per second, obviously it is not the case, but the logic of his code still holds:
data have;
input date :date9. Co $ time : time9. price vol;
format date date9. time time.;
cards;
01JAN2012 ABC 9:55:59 166.00 100
01JAN2012 ABC 9:56:02 165.40 50
01JAN2012 ABC 9:56:02 165.40 25
01JAN2012 ABC 9:56:03 166.45 700
01JAN2012 ABC 9:56:03 166.80 20
01JAN2012 ABC 9:56:03 167.00 280
01JAN2012 ABC 9:56:06 166.45 1
01JAN2012 ABC 9:56:06 166.05 24
01JAN2012 ABC 9:56:15 166.50 500
01JAN2012 ABC 9:56:16 166.50 25
01JAN2012 ABC 9:56:17 166.50 475
01JAN2012 ABC 9:56:22 166.25 10
01JAN2012 ABC 9:56:26 166.05 50
01JAN2012 ABC 9:56:30 166.25 20
01JAN2012 ABC 9:56:38 166.05 126
01JAN2012 ABC 9:56:40 166.00 100
01JAN2012 ABC 9:56:40 166.00 1
01JAN2012 ABC 9:56:40 166.00 1
01JAN2012 ABC 9:56:43 166.00 1
01JAN2012 ABC 9:56:46 166.00 2
;
run;
proc sql;
create table want as
select *,log(price/(select avg(price) from (select price from have where date=a.date and co=a.co and time between a.time and a.time-5 group by co, date having time=min(time) ))) as return
from have as a;
quit;
Above code takes mean value if you have multiple "price" for the same "second", alternatively, you can also choose to use Max or Min price.
Haikuo
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.