I am trying to find the return over a 5 minute period by using proc expand in SAS. I am working on financial time series data. The time data is in hr:min:sec. Sample data is given below:
date Co time price vol
01JAN2012 ABC 9:55:59 166.00 100
01JAN2012 ABC 9:56:02 165.40 50
01JAN2012 ABC 9:56:02 165.40 25
01JAN2012 ABC 9:56:03 166.45 700
01JAN2012 ABC 9:56:03 166.80 20
01JAN2012 ABC 9:56:03 167.00 280
01JAN2012 ABC 9:56:06 166.45 1
01JAN2012 ABC 9:56:06 166.05 24
01JAN2012 ABC 9:56:15 166.50 500
01JAN2012 ABC 9:56:16 166.50 25
01JAN2012 ABC 9:56:17 166.50 475
01JAN2012 ABC 9:56:22 166.25 10
01JAN2012 ABC 9:56:26 166.05 50
01JAN2012 ABC 9:56:30 166.25 20
01JAN2012 ABC 9:56:38 166.05 126
01JAN2012 ABC 9:56:40 166.00 100
01JAN2012 ABC 9:56:40 166.00 1
01JAN2012 ABC 9:56:40 166.00 1
01JAN2012 ABC 9:56:43 166.00 1
01JAN2012 ABC 9:56:46 166.00 2
But the output throws up 'time' as 0,1,2,3.... when I run the following code:
data temp;set alldata_sas7bdat; proc expand data=temp out=lagtst method=none; convert time=time_lag1 / transform=(lag 1); convert price=price_lag1 / transform=(lag 1); by date; run;
proc print data=lagtst (obs=5000); run;
proc delete data=temp; data lagtst_2;set lagtst; timegap = time-time_lag1; if timegap > 0 and timegap < 5 then return=log(price/price_lag1); else return=.; run;
While I have tried to calculate return over a '5 minute' period, the data I have posted here is not enough. I can provide more data but I wonder if it would be too cumbersome here. So please suggest a method for finding out return of a '5 second' period. I will then make changes to find return over a 5-minute period. Thanks.
You can also achieve it by using sql. Here is for 5-seconds period .
data have; input date :date9. Co $ time : time9. price vol ; format date date9. time time.; cards; 01JAN2012 ABC 9:55:59 166.00 100 01JAN2012 ABC 9:56:02 165.40 50 01JAN2012 ABC 9:56:02 165.40 25 01JAN2012 ABC 9:56:03 166.45 700 01JAN2012 ABC 9:56:03 166.80 20 01JAN2012 ABC 9:56:03 167.00 280 01JAN2012 ABC 9:56:06 166.45 1 01JAN2012 ABC 9:56:06 166.05 24 01JAN2012 ABC 9:56:15 166.50 500 01JAN2012 ABC 9:56:16 166.50 25 01JAN2012 ABC 9:56:17 166.50 475 01JAN2012 ABC 9:56:22 166.25 10 01JAN2012 ABC 9:56:26 166.05 50 01JAN2012 ABC 9:56:30 166.25 20 01JAN2012 ABC 9:56:38 166.05 126 01JAN2012 ABC 9:56:40 166.00 100 01JAN2012 ABC 9:56:40 166.00 1 01JAN2012 ABC 9:56:40 166.00 1 01JAN2012 ABC 9:56:43 166.00 1 01JAN2012 ABC 9:56:46 166.00 2 ; run; proc sql; create table want as select *,log(price/(select price from have where date=a.date and time+5=a.time)) as return from have as a; quit;
Xia Keshan
Thank you very much..
However, the program gives a blank 'return'.
date stk time price vol return
01JAN2003 ABC 9:55:59 166.00 100 .
I have used the program giving the path to the datafiles.
libname in 'c:\Users\VLM\Stk data';
data temp;set in.ABC_sas7bdat;
run;
proc sql;
create table want as
select *,log(price/(select price from temp where date=temp.date and time+5=temp.time)) as return
from temp;
quit;
That is because there is no price at that time . haikuo bian has already pointed out it .
I make a TEMP dataset to hold every second price value.
data have; input date :date9. Co $ time : time9. price vol ; format date date9. time time.; cards; 01JAN2012 ABC 9:55:59 166.00 100 01JAN2012 ABC 9:56:02 165.40 50 01JAN2012 ABC 9:56:02 165.40 25 01JAN2012 ABC 9:56:03 166.45 700 01JAN2012 ABC 9:56:03 166.80 20 01JAN2012 ABC 9:56:03 167.00 280 01JAN2012 ABC 9:56:06 166.45 1 01JAN2012 ABC 9:56:06 166.05 24 01JAN2012 ABC 9:56:15 166.50 500 01JAN2012 ABC 9:56:16 166.50 25 01JAN2012 ABC 9:56:17 166.50 475 01JAN2012 ABC 9:56:22 166.25 10 01JAN2012 ABC 9:56:26 166.05 50 01JAN2012 ABC 9:56:30 166.25 20 01JAN2012 ABC 9:56:38 166.05 126 01JAN2012 ABC 9:56:40 166.00 100 01JAN2012 ABC 9:56:40 166.00 1 01JAN2012 ABC 9:56:40 166.00 1 01JAN2012 ABC 9:56:43 166.00 1 01JAN2012 ABC 9:56:46 166.00 2 ; run; data temp; merge have have(firstobs=2 keep=date time rename=(date=_date time=_time)); output; if date eq _date then do; do i=time+1 to _time-1; time=i;output; end; end; run; proc sql; create table want as select *,log(price/(select price from temp where date=a.date and time+5=a.time)) as return from have as a; quit;
Xia Keshan
's code assumes that you will have one unique data entry per second, obviously it is not the case, but the logic of his code still holds:
data have;
input date :date9. Co $ time : time9. price vol;
format date date9. time time.;
cards;
01JAN2012 ABC 9:55:59 166.00 100
01JAN2012 ABC 9:56:02 165.40 50
01JAN2012 ABC 9:56:02 165.40 25
01JAN2012 ABC 9:56:03 166.45 700
01JAN2012 ABC 9:56:03 166.80 20
01JAN2012 ABC 9:56:03 167.00 280
01JAN2012 ABC 9:56:06 166.45 1
01JAN2012 ABC 9:56:06 166.05 24
01JAN2012 ABC 9:56:15 166.50 500
01JAN2012 ABC 9:56:16 166.50 25
01JAN2012 ABC 9:56:17 166.50 475
01JAN2012 ABC 9:56:22 166.25 10
01JAN2012 ABC 9:56:26 166.05 50
01JAN2012 ABC 9:56:30 166.25 20
01JAN2012 ABC 9:56:38 166.05 126
01JAN2012 ABC 9:56:40 166.00 100
01JAN2012 ABC 9:56:40 166.00 1
01JAN2012 ABC 9:56:40 166.00 1
01JAN2012 ABC 9:56:43 166.00 1
01JAN2012 ABC 9:56:46 166.00 2
;
run;
proc sql;
create table want as
select *,log(price/(select avg(price) from (select price from have where date=a.date and co=a.co and time between a.time and a.time-5 group by co, date having time=min(time) ))) as return
from have as a;
quit;
Above code takes mean value if you have multiple "price" for the same "second", alternatively, you can also choose to use Max or Min price.
Haikuo
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.