Dear all,
I hope you are doing well despite the pandemic.
The content of the question is copied from https://communities.sas.com/t5/SAS-Programming/Computing-10-second-price-difference/td-p/716368 and then minor adjustments have been made.
I have the following dataset:
Time Volume Price Date
10:02:04 100 100.25 10.01.2021
10:02:07 200 100.35 10.01.2021
10:02:14 300 100.55 10.01.2021
10:02:16 100 100.60 10.01.2021
10:02:16 300 100.70 10.01.2021
10:02:23 200 100.80 10.01.2021
10:02:04 100 100.25 11.01.2021
10:02:07 200 100.35 11.01.2021
10:02:14 300 100.55 11.01.2021
10:02:16 100 100.60 11.01.2021
10:02:16 300 100.70 11.01.2021
10:02:23 200 100.80 11.01.2021
And here is what I want:
Time Volume Price Date Diff
10:02:04 100 100.25 10.01.2021 -0.0029 = (100.25 - 100.55)/100.25 = (price(t)-price(t+10))/price(t)
10:02:07 200 100.35 10.01.2021 -0.0032 = (100.35 - 100.675)/100.35
10:02:14 300 100.55 10.01.2021 -0.0012 = (100.55 - 100.675)/100.55
10:02:16 100 100.60 10.01.2021 . (because there is not observation within 10 seconds window)
10:02:16 300 100.70 10.01.2021
10:02:33 200 100.80 10.01.2021
The same computations for the next day
In short, I need to find price change over 10 seconds windows with some modifications:
1. .If I do not have p(t+10), then I need to select the price that is closest to p(t+10). If I do not have any value within 10 second windows then, I should get ".".
2. If I have two prices in one second, I need to compute the volume-weighted average price for this second and then use it. For example, in my data, I have an interval of 10:02:07. I do not have a price at 10:02:17 and the closest one is the price at 10:02:16. However, in this interval, I have two prices. Therefore, I am computing the volume-weighted average price for this interval (((100/400)*100.60 + (300/400)*100.70) = 100.675) and then use this one.
3. I need to do same computation for each day. Cross days are not allowed. So, each day will have a new start. I have separate date and time variable. Date format is DDMMYYN8. and time format is TOD.
Thank you!
A few tweaks to my original program gives you what you want.
data have;
input Time :time8. Volume Price Date :ddmmyy10.;
format Time time8. Date ddmmyy10.;
datalines;
10:02:04 100 100.25 10.01.2021
10:02:07 200 100.35 10.01.2021
10:02:14 300 100.55 10.01.2021
10:02:16 100 100.60 10.01.2021
10:02:16 300 100.70 10.01.2021
10:02:33 200 100.80 10.01.2021
10:02:04 100 100.25 11.01.2021
10:02:07 200 100.35 11.01.2021
10:02:14 300 100.55 11.01.2021
10:02:16 100 100.60 11.01.2021
10:02:16 300 100.70 11.01.2021
10:02:23 200 100.80 11.01.2021
;
data want(keep = Time Volume Price dif);
if _N_ = 1 then do;
dcl hash h(multidata : "Y", ordered : "Y");
h.definekey("d", "t");
h.definedata("d", "t", "v", "p");
h.definedone();
dcl hiter hi("h");
dcl hash hh();
hh.definekey("d", "t");
hh.definedata("s");
hh.definedone();
do until (z);
set have end = z;
h.add(key : date, key : time, data : date, data : time, data : Volume, data : Price);
if hh.find(key : date, key : Time) ne 0 then s = Volume;
else s + Volume;
hh.replace(key : date, key : Time, data : s);
end;
end;
set have;
d = .; t = .; v = .; p = .; s = .;
if h.find(key : Date, key : Time + 10) = 0 then mt = t;
else do;
rc = hi.setcur(key : Date, key : Time);
do i = 1 by 1 while(hi.next() = 0 & Date = d);
if t - time > 10 then do;
if i = 1 then mt = .;
leave;
end;
if t > time then mt = t;
end;
end;
rc = hh.find(key: date, key : mt);
rc = h.reset_dup();
do while (h.do_over(key : date, key : mt) = 0);
pp = sum(pp, divide(v, s) * p);
end;
dif = (Price - pp) / Price;
format dif 8.4;
run;
Result:
Time Volume Price dif 10:02:04 100 100.25 -0.0030 10:02:07 200 100.35 -0.0032 10:02:14 300 100.55 -0.0012 10:02:16 100 100.60 . 10:02:16 300 100.70 . 10:02:33 200 100.80 . 10:02:04 100 100.25 -0.0030 10:02:07 200 100.35 -0.0032 10:02:14 300 100.55 -0.0025 10:02:16 100 100.60 -0.0020 10:02:16 300 100.70 -0.0010 10:02:23 200 100.80 .
A few tweaks to my original program gives you what you want.
data have;
input Time :time8. Volume Price Date :ddmmyy10.;
format Time time8. Date ddmmyy10.;
datalines;
10:02:04 100 100.25 10.01.2021
10:02:07 200 100.35 10.01.2021
10:02:14 300 100.55 10.01.2021
10:02:16 100 100.60 10.01.2021
10:02:16 300 100.70 10.01.2021
10:02:33 200 100.80 10.01.2021
10:02:04 100 100.25 11.01.2021
10:02:07 200 100.35 11.01.2021
10:02:14 300 100.55 11.01.2021
10:02:16 100 100.60 11.01.2021
10:02:16 300 100.70 11.01.2021
10:02:23 200 100.80 11.01.2021
;
data want(keep = Time Volume Price dif);
if _N_ = 1 then do;
dcl hash h(multidata : "Y", ordered : "Y");
h.definekey("d", "t");
h.definedata("d", "t", "v", "p");
h.definedone();
dcl hiter hi("h");
dcl hash hh();
hh.definekey("d", "t");
hh.definedata("s");
hh.definedone();
do until (z);
set have end = z;
h.add(key : date, key : time, data : date, data : time, data : Volume, data : Price);
if hh.find(key : date, key : Time) ne 0 then s = Volume;
else s + Volume;
hh.replace(key : date, key : Time, data : s);
end;
end;
set have;
d = .; t = .; v = .; p = .; s = .;
if h.find(key : Date, key : Time + 10) = 0 then mt = t;
else do;
rc = hi.setcur(key : Date, key : Time);
do i = 1 by 1 while(hi.next() = 0 & Date = d);
if t - time > 10 then do;
if i = 1 then mt = .;
leave;
end;
if t > time then mt = t;
end;
end;
rc = hh.find(key: date, key : mt);
rc = h.reset_dup();
do while (h.do_over(key : date, key : mt) = 0);
pp = sum(pp, divide(v, s) * p);
end;
dif = (Price - pp) / Price;
format dif 8.4;
run;
Result:
Time Volume Price dif 10:02:04 100 100.25 -0.0030 10:02:07 200 100.35 -0.0032 10:02:14 300 100.55 -0.0012 10:02:16 100 100.60 . 10:02:16 300 100.70 . 10:02:33 200 100.80 . 10:02:04 100 100.25 -0.0030 10:02:07 200 100.35 -0.0032 10:02:14 300 100.55 -0.0025 10:02:16 100 100.60 -0.0020 10:02:16 300 100.70 -0.0010 10:02:23 200 100.80 .
data have; input Time :time8. Volume Price Date :ddmmyy10.; format Time time8. Date ddmmyy10.; datetime=dhms(date,0,0,time); datalines; 10:02:04 100 100.25 10.01.2021 10:02:07 200 100.35 10.01.2021 10:02:14 300 100.55 10.01.2021 10:02:16 100 100.60 10.01.2021 10:02:16 300 100.70 10.01.2021 10:02:33 200 100.80 10.01.2021 10:02:04 100 100.25 11.01.2021 10:02:07 200 100.35 11.01.2021 10:02:14 300 100.55 11.01.2021 10:02:16 100 100.60 11.01.2021 10:02:16 300 100.70 11.01.2021 10:02:23 200 100.80 11.01.2021 ; proc summary data=have nway; class dateTime ; var Price/weight=Volume; output out=x(drop=_:) mean=; run;proc sql; create table want as select *, (price-(select price from x where datetime between a.datetime+1 and a.datetime+10 having datetime=max(datetime))) /price as diff format = 8.4 from have as a; quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.