Dear all,
I hope you are doing well despite the pandemic.
The content of the question is copied from https://communities.sas.com/t5/SAS-Programming/Computing-10-second-price-difference/td-p/716368 and then minor adjustments have been made.
I have the following dataset:
Time Volume Price Date
10:02:04 100 100.25 10.01.2021
10:02:07 200 100.35 10.01.2021
10:02:14 300 100.55 10.01.2021
10:02:16 100 100.60 10.01.2021
10:02:16 300 100.70 10.01.2021
10:02:23 200 100.80 10.01.2021
10:02:04 100 100.25 11.01.2021
10:02:07 200 100.35 11.01.2021
10:02:14 300 100.55 11.01.2021
10:02:16 100 100.60 11.01.2021
10:02:16 300 100.70 11.01.2021
10:02:23 200 100.80 11.01.2021
And here is what I want:
Time Volume Price Date Diff
10:02:04 100 100.25 10.01.2021 -0.0029 = (100.25 - 100.55)/100.25 = (price(t)-price(t+10))/price(t)
10:02:07 200 100.35 10.01.2021 -0.0032 = (100.35 - 100.675)/100.35
10:02:14 300 100.55 10.01.2021 -0.0012 = (100.55 - 100.675)/100.55
10:02:16 100 100.60 10.01.2021 . (because there is not observation within 10 seconds window)
10:02:16 300 100.70 10.01.2021
10:02:33 200 100.80 10.01.2021
The same computations for the next day
In short, I need to find price change over 10 seconds windows with some modifications:
1. .If I do not have p(t+10), then I need to select the price that is closest to p(t+10). If I do not have any value within 10 second windows then, I should get ".".
2. If I have two prices in one second, I need to compute the volume-weighted average price for this second and then use it. For example, in my data, I have an interval of 10:02:07. I do not have a price at 10:02:17 and the closest one is the price at 10:02:16. However, in this interval, I have two prices. Therefore, I am computing the volume-weighted average price for this interval (((100/400)*100.60 + (300/400)*100.70) = 100.675) and then use this one.
3. I need to do same computation for each day. Cross days are not allowed. So, each day will have a new start. I have separate date and time variable. Date format is DDMMYYN8. and time format is TOD.
Thank you!
A few tweaks to my original program gives you what you want.
data have;
input Time :time8. Volume Price Date :ddmmyy10.;
format Time time8. Date ddmmyy10.;
datalines;
10:02:04 100 100.25 10.01.2021
10:02:07 200 100.35 10.01.2021
10:02:14 300 100.55 10.01.2021
10:02:16 100 100.60 10.01.2021
10:02:16 300 100.70 10.01.2021
10:02:33 200 100.80 10.01.2021
10:02:04 100 100.25 11.01.2021
10:02:07 200 100.35 11.01.2021
10:02:14 300 100.55 11.01.2021
10:02:16 100 100.60 11.01.2021
10:02:16 300 100.70 11.01.2021
10:02:23 200 100.80 11.01.2021
;
data want(keep = Time Volume Price dif);
if _N_ = 1 then do;
dcl hash h(multidata : "Y", ordered : "Y");
h.definekey("d", "t");
h.definedata("d", "t", "v", "p");
h.definedone();
dcl hiter hi("h");
dcl hash hh();
hh.definekey("d", "t");
hh.definedata("s");
hh.definedone();
do until (z);
set have end = z;
h.add(key : date, key : time, data : date, data : time, data : Volume, data : Price);
if hh.find(key : date, key : Time) ne 0 then s = Volume;
else s + Volume;
hh.replace(key : date, key : Time, data : s);
end;
end;
set have;
d = .; t = .; v = .; p = .; s = .;
if h.find(key : Date, key : Time + 10) = 0 then mt = t;
else do;
rc = hi.setcur(key : Date, key : Time);
do i = 1 by 1 while(hi.next() = 0 & Date = d);
if t - time > 10 then do;
if i = 1 then mt = .;
leave;
end;
if t > time then mt = t;
end;
end;
rc = hh.find(key: date, key : mt);
rc = h.reset_dup();
do while (h.do_over(key : date, key : mt) = 0);
pp = sum(pp, divide(v, s) * p);
end;
dif = (Price - pp) / Price;
format dif 8.4;
run;
Result:
Time Volume Price dif 10:02:04 100 100.25 -0.0030 10:02:07 200 100.35 -0.0032 10:02:14 300 100.55 -0.0012 10:02:16 100 100.60 . 10:02:16 300 100.70 . 10:02:33 200 100.80 . 10:02:04 100 100.25 -0.0030 10:02:07 200 100.35 -0.0032 10:02:14 300 100.55 -0.0025 10:02:16 100 100.60 -0.0020 10:02:16 300 100.70 -0.0010 10:02:23 200 100.80 .
A few tweaks to my original program gives you what you want.
data have;
input Time :time8. Volume Price Date :ddmmyy10.;
format Time time8. Date ddmmyy10.;
datalines;
10:02:04 100 100.25 10.01.2021
10:02:07 200 100.35 10.01.2021
10:02:14 300 100.55 10.01.2021
10:02:16 100 100.60 10.01.2021
10:02:16 300 100.70 10.01.2021
10:02:33 200 100.80 10.01.2021
10:02:04 100 100.25 11.01.2021
10:02:07 200 100.35 11.01.2021
10:02:14 300 100.55 11.01.2021
10:02:16 100 100.60 11.01.2021
10:02:16 300 100.70 11.01.2021
10:02:23 200 100.80 11.01.2021
;
data want(keep = Time Volume Price dif);
if _N_ = 1 then do;
dcl hash h(multidata : "Y", ordered : "Y");
h.definekey("d", "t");
h.definedata("d", "t", "v", "p");
h.definedone();
dcl hiter hi("h");
dcl hash hh();
hh.definekey("d", "t");
hh.definedata("s");
hh.definedone();
do until (z);
set have end = z;
h.add(key : date, key : time, data : date, data : time, data : Volume, data : Price);
if hh.find(key : date, key : Time) ne 0 then s = Volume;
else s + Volume;
hh.replace(key : date, key : Time, data : s);
end;
end;
set have;
d = .; t = .; v = .; p = .; s = .;
if h.find(key : Date, key : Time + 10) = 0 then mt = t;
else do;
rc = hi.setcur(key : Date, key : Time);
do i = 1 by 1 while(hi.next() = 0 & Date = d);
if t - time > 10 then do;
if i = 1 then mt = .;
leave;
end;
if t > time then mt = t;
end;
end;
rc = hh.find(key: date, key : mt);
rc = h.reset_dup();
do while (h.do_over(key : date, key : mt) = 0);
pp = sum(pp, divide(v, s) * p);
end;
dif = (Price - pp) / Price;
format dif 8.4;
run;
Result:
Time Volume Price dif 10:02:04 100 100.25 -0.0030 10:02:07 200 100.35 -0.0032 10:02:14 300 100.55 -0.0012 10:02:16 100 100.60 . 10:02:16 300 100.70 . 10:02:33 200 100.80 . 10:02:04 100 100.25 -0.0030 10:02:07 200 100.35 -0.0032 10:02:14 300 100.55 -0.0025 10:02:16 100 100.60 -0.0020 10:02:16 300 100.70 -0.0010 10:02:23 200 100.80 .
data have; input Time :time8. Volume Price Date :ddmmyy10.; format Time time8. Date ddmmyy10.; datetime=dhms(date,0,0,time); datalines; 10:02:04 100 100.25 10.01.2021 10:02:07 200 100.35 10.01.2021 10:02:14 300 100.55 10.01.2021 10:02:16 100 100.60 10.01.2021 10:02:16 300 100.70 10.01.2021 10:02:33 200 100.80 10.01.2021 10:02:04 100 100.25 11.01.2021 10:02:07 200 100.35 11.01.2021 10:02:14 300 100.55 11.01.2021 10:02:16 100 100.60 11.01.2021 10:02:16 300 100.70 11.01.2021 10:02:23 200 100.80 11.01.2021 ; proc summary data=have nway; class dateTime ; var Price/weight=Volume; output out=x(drop=_:) mean=; run;proc sql; create table want as select *, (price-(select price from x where datetime between a.datetime+1 and a.datetime+10 having datetime=max(datetime))) /price as diff format = 8.4 from have as a; quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.