Suppose that I have the following dataset:
Time Volume Price
10:02:04 100 100.25
10:02:07 200 100.35
10:02:14 300 100.55
10:02:18 100 100.60
10:02:18 300 100.70
10:02:23 200 100.80
And, I want to get the following dataset
Time Volume Price Diff
10:02:04 100 100.25 -0.0045 = (100.25 - 100.70)/100.25 = (price(t)-price(t+10))/price(t)
10:02:07 200 100.35 -0.0032 = (100.35 - 100.675)/100.35
10:02:14 300 100.55 -0.0012 = (100.55 - 100.675)/100.55
10:02:16 100 100.60 -0.0009 = (100.60 - 100.70)/100.60
10:02:16 300 100.70
So, as you can see, I need to find price change over 10 seconds (p(t) - p(t+10)) and then divide it by p(t). However, there are some issues. First, if I do not have p(t+10), then I need to select the price that is closest to p(t+10). Furthermore, if I have two prices in one second, I need to compute the volume-weighted average price for this second and then use it. For example, in my data, I have an interval of 10:02:07. I do not have a price at 10:02:17 and the closest one is the price at 10:02:16. However, in this interval, I have two prices. Therefore, I am computing the volume-weighted average price for this interval (((100/400)*100.60 + (300/400)*100.70) = 100.675) and then use this one.
Could you please help me with this?
I hope I could explain my question.
@Khaladdin If you go for the SQL approach, you have to read from have instead of x like this
proc sql;
create table want as
select *,
(price-(select price from x where time between a.time and a.time+10 having time=max(time)))/price as diff format = 8.4
from have as a;
quit;
Is time an actual SAS Time value?
Yes.
Also, why is it 100.70 in the first obs: (100.25 - 100.70)/100.25 ?
Shouldn't that be (100.25 - 100.55)/100.25 ?
Yes, sorry. It should be 100.55
Also. Is your data grouped by some ID variable?
No, I do not have ID in my data.
Try this
data have;
input Time :time8. Volume Price;
format Time time8.;
datalines;
10:02:04 100 100.25
10:02:07 200 100.35
10:02:14 300 100.55
10:02:16 100 100.60
10:02:16 300 100.70
10:02:23 200 100.80
;
data want(keep = Time Volume Price dif);
if _N_ = 1 then do;
dcl hash h(multidata : "Y");
h.definekey("t");
h.definedata("t", "v", "p");
h.definedone();
dcl hiter i("h");
dcl hash hh();
hh.definekey("t");
hh.definedata("t", "s");
hh.definedone();
do until (z);
set have end = z;
h.add(key : time, data : time, data : Volume, data : Price);
if hh.find(key : Time) ne 0 then s = Volume;
else s + Volume;
hh.replace(key : Time, data : Time, data : s);
end;
end;
set have;
t = .; v = .; p = .; s = .;
aa = constant('big');
if h.find(key : Time + 10) = 0 then dif = (Price - p) / Price;
else do;
do while (i.next() = 0);
if abs((time + 10) - t) < aa then do;
mt = t;
aa = abs((time + 10) - t);
end;
end;
rc = hh.find(key : mt);
do while (h.do_over(key : mt) = 0);
pp = sum(pp, divide(v, s) * p);
end;
dif = (Price - pp) / Price;
end;
format dif 8.4 t mt time8.;
run;
Result:
Time Volume Price dif 10:02:04 100 100.25 -0.0030 10:02:07 200 100.35 -0.0032 10:02:14 300 100.55 -0.0025 10:02:16 100 100.60 -0.0020 10:02:16 300 100.70 -0.0010 10:02:23 200 100.80 0.0000
It works well in the existing, however, when I extend the data, I have some problems. Let's say, we have the following data:
data have;
input Time :time8. Volume Price;
format Time time8.;
datalines;
10:02:04 100 100.25
10:02:07 200 100.35
10:02:14 300 100.55
10:02:16 100 100.60
10:02:16 300 100.70
10:02:23 200 100.80
10:02:35 100 100.90
10:02:47 300 101.20
10:02:53 200 101.40
10:03:11 100 101.60
10:03:33 300 101.70
10:03:37 200 101.80
10:03:45 100 101.60
10:03:53 300 101.70
10:04:32 200 102.80
;
When I use this method, at 10:03:11, I am getting 0. Normally, it should be (101.60 - 101.70)/101.60. Furthermore, at time 10:03:53, I should get (101.70 - 102.80)/101.70. However, I am getting 0 with this code.
Ok. So we should not be restricted to 'closest within' 10 seconds, correct?
Yes.
Ok. Just made a small correction.
See if this works
data have;
input Time :time8. Volume Price;
format Time time8.;
datalines;
10:02:04 100 100.25
10:02:07 200 100.35
10:02:14 300 100.55
10:02:16 100 100.60
10:02:16 300 100.70
10:02:23 200 100.80
10:02:35 100 100.90
10:02:47 300 101.20
10:02:53 200 101.40
10:03:11 100 101.60
10:03:33 300 101.70
10:03:37 200 101.80
10:03:45 100 101.60
10:03:53 300 101.70
10:04:32 200 102.80
;
data want(keep = Time Volume Price dif);
if _N_ = 1 then do;
dcl hash h(multidata : "Y");
h.definekey("t");
h.definedata("t", "v", "p");
h.definedone();
dcl hiter i("h");
dcl hash hh();
hh.definekey("t");
hh.definedata("t", "s");
hh.definedone();
do until (z);
set have end = z;
h.add(key : time, data : time, data : Volume, data : Price);
if hh.find(key : Time) ne 0 then s = Volume;
else s + Volume;
hh.replace(key : Time, data : Time, data : s);
end;
end;
set have;
t = .; v = .; p = .; s = .;
aa = constant('big');
if h.find(key : Time + 10) = 0 then dif = (Price - p) / Price;
else do;
do while (i.next() = 0);
if abs((time + 10) - t) < aa and Time ne t then do;
mt = t;
aa = abs((time + 10) - t);
end;
end;
rc = hh.find(key : mt);
do while (h.do_over(key : mt) = 0);
pp = sum(pp, divide(v, s) * p);
end;
dif = (Price - pp) / Price;
end;
format dif 8.4 t mt time8.;
run;
Really sorry. Still small issue. Normally, if there is a value within 10 seconds, then this value should be used first. However, if no value is available within 10 seconds, then the next one should be used (closest from outside 10 seconds).
data have; input Time :time8. Volume Price; format Time time8.; datalines; 10:02:04 100 100.25 10:02:07 200 100.35 10:02:14 300 100.55 10:02:16 100 100.60 10:02:16 300 100.70 10:02:23 200 100.80 ; proc summary data=have nway; class Time ; var Price/weight=Volume; output out=x(drop=_:) mean=; run; proc sql; create table want as select *, (price-(select price from x where time between a.time and a.time+10 having time=max(time)))/price as diff from x as a; quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.