BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Khaladdin
Quartz | Level 8

Suppose that I have the following dataset:

 

Time                          Volume                 Price

10:02:04                     100                     100.25

10:02:07                     200                     100.35

10:02:14                     300                     100.55

10:02:18                     100                     100.60

10:02:18                     300                     100.70

10:02:23                     200                     100.80

 

And, I want to get the following dataset

 

Time                          Volume          Price            Diff

10:02:04                     100              100.25          -0.0045 = (100.25 - 100.70)/100.25 = (price(t)-price(t+10))/price(t)

10:02:07                     200              100.35          -0.0032 = (100.35 - 100.675)/100.35

10:02:14                     300              100.55          -0.0012 = (100.55 - 100.675)/100.55

10:02:16                     100              100.60          -0.0009 = (100.60 - 100.70)/100.60

10:02:16                     300              100.70

 

So, as you can see, I need to find price change over 10 seconds (p(t) - p(t+10)) and then divide it by p(t). However, there are some issues. First, if I do not have p(t+10), then I need to select the price that is closest to p(t+10). Furthermore, if I have two prices in one second, I need to compute the volume-weighted average price for this second and then use it. For example, in my data, I have an interval of 10:02:07. I do not have a price at 10:02:17 and the closest one is the price at 10:02:16. However, in this interval, I have two prices. Therefore, I am computing the volume-weighted average price for this interval (((100/400)*100.60 + (300/400)*100.70) = 100.675) and then use this one. 

 

Could you please help me with this?

 

I hope I could explain my question. 

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

@Khaladdin If you go for the SQL approach, you have to read from have instead of x like this

 

proc sql;
create table want as
 select *,
  (price-(select price from x where time between a.time and a.time+10 having time=max(time)))/price as diff format = 8.4
  from have as a;
quit;

View solution in original post

36 REPLIES 36
PeterClemmensen
Tourmaline | Level 20

Is time an actual SAS Time value?

PeterClemmensen
Tourmaline | Level 20

Also, why is it 100.70 in the first obs: (100.25 - 100.70)/100.25 ?

 

Shouldn't that be (100.25 - 100.55)/100.25 ?

Khaladdin
Quartz | Level 8

Yes, sorry. It should be 100.55

PeterClemmensen
Tourmaline | Level 20

Also. Is your data grouped by some ID variable?

Khaladdin
Quartz | Level 8

No, I do not have ID in my data.

Khaladdin
Quartz | Level 8
In the first observation, it should be (100.25 - 100.55)/100.25
PeterClemmensen
Tourmaline | Level 20

Try this

 

data have;
input Time :time8. Volume Price;
format Time time8.;
datalines;
10:02:04 100 100.25
10:02:07 200 100.35
10:02:14 300 100.55
10:02:16 100 100.60
10:02:16 300 100.70
10:02:23 200 100.80
;

data want(keep = Time Volume Price dif);
   if _N_ = 1 then do;
      dcl hash h(multidata : "Y");
      h.definekey("t");
      h.definedata("t", "v", "p");
      h.definedone();
      dcl hiter i("h");

      dcl hash hh();
      hh.definekey("t");
      hh.definedata("t", "s");
      hh.definedone();

      do until (z);
         set have end = z;
         h.add(key : time, data : time, data : Volume, data : Price);
         if hh.find(key : Time) ne 0 then s = Volume;
         else                   s + Volume;
         hh.replace(key : Time, data : Time, data : s);
      end;
   end;

   set have;
   t = .; v = .; p = .; s = .;
   aa = constant('big');

   if h.find(key : Time + 10) = 0 then dif = (Price - p) / Price;

   else do;
      do while (i.next() = 0);
         if abs((time + 10) - t) < aa then do;
            mt = t;
            aa = abs((time + 10) - t);
         end;
      end;

      rc = hh.find(key : mt);

      do while (h.do_over(key : mt) = 0);
         pp = sum(pp, divide(v, s) * p);
      end;

      dif = (Price - pp) / Price;
   end;

   format dif 8.4 t mt time8.;

run;

 

Result:

 

Time      Volume  Price   dif 
10:02:04  100     100.25  -0.0030 
10:02:07  200     100.35  -0.0032 
10:02:14  300     100.55  -0.0025 
10:02:16  100     100.60  -0.0020 
10:02:16  300     100.70  -0.0010 
10:02:23  200     100.80   0.0000 
Khaladdin
Quartz | Level 8

It works well in the existing, however, when I extend the data, I have some problems. Let's say, we have the following data:

 

data have;
input Time :time8. Volume Price;
format Time time8.;
datalines;
10:02:04 100 100.25
10:02:07 200 100.35
10:02:14 300 100.55
10:02:16 100 100.60
10:02:16 300 100.70
10:02:23 200 100.80
10:02:35 100 100.90
10:02:47 300 101.20
10:02:53 200 101.40
10:03:11 100 101.60
10:03:33 300 101.70
10:03:37 200 101.80
10:03:45 100 101.60
10:03:53 300 101.70
10:04:32 200 102.80
;

 

When I use this method, at 10:03:11, I am getting 0. Normally, it should be (101.60 - 101.70)/101.60. Furthermore, at time 10:03:53, I should get (101.70 - 102.80)/101.70. However, I am getting 0 with this code.

PeterClemmensen
Tourmaline | Level 20

Ok. So we should not be restricted to 'closest within' 10 seconds, correct?

PeterClemmensen
Tourmaline | Level 20

Ok. Just made a small correction. 

 

See if this works

 

data have;
input Time :time8. Volume Price;
format Time time8.;
datalines;
10:02:04 100 100.25
10:02:07 200 100.35
10:02:14 300 100.55
10:02:16 100 100.60
10:02:16 300 100.70
10:02:23 200 100.80
10:02:35 100 100.90
10:02:47 300 101.20
10:02:53 200 101.40
10:03:11 100 101.60
10:03:33 300 101.70
10:03:37 200 101.80
10:03:45 100 101.60
10:03:53 300 101.70
10:04:32 200 102.80
;

data want(keep = Time Volume Price dif);
   if _N_ = 1 then do;
      dcl hash h(multidata : "Y");
      h.definekey("t");
      h.definedata("t", "v", "p");
      h.definedone();
      dcl hiter i("h");

      dcl hash hh();
      hh.definekey("t");
      hh.definedata("t", "s");
      hh.definedone();

      do until (z);
         set have end = z;
         h.add(key : time, data : time, data : Volume, data : Price);
         if hh.find(key : Time) ne 0 then s = Volume;
         else                   s + Volume;
         hh.replace(key : Time, data : Time, data : s);
      end;
   end;

   set have;
   t = .; v = .; p = .; s = .;
   aa = constant('big');

   if h.find(key : Time + 10) = 0 then dif = (Price - p) / Price;

   else do;
      do while (i.next() = 0);
         if abs((time + 10) - t) < aa and Time ne t then do;
            mt = t;
            aa = abs((time + 10) - t);
         end;
      end;

      rc = hh.find(key : mt);

      do while (h.do_over(key : mt) = 0);
         pp = sum(pp, divide(v, s) * p);
      end;

      dif = (Price - pp) / Price;
   end;

   format dif 8.4 t mt time8.;

run;
Khaladdin
Quartz | Level 8

Really sorry. Still small issue. Normally, if there is a value within 10 seconds, then this value should be used first. However, if no value is available within 10 seconds, then the next one should be used (closest from outside 10 seconds).

Ksharp
Super User
data have;
input Time :time8. Volume Price;
format Time time8.;
datalines;
10:02:04 100 100.25
10:02:07 200 100.35
10:02:14 300 100.55
10:02:16 100 100.60
10:02:16 300 100.70
10:02:23 200 100.80
;

proc summary data=have nway;
class Time ;
var Price/weight=Volume;
output out=x(drop=_:) mean=;
run;
proc sql;
create table want as
 select *,
  (price-(select price from x where time between a.time and a.time+10 having time=max(time)))/price as diff
  from x as a;
quit;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 36 replies
  • 1733 views
  • 9 likes
  • 4 in conversation