- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I appreciate any help in setting up a program in sas where I can measure price changes in an interval of 2 seconds using a large dataset. SQL is taking a lot of time. The thing is for each 2 seconds, I determine the price changes at any point of time (let's say pt), and then take the starting price of the interval, and the ending price of the interval (around the price changes), to compute the difference between the prices and compute some ratio out of these differences. The computation is done in an overlapping window of 2 seconds. Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Please follow the guidance next to the post a question button. Post test data in the form of a datastep, and what the output should look like. Should straight forward to have a retained variable, which when > 00:02 difference flips over to 0 again, but need to see some data.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you very much for your reply. I have attached a sample of my data. I have included a small sample of a given date between 8:47:00 and 10:00:00. "Changes" variable is when the price changes. Whenever price changes, I will take the price at the time of changes, let's say Pt.
I need to take two prices at different times of the same interval of 2 seconds: one second before price changes (StartPrice) and one second after price changes (EndPrice). I then compute three differences: |StartPrice-Pt| and |EndPrice-Pt| and |StartPrice-EndPrice|. These three values are used then to compute a certain ratio. This computation needs to be done in an overalpping window of 2 seconds, across several days. Please let me know if it is clear.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.
Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for the clarification. I am new to this. Please find attached the txt file of sample data, as I don't have the SAS EU. Please let me know if this works. Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Here's what an example datastep of data would look like;
data have; informat date anydtdte. time time11.3 price changes best10. ; format date mmddyy10. time time11.3; input Date Time Price changes; dttme = dhms(date,0,0,time); format dttme datetime22.3; datalines; 15-OCT-2014 8:47:00.041 129.546875 . 15-OCT-2014 8:47:00.041 129.546875 0 15-OCT-2014 8:47:00.041 129.546875 0 15-OCT-2014 8:47:00.041 129.546875 0 15-OCT-2014 8:47:00.041 129.546875 0 15-OCT-2014 8:47:00.041 129.546875 0 15-OCT-2014 8:47:00.041 129.5625 0.015625 15-OCT-2014 8:47:00.048 129.546875 -0.015625 15-OCT-2014 8:47:00.048 129.546875 0 15-OCT-2014 8:47:00.048 129.546875 0 15-OCT-2014 8:47:00.144 129.546875 0 15-OCT-2014 8:47:00.160 129.546875 0 15-OCT-2014 8:47:00.164 129.546875 0 15-OCT-2014 8:47:00.169 129.546875 0 15-OCT-2014 8:47:00.184 129.546875 0 15-OCT-2014 8:47:00.184 129.546875 0 15-OCT-2014 8:47:00.184 129.546875 0 15-OCT-2014 8:47:00.371 129.546875 0 15-OCT-2014 8:47:00.371 129.53125 -0.015625 15-OCT-2014 8:47:00.371 129.546875 0.015625 15-OCT-2014 8:47:00.371 129.546875 0 15-OCT-2014 8:47:00.910 129.546875 0 15-OCT-2014 8:47:00.910 129.546875 0 15-OCT-2014 8:47:00.910 129.546875 0 15-OCT-2014 8:47:00.910 129.546875 0 15-OCT-2014 8:47:00.913 129.546875 0 15-OCT-2014 8:47:00.913 129.546875 0 15-OCT-2014 8:47:00.913 129.546875 0 15-OCT-2014 8:47:00.913 129.546875 0 15-OCT-2014 8:47:00.920 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.017 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.020 129.546875 0 15-OCT-2014 8:47:01.029 129.546875 0 15-OCT-2014 8:47:01.032 129.546875 0 15-OCT-2014 8:47:01.200 129.546875 0 15-OCT-2014 8:47:01.200 129.53125 -0.015625 15-OCT-2014 8:47:01.986 129.546875 0.015625 15-OCT-2014 8:47:01.986 129.546875 0 15-OCT-2014 8:47:01.986 129.546875 0 15-OCT-2014 8:47:01.986 129.546875 0 15-OCT-2014 8:47:01.993 129.546875 0 15-OCT-2014 8:47:01.993 129.546875 0 15-OCT-2014 8:47:01.993 129.546875 0 15-OCT-2014 8:47:01.993 129.546875 0 15-OCT-2014 8:47:01.997 129.546875 0 15-OCT-2014 8:47:01.997 129.546875 0 15-OCT-2014 8:47:01.997 129.546875 0 15-OCT-2014 8:47:01.997 129.546875 0 15-OCT-2014 8:47:01.997 129.546875 0 15-OCT-2014 8:47:01.997 129.546875 0 15-OCT-2014 8:47:01.997 129.546875 0 15-OCT-2014 8:47:02.046 129.546875 0 15-OCT-2014 8:47:02.046 129.546875 0 15-OCT-2014 8:47:02.046 129.546875 0 15-OCT-2014 8:47:02.046 129.546875 0 15-OCT-2014 8:47:02.325 129.546875 0 15-OCT-2014 8:47:02.325 129.546875 0 15-OCT-2014 8:47:02.329 129.53125 -0.015625 15-OCT-2014 8:47:02.329 129.546875 0.015625 15-OCT-2014 8:47:02.552 129.546875 0 15-OCT-2014 8:47:02.552 129.546875 0 15-OCT-2014 8:47:02.552 129.546875 0 15-OCT-2014 8:47:02.552 129.546875 0 15-OCT-2014 8:47:02.552 129.546875 0 15-OCT-2014 8:47:02.552 129.546875 0 15-OCT-2014 8:47:02.552 129.546875 0 15-OCT-2014 8:47:02.552 129.546875 0 15-OCT-2014 8:47:02.552 129.546875 0 ; run;
I added a datetime variable to hold the date and time in one as when your data crosses midnight it is much easier to use one variable, the dttme, than all the coding to go around midnight to get intervals.
So hopefully you can show us 1) how/what you want to indicate as the start of an interval, 2) using the values above indicate the appropriate range for that start interval, and 3) manually calculate some of the values you re indicating.
Since your example data
15-OCT-2014 8:47:00.371 129.546875 0
15-OCT-2014 8:47:00.371 129.53125 -0.015625
15-OCT-2014 8:47:00.371 129.546875 0.015625
15-OCT-2014 8:47:00.371 129.546875 0
shows for time values identical to one thousandth of a second you may have and issue with tie breaking as the cumulative change shown for that time point is 0. I would guess that the program capturing the data is not prepared to report on the actual time or the output field is set up to round.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Many thanks for the clarification and the explanation. I appreciate this. Please note that this data is in millisecond, and the changes in price are not cumulative. We migt have frequently observations at quite similar time. Everytime prices change, I consider this as an event. I then need to determine the StartPrice in the one-second before the price change and the EndPrice in the one-second following the price change. Since the time is in millisecond, I rarely have exactly 1 second before and after observations. So I need to consider the last price within one-second interval period around the price change. For example, the price change at 8:47:01:200 occurs when the price "pt" was 129.53125. For this price change, the StartPrice is determined within the 1-second interval before, i.e. 129.546875 around 8:47:00:200 (8:47:00.184) and the EndPrice is determined one-second after, i.e. 129.456875 around 8:47:02:200 (8:47:02.046).
For every price change I need to determine this in an overlapping window. I then compute three differences: D1=abs(StartPrice-Pt); D2=abs(EndPrice-Pt); D3=abs(StratPrice-EndPrice). Many thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you are only looking at your "event" then perhaps a first step is reducing the data to mostly events such as with:
data intermediate; set have; changedif = dif(changes); if changedif ne 0; drop changedif; run;
Since your data is so long you might use a reduced set such as above to actually show more that a couple seconds and then the by-hand calculations.
One concern is when you say "overlapping" intervals that could mean that you need to create a bunch of variables but without a more concrete example what those variables actual results may look like isn't clear, at least to me.
If you don't use SAS datetime variables much you many not know that they are number (and fractions) of seconds. So there are functions and simple arithmetic to deal with intervals.
The concern about your repeated time values is when you have 2 or more values with the same timestamp how do you pick which one is the StartPrice and Endprice? They will all end up at the same interval difference.
Consider (using very nice values) that you have a price of 10 at time 8:00:00.000 and you have 3 values of price, 9.5,10.1 and 9.8 at time 7:59:59.000. Which of the values at the 7:59:59 time is the start price?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Many thanks for the swift response. At many occasions, I have price changes consecutively and randomly. I cannot do the analysis by regular intervals [1,2], [3,4], but by overlapping windows [1,3], [2,4].
In your example, my start price would be 9.5 - the first price submitted at that time (this is usual way of doing things in the field); the endprice will be the last one submitted at that time. There is no significant changes in prices, as shown in the data. Sometimes, StartPrice is equal to the EndPrice, as prices reverse quickly. Please let me know if this is clearer. Thanks.