<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Collating/summarizing observations over a time interval in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290896#M59809</link>
    <description>They should be summed up to the next full second</description>
    <pubDate>Thu, 11 Aug 2016 07:19:39 GMT</pubDate>
    <dc:creator>MaBo1011</dc:creator>
    <dc:date>2016-08-11T07:19:39Z</dc:date>
    <item>
      <title>Collating/summarizing observations over a time interval</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290886#M59807</link>
      <description>&lt;P&gt;Hi there,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;have a set of trading data that has observations (timed in milliseconds from midnight) but infrequent. To make it a reasonable timeseries I did a do loop&amp;nbsp;to build a grid (one observation per second should be the outcome). Now I want my dataset to fit into the grid (summing up the volumes that happened in that second. I am very thankful for any ideas. See exampel and faulty code below.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Grid:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;Seconds &lt;BR /&gt;3300100&lt;BR /&gt;3300200&lt;BR /&gt;3300300&lt;BR /&gt;.&lt;BR /&gt;.&lt;BR /&gt;
&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Data example:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;ID date miliseconds volume
ABC 01/01/15 3300214 100
ABC 01/01/15 3300300 200
ABC 01/01/15 3300415 300
ABC 01/01/15 4400122 400
ABC 01/01/15 4500022 200
ABC 01/01/15 4500111 400&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Output:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;ID date seconds volume
ABC 01/01/15 3300100 100
ABC 01/01/15 3300200 500
ABC 01/01/15 3300300 400
ABC 01/01/15 3300400 600&lt;BR /&gt;&lt;BR /&gt;(&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;SPAN&gt;time is miliseconds from midnught (just a numerical example that is not correct in terms of summing up)&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;data grid;
do seconds = '09:10't to '16:50't ;
    output ;
  end ;
run;

 proc sql ;
  create table want as
  select a.id, a.date, a.miliseconds, sum(a.volume) as Volume
  from have a
     left join
       grid1 b  on a.id = b.id
                    and a.date = b.date
                    and a.miliseconds = floor(b.seconds)
  group by a.id, a.date, a.miliseconds ;
quit;
run;&lt;/CODE&gt;&amp;nbsp;&amp;nbsp;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Unfortunately does that not work the way I want. Anything I'm missing?&amp;nbsp;&lt;SPAN&gt;So, ID and date couldn't be found, so I tried to create them in the data step. But then ID&amp;nbsp;isn't the same format, which I tried to fix but failed. It seems I have an error when merging the have dataset to the want by matching the miliseconds between the seconds gridlines. &amp;nbsp;Always thankful for input! Best&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Aug 2016 06:38:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290886#M59807</guid>
      <dc:creator>MaBo1011</dc:creator>
      <dc:date>2016-08-11T06:38:05Z</dc:date>
    </item>
    <item>
      <title>Re: Collating/summarizing observations over a time interval</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290895#M59808</link>
      <description>&lt;PRE&gt;
I don't understand why these obs should be the same group ?

ID date miliseconds volume    GROUP
ABC 01/01/15 3300214 100    1
ABC 01/01/15 3300300 200    2
ABC 01/01/15 3300415 300    2
ABC 01/01/15 4400122 400    3
ABC 01/01/15 4500022 200    4
ABC 01/01/15 4500111 400    4

&lt;/PRE&gt;</description>
      <pubDate>Thu, 11 Aug 2016 07:17:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290895#M59808</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2016-08-11T07:17:39Z</dc:date>
    </item>
    <item>
      <title>Re: Collating/summarizing observations over a time interval</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290896#M59809</link>
      <description>They should be summed up to the next full second</description>
      <pubDate>Thu, 11 Aug 2016 07:19:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290896#M59809</guid>
      <dc:creator>MaBo1011</dc:creator>
      <dc:date>2016-08-11T07:19:39Z</dc:date>
    </item>
    <item>
      <title>Re: Collating/summarizing observations over a time interval</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290898#M59810</link>
      <description>&lt;P&gt;Are you sure you're talking milliseconds here, an not "centiseconds"?&lt;/P&gt;
&lt;P&gt;My solution:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input id $ date:mmddyy10. milliseconds volume;
next_full_sec = int((milliseconds-1) / 1000) + 1;
format date mmddyy10. next_full_sec time8.;
cards;
ABC 01/01/15 3300214 100
ABC 01/01/15 3300300 200
ABC 01/01/15 3300415 300
ABC 01/01/15 4400122 400
ABC 01/01/15 4500022 200
ABC 01/01/15 4500111 400
;
run;

proc summary data=have;
by id date next_full_sec;
var volume;
output out=want (drop=_type_) sum=;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 11 Aug 2016 07:34:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290898#M59810</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2016-08-11T07:34:04Z</dc:date>
    </item>
    <item>
      <title>Re: Collating/summarizing observations over a time interval</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290906#M59811</link>
      <description>&lt;P&gt;Hi Kurt, thank you for the code. It seems to work but it doesn't show the seconds without trades (which should show a 0). Also, I really mean miliseconds from midnight compressed in seconds from midnight, thus it should start at 33000 (seconds from midnight, which is 9:10am).&lt;/P&gt;&lt;P&gt;The rest of your code looks really useful and thank you really much so far!&lt;/P&gt;</description>
      <pubDate>Thu, 11 Aug 2016 07:58:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290906#M59811</guid>
      <dc:creator>MaBo1011</dc:creator>
      <dc:date>2016-08-11T07:58:29Z</dc:date>
    </item>
    <item>
      <title>Re: Collating/summarizing observations over a time interval</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290907#M59812</link>
      <description>&lt;P&gt;Then your values are not milliseconds; keep in mind that one second has 1000 milliseconds, so the value for 09:00:00 would be&lt;/P&gt;
&lt;P&gt;9 * 3600 * 1000, which evaluates to 32400000. I guess you are missing a zero at the end of your example data's milliseconds. Or they really are centiseconds.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Aug 2016 08:05:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290907#M59812</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2016-08-11T08:05:15Z</dc:date>
    </item>
    <item>
      <title>Re: Collating/summarizing observations over a time interval</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290908#M59813</link>
      <description>You're right. Still, I want my given millisecond values be summed up onto the full second grid (starting 33000).</description>
      <pubDate>Thu, 11 Aug 2016 08:07:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290908#M59813</guid>
      <dc:creator>MaBo1011</dc:creator>
      <dc:date>2016-08-11T08:07:10Z</dc:date>
    </item>
    <item>
      <title>Re: Collating/summarizing observations over a time interval</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290931#M59820</link>
      <description>&lt;P&gt;This would be my solution to include the grid:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input id $ date:mmddyy10. milliseconds volume;
next_full_sec = int((milliseconds-1) / 1000) + 1;
format date mmddyy10. next_full_sec time8.;
cards;
ABC 01/01/15 33002140 100
ABC 01/01/15 33003000 200
ABC 01/01/15 33004150 300
ABC 01/01/15 44001220 400
ABC 01/01/15 45000220 200
ABC 01/01/15 45001110 400
;
run;

data grid;
do seconds = 1 to 86400;
  output;
end;
run;

proc summary data=have;
by id date next_full_sec;
var volume;
output out=int (drop=_type_) sum=;
run;
&lt;BR /&gt;/* create a reference table for all IDs, dates, and seconds */
proc sql;
create table seconds as
select
  distinct id, date, seconds
  from have, grid
  order by id, date, seconds
;
quit;

data want;
merge
  seconds (in=a rename=(seconds=next_full_sec))
  int (in=b)
;
by id date next_full_sec;
if a;
if not b
then do;
  _freq_ = 0;
  volume = 0;
end;
run;

proc print noobs;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 11 Aug 2016 10:10:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/290931#M59820</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2016-08-11T10:10:28Z</dc:date>
    </item>
    <item>
      <title>Re: Collating/summarizing observations over a time interval</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/291094#M59839</link>
      <description>&lt;P&gt;Kurt, thank you so much for your help so far and it looks like it's nearly done. Now I just get the error:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="line-height: 20px;"&gt;641 proc sql;&lt;BR /&gt;642 create table data.seconds as&lt;BR /&gt;643 select id, date, seconds&lt;BR /&gt;644 from data.test, data.grid&lt;BR /&gt;645 order by id, date, seconds&lt;BR /&gt;646 ;&lt;BR /&gt;NOTE: The execution of this query involves performing one or more Cartesian product joins that can not be optimized.&lt;BR /&gt;ERROR: User asked for termination.&lt;BR /&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;647 quit;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It's the only thing left to solve I guess. I had to terminate the step after ca. 30min as it wouldn't work apparantly (the data I have is not that big...)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Best&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;PS: Yes I googled the Error message but SAS help just mentions to&amp;nbsp;&lt;SPAN&gt;apply the second maintenance release&amp;nbsp;for SAS 9.2. I have 9.4 though&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Aug 2016 22:59:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/291094#M59839</guid>
      <dc:creator>MaBo1011</dc:creator>
      <dc:date>2016-08-11T22:59:02Z</dc:date>
    </item>
    <item>
      <title>Re: Collating/summarizing observations over a time interval</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/291152#M59864</link>
      <description>&lt;P&gt;Hi!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First of all, you missed the "distinct" in the select. This means that the output dataset would have obs(test) * obs(grid), which probably blows up your space in DATA.&lt;/P&gt;
&lt;P&gt;If your datasets are quite large, I recommend to do two steps:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sort
  data=data.test (keep=id date)
  out=data.id_date
  nodupkey
;
by id date;
run;

proc sql;
create table data.seconds as
select id, date, seconds
from data.id_date, data.grid
order by id, date, seconds
;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Now the distinct is not necessary, as that has been dealt with in the first step.&lt;/P&gt;
&lt;P&gt;The NOTE about the cartesian joins is expected; it's exactly what we want to do.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Aug 2016 06:23:30 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Collating-summarizing-observations-over-a-time-interval/m-p/291152#M59864</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2016-08-12T06:23:30Z</dc:date>
    </item>
  </channel>
</rss>

