<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Event stream processing - Best practice in Streaming Analytics</title>
    <link>https://communities.sas.com/t5/Streaming-Analytics/Event-stream-processing-Best-practice/m-p/450577#M38</link>
    <description>&lt;P&gt;&lt;SPAN&gt;Hi,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;The short answer is that there are different strategies, depending on exactly what is done with the data and on the required latency.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Some important questions: Why is the data kept for such a long retention time? Is it for reference lookup? Pattern detection? Or rolling aggregations? I would guess it is for rolling aggregation. Then the next question is the step granularity of the larger rolling aggregations (7 and 30 days): is it per event or per day? If it is per day, the best option would probably be cascading aggregation, using copy/aggregation and stateful/stateless sequences wisely so that we keep in memory only the events for the last day and the aggregated values for the weeks and months.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;But of course, it also depends on the type of aggregation functions used. Do you require granularity at the event level, or can you accommodate aggregating from already aggregated levels?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;On the other hand, if there is no need for low latency, using a persistent store such as a fast database could be a good solution, but then there are more effective ways than a join for doing this. I would basically use a procedural window, unless we can accommodate a much higher latency (&amp;gt; a few seconds) and are not limited by the DB read/write asynchronicity. But we would then also need more details about what data processing is required to define the best approach.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Hope this helps.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Fred&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 03 Apr 2018 11:31:09 GMT</pubDate>
    <dc:creator>FredCombaneyre</dc:creator>
    <dc:date>2018-04-03T11:31:09Z</dc:date>
    <item>
      <title>Event stream processing - Best practice</title>
      <link>https://communities.sas.com/t5/Streaming-Analytics/Event-stream-processing-Best-practice/m-p/409887#M27</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;I have a problem with an ESP flow implementation that uses a lot of RAM. As input, I have a large number of raw records that are accumulated at daily, 7-day and 30-day levels in real time, and as a result we hold a huge amount of data. My question is whether to use stateful windows for these accumulated datasets or a persistent database (for example, an in-memory DB). I would appreciate some best practices based on your experience with ESP: which approach would suit this use case best? I am also considering that a persistent database between processing steps could help with failover if the whole ESP goes down.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;BR /&gt;Darko&lt;/P&gt;</description>
      <pubDate>Thu, 02 Nov 2017 15:41:34 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Streaming-Analytics/Event-stream-processing-Best-practice/m-p/409887#M27</guid>
      <dc:creator>darkomarjanovic</dc:creator>
      <dc:date>2017-11-02T15:41:34Z</dc:date>
    </item>
    <item>
      <title>Re: Event stream processing - Best practice</title>
      <link>https://communities.sas.com/t5/Streaming-Analytics/Event-stream-processing-Best-practice/m-p/450577#M38</link>
      <description>&lt;P&gt;&lt;SPAN&gt;Hi,&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;The short answer is that there are different strategies, depending on exactly what is done with the data and on the required latency.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Some important questions: Why is the data kept for such a long retention time? Is it for reference lookup? Pattern detection? Or rolling aggregations? I would guess it is for rolling aggregation. Then the next question is the step granularity of the larger rolling aggregations (7 and 30 days): is it per event or per day? If it is per day, the best option would probably be cascading aggregation, using copy/aggregation and stateful/stateless sequences wisely so that we keep in memory only the events for the last day and the aggregated values for the weeks and months.&lt;/SPAN&gt;&lt;/P&gt;
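&lt;P&gt;&lt;SPAN&gt;To illustrate the cascading idea, here is a minimal sketch in plain Python. It is not ESP code, and the class and method names are hypothetical. It assumes the aggregation is a sum, that days are given as integer indexes, and that events arrive in day order. Only the current day is held at event level; each closed day is reduced to a single value from which the 7-day and 30-day totals are derived. Because the example uses a sum, even the current day collapses to a running total.&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;
```python
from collections import deque


class CascadingSum:
    """Hypothetical sketch: event-level state for today, one total per closed day."""

    def __init__(self, window_days=(7, 30)):
        self.window_days = window_days
        self.current_day = None
        self.current_total = 0.0
        # One pre-aggregated value per closed day, bounded by the largest window.
        self.closed_days = deque(maxlen=max(window_days) - 1)

    def add(self, day, value):
        """Add one event; `day` is an integer day index assumed not to decrease."""
        if self.current_day is None:
            self.current_day = day
        # Roll over once per elapsed day; an out-of-order (older) day gives an
        # empty range, so such an event is simply counted in the current day.
        for _ in range(day - self.current_day):
            self.closed_days.append(self.current_total)
            self.current_total = 0.0
            self.current_day += 1
        self.current_total += value

    def totals(self):
        """Return rolling sums keyed by window length in days (1 means today)."""
        closed = list(self.closed_days)
        result = {1: self.current_total}
        for n in self.window_days:
            keep = min(n - 1, len(closed))
            result[n] = self.current_total + sum(closed[len(closed) - keep:])
        return result


agg = CascadingSum()
agg.add(0, 10.0)
agg.add(0, 5.0)
agg.add(1, 2.0)
print(agg.totals())
```
&lt;/PRE&gt;
&lt;P&gt;&lt;SPAN&gt;With this layout the memory footprint is bounded by the number of days in the largest window rather than by the number of events. It only works when the function can be combined from partial results (sum, count, min, max); something like an exact median or a distinct count cannot be rebuilt from daily values this way.&lt;/SPAN&gt;&lt;/P&gt;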
&lt;P&gt;&lt;SPAN&gt;But of course, it also depends on the type of aggregation functions used. Do you require granularity at the event level, or can you accommodate aggregating from already aggregated levels?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;On the other hand, if there is no need for low latency, using a persistent store such as a fast database could be a good solution, but then there are more effective ways than a join for doing this. I would basically use a procedural window, unless we can accommodate a much higher latency (&amp;gt; a few seconds) and are not limited by the DB read/write asynchronicity. But we would then also need more details about what data processing is required to define the best approach.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Hope this helps.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Fred&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 03 Apr 2018 11:31:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/Streaming-Analytics/Event-stream-processing-Best-practice/m-p/450577#M38</guid>
      <dc:creator>FredCombaneyre</dc:creator>
      <dc:date>2018-04-03T11:31:09Z</dc:date>
    </item>
  </channel>
</rss>

