Hi,
 
The short answer is that there are several strategies, depending on exactly what is done with the data and on the required latency.
Some important questions: why is the data kept with such a long retention time? Is it for reference lookup, pattern detection, or rolling aggregations? I would guess it is for rolling aggregations. The next question is then the step granularity of the larger rolling aggregations (7 and 30 days): do they advance per event or per day? If it is per day, the best option would probably be cascading aggregation, using copy/aggregation and stateful/stateless sequences wisely, so that we only keep in memory the events for the last day plus the aggregated values for the weeks and months.
But of course, it also depends on the type of aggregation functions used. Do you need granularity at the event level, or can you compute the larger aggregates from already aggregated levels?
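
To make the cascading idea above more concrete, here is a minimal sketch in plain Python (not in any streaming engine's own language). It assumes a per-day step, events arriving in day order, and aggregation functions that can be rebuilt from per-day partials (sum, count, mean). The names Partial and CascadingAggregator are hypothetical and only for illustration. As a simplification it keeps a running partial for the current day rather than the raw events of that day.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Partial:
    """Mergeable partial aggregate: enough to rebuild sum, count and mean."""
    total: float = 0.0
    count: int = 0

    def add(self, value):
        self.total += value
        self.count += 1

    def merge(self, other):
        # Returns a new Partial; neither operand is modified.
        return Partial(self.total + other.total, self.count + other.count)

    @property
    def mean(self):
        return self.total / self.count if self.count else None


class CascadingAggregator:
    """Keeps one partial per closed day instead of 30 days of raw events."""

    def __init__(self, horizons=(7, 30)):
        self.current_day = None
        self.today = Partial()
        # Only as many closed days as the largest horizon needs.
        self.closed_days = deque(maxlen=max(horizons) - 1)

    def on_event(self, day, value):
        """day is an integer day index; events are assumed to arrive in day order."""
        if self.current_day is None:
            self.current_day = day
        # Close the current day (and any empty days in between) when the day changes.
        while day > self.current_day:
            self.closed_days.append(self.today)
            self.today = Partial()
            self.current_day += 1
        self.today.add(value)

    def rolling(self, horizon):
        """Aggregate over the current day plus the (horizon - 1) previous days."""
        result = self.today
        recent = list(self.closed_days)[-(horizon - 1):] if horizon > 1 else []
        for partial in recent:
            result = result.merge(partial)
        return result


agg = CascadingAggregator()
for day in range(40):
    agg.on_event(day, 1.0)
    agg.on_event(day, 3.0)
week, month = agg.rolling(7), agg.rolling(30)
```

With this layout the memory needed for the 7 and 30 day results is bounded by the number of days, not the number of events. It only works when the function can be merged from partials: an exact median or distinct count cannot be rebuilt from per-day values this way, which is why the question about the aggregation functions matters.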
On the other hand, if there is no need for low latency, a persistent store such as a fast database could be a good solution, but then there are more effective ways of doing this than a join. I would basically use a procedural window, unless we can accommodate a much higher latency (more than a few seconds) and are not limited by the asynchronicity of the DB reads/writes. But we would then also need more details about the data processing required in order to define the best approach.
 
Hope this helps
Fred