It happens almost daily on the Belgian railway network: a train has to slow down because an animal is too close to the tracks, or because vegetation has grown too close to the line. Such situations may seem trivial at first glance, but in reality they trigger an entire chain of actions, from the initial notification to the full restoration of normal traffic flow. At Infrabel, the infrastructure manager of the Belgian rail system, such incidents are carefully recorded in incident reports that describe the context, cause, timeline and actions taken. These reports contain valuable information, but the insights they hold are not immediately accessible in their raw textual form.
At the same time, Infrabel holds a wealth of structured operational data covering punctuality, train composition, routes, delays and technical parameters. This data gives rise to essential operational and policy questions, such as “How punctual were trains at Brussels South in 2023?” or “Which delays had the greatest impact on the evening peak?”. Answering them requires bringing together information about incidents, delays and network conditions in a consistent and timely way. Yet obtaining such insights often depended on manual analysis and specialised expertise, making it difficult to respond quickly and confidently when new information was needed.
From this dual reality, a joint innovation project between Infrabel and SAS emerged: exploring how both data streams, unstructured text and structured operational data, could be better leveraged using modern AI techniques and transparent governance. The goal was not to replace existing processes, but to make insights faster, more consistent and more accessible.
As part of this exploration, a prototype was developed in which unstructured incident reports were automatically processed. Through a combination of NLP techniques and generative AI, elements such as incident type, the duration of the restriction, involved objects or conditions and locations were converted into a structured format. What once existed only within narrative text became analysable as a dataset. On top of this, a semantic search layer was built using retrieval-augmented generation (RAG), so that questions about these incidents can be answered with objective data enriched with relevant context from the original reports. This makes it possible to gain clearer insight into questions such as which incident types occur most frequently, how long infrastructure restrictions typically remain in place and which locations are more sensitive to specific types of events. These insights help teams make better-informed decisions, ranging from daily operations to longer-term strategic improvements.
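To make the idea of turning narrative reports into a dataset more concrete, here is a minimal sketch in Python. It is not the prototype's implementation: the keyword rules and regular expressions are a simple stand-in for the NLP and generative AI models, and the `IncidentRecord` schema, the `TYPE_KEYWORDS` lists and the sample reports are all invented for illustration.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncidentRecord:
    """Hypothetical target schema for one structured incident."""
    incident_type: str
    location: Optional[str]
    restriction_minutes: Optional[int]


# Assumed keyword lists; a real system would use NLP models or an LLM instead.
TYPE_KEYWORDS = {
    "animal": ["animal", "deer", "cow", "dog"],
    "vegetation": ["vegetation", "tree", "branch"],
}


def extract_record(report: str) -> IncidentRecord:
    """Pull incident type, location and restriction duration out of free text."""
    text = report.lower()
    incident_type = "other"
    for label, keywords in TYPE_KEYWORDS.items():
        if any(re.search(rf"\b{re.escape(k)}\b", text) for k in keywords):
            incident_type = label
            break
    # Location: capitalised name(s) following the word "near".
    location_match = re.search(r"\bnear ([A-Z][\w-]*(?: [A-Z][\w-]*)*)", report)
    # Duration: a number followed by "min" or "minutes".
    duration_match = re.search(r"(\d+)\s*min", text)
    return IncidentRecord(
        incident_type=incident_type,
        location=location_match.group(1) if location_match else None,
        restriction_minutes=int(duration_match.group(1)) if duration_match else None,
    )


# Invented sample reports, for illustration only.
REPORTS = [
    "Deer spotted near Leuven, speed restriction lifted after 25 minutes.",
    "Overhanging branch near Gent-Sint-Pieters, restriction lasted 40 min.",
    "Cow on the embankment near Namur, restriction of 15 minutes.",
]

records = [extract_record(r) for r in REPORTS]

# Once structured, the reports can be aggregated like any other dataset.
type_counts = Counter(r.incident_type for r in records)
```

Once every report is reduced to a record like this, questions such as "which incident types occur most frequently" become simple aggregations, while the original text remains available as context for retrieval.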
In parallel, the project explored how questions about structured operational data could be made more accessible. Whereas employees previously depended on manually written queries, an experimental workflow was tested in which an AI assistant converts natural language questions into a reproducible and verifiable query of the underlying datasets. As a result, employees without technical query skills can still obtain direct answers to operational or policy‑oriented questions. Questions about punctuality, delay impacts and broader operational trends can now be answered not only faster but also in a consistent way. Every step, from interpreting the question to producing the final dataset, remains transparent and repeatable. This creates low‑threshold access to insights that were previously time‑consuming or required specialist knowledge.
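The following Python sketch illustrates what "a reproducible and verifiable query" can look like in practice. It is only an illustration under stated assumptions: the `plan_query` function is a rule-based stand-in for the AI assistant, the `arrivals` table, its columns and the sample rows are invented, and the on-time threshold is an assumed parameter rather than a definition taken from the project.

```python
import re
import sqlite3
from dataclasses import dataclass

# Assumptions for this sketch: station list and on-time threshold (in minutes).
KNOWN_STATIONS = ["Brussels South", "Antwerp Central"]
ON_TIME_THRESHOLD_MIN = 6


@dataclass
class QueryPlan:
    """The query and its parameters, kept so a reviewer can inspect and rerun them."""
    sql: str
    params: tuple


def plan_query(question: str) -> QueryPlan:
    """Stand-in for the AI assistant: map a question to a parameterised query."""
    lowered = question.lower()
    station = next((s for s in KNOWN_STATIONS if s.lower() in lowered), None)
    year = re.search(r"\b(20\d{2})\b", question)
    if "punctual" not in lowered or station is None or year is None:
        raise ValueError("Question not supported by this sketch")
    sql = (
        "SELECT ROUND(100.0 * SUM(delay_minutes < ?) / COUNT(*), 1) "
        "FROM arrivals "
        "WHERE station = ? AND strftime('%Y', arrival_date) = ?"
    )
    return QueryPlan(sql, (ON_TIME_THRESHOLD_MIN, station, year.group(1)))


def run_plan(conn: sqlite3.Connection, plan: QueryPlan) -> float:
    """Execute the plan; the same plan always yields the same result on the same data."""
    return conn.execute(plan.sql, plan.params).fetchone()[0]


# Invented sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE arrivals (station TEXT, arrival_date TEXT, delay_minutes INTEGER)"
)
conn.executemany(
    "INSERT INTO arrivals VALUES (?, ?, ?)",
    [
        ("Brussels South", "2023-03-01", 2),
        ("Brussels South", "2023-03-02", 10),
        ("Brussels South", "2023-06-10", 0),
        ("Brussels South", "2023-11-05", 4),
        ("Brussels South", "2022-05-01", 30),
        ("Antwerp Central", "2023-03-01", 7),
    ],
)

plan = plan_query("How punctual were trains at Brussels South in 2023?")
punctuality_pct = run_plan(conn, plan)
```

Returning the `QueryPlan` alongside the answer is the point of the design: the person asking sees a number, while a reviewer can read the exact SQL and parameters that produced it and rerun them.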
At the same time, the project highlighted the need for a deliberate way to coordinate the different AI components. In this context, deterministic agents were chosen: mechanisms that guide each step of the process in a predictable and transparent manner, always with a human in the loop to safeguard domain expertise and reliability. To ensure everything functions coherently, an orchestration mechanism is required to keep the interaction between models, data sources and interpretative logic consistent and traceable. In this setup, the SAS Retrieval Agent Manager acts as the coordinating layer, ensuring that these components operate together in a controlled and auditable way. For certain types of challenges, more autonomous agents may be appropriate as well, but in this case the controlled approach offered the right balance between efficiency, stability and trust.
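As a rough illustration of the deterministic, human-in-the-loop approach described above, the Python sketch below runs a fixed sequence of steps, asks a reviewer to approve each result and records everything in an audit log. It does not represent the SAS Retrieval Agent Manager or its API; the `Pipeline` class, the step names and the `auto_approve` reviewer are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A step is a name plus a function that transforms the payload.
Step = Tuple[str, Callable[[str], str]]


@dataclass
class Pipeline:
    """Runs steps in a fixed order, with a review gate and an audit trail."""
    steps: List[Step]
    approve: Callable[[str, str], bool]  # reviewer callback: (step name, output)
    audit_log: List[Dict[str, object]] = field(default_factory=list)

    def run(self, payload: str) -> str:
        for name, step in self.steps:
            result = step(payload)
            approved = self.approve(name, result)
            # Every step is logged, whether or not it was approved.
            self.audit_log.append(
                {"step": name, "input": payload, "output": result, "approved": approved}
            )
            if not approved:
                raise RuntimeError(f"Step '{name}' rejected by reviewer")
            payload = result
        return payload


def auto_approve(step_name: str, output: str) -> bool:
    """Placeholder reviewer; a real setup would ask a domain expert."""
    return True


pipeline = Pipeline(
    steps=[("trim", str.strip), ("normalise", str.lower)],
    approve=auto_approve,
)
final = pipeline.run("  Punctuality Brussels South 2023  ")
```

Because the order of steps is fixed and each intermediate result is logged, the same input always follows the same path, and a rejected step stops the run instead of letting an unreviewed result flow downstream.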
This project shows how new analytical techniques deliver real value when they align seamlessly with existing decision-making processes. The balance between innovation and operational control is crucial, especially in an environment where reliability and safety are paramount. When technology is applied in the right way, it becomes a powerful lever to unlock knowledge faster, reveal patterns more clearly and support teams in their daily work.
To learn more about Infrabel's use case, check out their Hacker's Hub profile.