Data cleaning is often the biggest bottleneck in modern AI workflows, consuming most of a project's time before modeling even begins.
This session introduces an autonomous approach that pairs SAS Viya’s statistical profiling with LLM-based reasoning to diagnose data quality issues and recommend structured, auditable cleaning actions. The design separates detection from reasoning to improve privacy and reliability, logs every recommended action for traceability, and validates recommendations (including human-in-the-loop review) to keep outputs trustworthy and production-ready. This practical walkthrough shows how structured JSON decisions can drive repeatable execution and clearer reporting on what was cleaned, what remains unresolved, and why.
Presenter: Urvi Mehta, University of Michigan
Watch the recording
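
The flow described in the abstract (profile first, reason over the profile only, validate the structured decisions, then execute and log them) can be sketched roughly as follows. This is an illustrative Python sketch, not the presenter's implementation: the `profile` function stands in for SAS Viya's statistical profiling, the `llm_response` string stands in for a real model call, and the action names, JSON field names, and sample rows are all hypothetical.

```python
import json
from statistics import median

# Hypothetical allow-list of cleaning actions the executor knows how to run.
ALLOWED_ACTIONS = {"impute_median", "drop_column", "flag_for_review"}


def profile(rows):
    """Detection step: return summary statistics only, never raw values."""
    columns = sorted({key for row in rows for key in row})
    stats = {}
    for col in columns:
        missing = sum(row.get(col) is None for row in rows)
        stats[col] = {"missing_rate": missing / len(rows)}
    return stats


def validate(decisions, stats):
    """Keep only decisions that name a known column, an allowed action, and a reason."""
    accepted, rejected = [], []
    for d in decisions:
        ok = (
            d.get("column") in stats
            and d.get("action") in ALLOWED_ACTIONS
            and bool(d.get("reason"))
        )
        (accepted if ok else rejected).append(d)
    return accepted, rejected


def execute(rows, decisions):
    """Apply accepted decisions to a copy of the data and build an audit log."""
    cleaned = [dict(row) for row in rows]
    audit = []
    for d in decisions:
        col, action = d["column"], d["action"]
        status = "unresolved"  # default, e.g. for flag_for_review
        if action == "impute_median":
            present = [r[col] for r in cleaned if r.get(col) is not None]
            if present:
                fill = median(present)
                for r in cleaned:
                    if r.get(col) is None:
                        r[col] = fill
                status = "applied"
        elif action == "drop_column":
            for r in cleaned:
                r.pop(col, None)
            status = "applied"
        audit.append({**d, "status": status})
    return cleaned, audit


# Made-up sample data for illustration.
rows = [
    {"age": 34, "notes": None},
    {"age": None, "notes": None},
    {"age": 40, "notes": "ok"},
    {"age": 29, "notes": None},
]

stats = profile(rows)  # only this summary would be sent to the model

# Stand-in for the model's structured JSON reply (no real LLM is called here).
llm_response = json.dumps({
    "decisions": [
        {"column": "age", "action": "impute_median", "reason": "25% missing, numeric"},
        {"column": "notes", "action": "flag_for_review", "reason": "75% missing"},
        {"column": "age", "action": "delete_everything", "reason": "not an allowed action"},
    ]
})

decisions = json.loads(llm_response)["decisions"]
accepted, rejected = validate(decisions, stats)
cleaned, audit = execute(rows, accepted)

print(json.dumps(audit, indent=2))
```

In this sketch the model only ever sees `stats`, which is one way to read the privacy benefit of separating detection from reasoning. The `audit` list records each decision with its reason and a status, so a report can state what was applied and what was left for human review, while `rejected` captures recommendations that failed validation.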