Hello everyone,
My team is looking for proven methods to optimize and refactor a large volume of legacy SAS code that is causing performance issues on our platform. We've found that a purely manual approach is too time-consuming, and our initial attempt to build an LLM-based SAS "Copilot" as a fully automated solution didn't succeed because LLMs are not yet mature enough to optimize SAS code.
We're now hoping to find AI tools that can assist in this process. Has anyone in the SAS community successfully used an AI-driven tool or workflow to:
Analyze and identify inefficient SAS code patterns at scale?
Generate optimized or refactored code from legacy SAS scripts?
Validate that the output of the new code matches the original?
We're not looking for a magic button, but for a practical, effective method that leverages AI to make the refactoring process more efficient. Any insights, examples, or recommendations for specific tools would be incredibly helpful.
Thank you!
Is performance the only issue you are trying to address here or are there other objectives?
Have you explored optimising your SAS installation settings as well? This can really give you great bang for buck. As an example, turning on SAS dataset compression by default may improve IO a lot, reduce disk usage, and hence reduce program run times without any coding changes at all.
I'd also be prepared to bet that only a small percentage of your programs are the really badly performing ones. If you haven't done so already, you need to do SAS log analysis to identify these. You may find that your code optimisation only needs to focus on, say, the 10% of worst-performing programs: those that run for hours and chew up CPU, IO and memory. Only comprehensive SAS log analysis will identify these; AI code analysis won't.
You might even find that, by focussing on the worst of the worst, a combination of manual and AI-generated improvements does the trick.
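As a rough illustration of the log analysis suggested above, here is a minimal Python sketch that sums the "real time" reported for each DATA step and procedure in a set of SAS logs and ranks the programs by total elapsed time. It assumes the usual log layout (a "NOTE: ... used (Total process time):" line followed by an indented "real time" line, with values such as 0.50 seconds, 1:02.35 or 1:02:03.45). The function names and the dictionary of logs are hypothetical, so adjust the patterns to whatever your own logs look like.

```python
import re

# Assumed log layout: a step header note followed by an indented "real time" line.
STEP_RE = re.compile(r"^NOTE: (DATA statement|PROCEDURE \w+) used")
REAL_RE = re.compile(r"^\s+real time\s+([\d:.]+)")


def to_seconds(value):
    """Convert '0.50', '1:02.35' or '1:02:03.45' to seconds."""
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def step_times(log_text):
    """Return (step_name, real_seconds) pairs in the order they appear."""
    steps, current = [], None
    for line in log_text.splitlines():
        header = STEP_RE.match(line)
        if header:
            current = header.group(1)
            continue
        real = REAL_RE.match(line)
        # Only count a "real time" line that follows a step header, so the
        # session total under "NOTE: The SAS System used:" is not double counted.
        if real and current is not None:
            steps.append((current, to_seconds(real.group(1))))
            current = None
    return steps


def rank_programs(logs):
    """logs maps program name -> log text; returns programs, slowest first."""
    totals = {name: sum(t for _, t in step_times(text))
              for name, text in logs.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Feeding rank_programs a dictionary built from your log directory gives a quick shortlist of the programs worth a closer look, and step_times then shows which steps inside each one dominate.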
Assuming you're running whole flows with multiple programs, the traditional approach would be to determine the critical path, identify the longest-running program on this critical path, and within that program the longest-running steps. You can often get quick wins just by tweaking these long-running steps.
Run your programs with the FULLSTIMER system option (options fullstimer;), which writes additional resource usage information to the log that's important for choosing a good performance tuning approach.
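To make the critical-path idea concrete, here is a small Python sketch under some assumptions: you already have a run time per program (for example from the logs) and you know which programs each one waits for. The critical path is then the dependency chain with the largest total run time. The function name and the program names in the usage note are made up for illustration, and every program that appears as a dependency is assumed to have a duration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(durations, depends_on):
    """Return (path, total_seconds) for the longest dependency chain.

    durations:  program -> run time in seconds
    depends_on: program -> set of programs that must finish first
    """
    graph = {prog: set(depends_on.get(prog, ())) for prog in durations}
    finish, previous = {}, {}
    # static_order() yields each program only after all of its predecessors.
    for prog in TopologicalSorter(graph).static_order():
        # The predecessor that finishes last decides when this program can start.
        latest = max(graph[prog], key=lambda dep: finish[dep], default=None)
        previous[prog] = latest
        start = finish[latest] if latest is not None else 0.0
        finish[prog] = start + durations[prog]
    # Walk back from the program that finishes last of all.
    node = max(finish, key=finish.get)
    total = finish[node]
    path = []
    while node is not None:
        path.append(node)
        node = previous[node]
    return path[::-1], total
```

For example, with durations {"extract": 100, "transform_a": 600, "transform_b": 50, "report": 30}, where both transforms depend on extract and report depends on both transforms, the critical path is extract, transform_a, report. Speeding up transform_b would not shorten the flow at all, which is why the tuning effort belongs on the critical path.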