AI Code/Data Integration Assistant in Real Time and post process analysis (SAS Guru)

Anacreo · ‎01-03-2024

Microsoft does have its Copilot, and there is IntelliSense (code suggestions as you code), but I think there can be something more profound.

Let's call it SAS Guru.

As you are coding in the various interfaces (SAS EG or SAS Studio), I think you could score the code (maybe even using some statistics!). When the score flags code that looks inefficient or hard to maintain (repetition of terms, extremely short variable names, lack of comments, lack of coding structures, long procedural-style code, etc.), you could send the badly scored pieces in real time to a generative AI sidebar that makes contextual suggestions for code enhancements.

"SAS Guru: It appears that this function is doing an average payment calculation. Consider renaming variable "c" to "cAvgPmt" for clarity."

"SAS Guru: This code could be rewritten to enhance maintainability by creating variables for these values and rewriting it with a %do loop. See this rewritten code suggestion...

%do ...

"

Similarly, when you use a data access method or write a PROC SQL step or DATA step, you could send the query or step to the AI to suggest enhancements. Because the assistant is integrated, it could also send the underlying table configuration in real time to get the best possible answer, i.e. the data types, the database interface, the table structure, the context of those structures (to suggest in-database analytics), and what is indexed. These are things a non-integrated AI wouldn't be able to glean on its own. In the same vein, it can know when you're merging data between data sources and suggest more optimal ways of doing it.

When code runs, you could use AI to give an analysis of the log results or the overall code.

(Build information about iterative run and present back that AI generated analysis)

"SAS Guru: This run was 18 minutes (10%) faster than your last run on December 3rd at 6am. All successful runs to date have taken between 1 hour and 1.5 hours. Your fastest runs occurred between 5pm and 6pm, and your slowest runs occurred in the mornings.

Your overall program took 45 minutes to run. The 4th procedure (PROC SQL - ...) took the longest (40m). It brought in 100 columns and 1 million records; however, your program only uses 35 of these columns and 20,000 rows.
(using AI to suggest a resolution)
First, consider using mod(account_number,50)=0 in the WHERE clause to build a sample set without importing all the data.

Next, consider updating the SELECT to list only these (...used columns...); this will drop your data import by approximately 70%.

Consider moving your 6th procedure into the "whatever database" using this suggested code rewrite; this will save 6 minutes and reduce your data import by 20%."
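The run-to-run comparison at the start of that message doesn't even need AI. Here is a minimal Python sketch, assuming the run times have already been pulled out of the logs into a list of minutes, oldest first; the function name compare_runs and the message wording are made up for illustration.

```python
def compare_runs(current_min: float, history_min: list[float]) -> str:
    """Compare the current run time with the previous run and the overall range.

    history_min holds earlier run times in minutes, oldest first (assumed input).
    """
    previous = history_min[-1]
    delta = previous - current_min
    pct = 100 * delta / previous
    direction = "faster" if delta >= 0 else "slower"
    return (
        f"This run was {abs(delta):.0f} minutes {direction} than your last run "
        f"({abs(pct):.0f}%); previous runs ranged from "
        f"{min(history_min):.0f} to {max(history_min):.0f} minutes."
    )


print(compare_runs(72, [90, 60, 80]))
```

The AI part would come in afterwards, when turning numbers like these plus the per-step timings into rewrite suggestions.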

I'm a performance guy, so I can see that you could definitely use this to suggest parameters to functions, better functions, etc.

If using AI is a stretch, doing some post analysis/suggestion based on the log (or iterative logs) would be a really nice enhancement.

Patrick · ‎01-03-2024

Based on this discussion it appears SAS is already working on such an AI assistant.
https://communities.sas.com/t5/SAS-Programming/Seeking-Advice-on-AI-Programming-Assistants-for-SAS/m...

Anacreo · ‎01-04-2024

Perhaps this post will provide a generative AI use case that hadn't been thought of. I see Chris's post brilliantly provides a rewrite based on the question, the code that's been shared, and the knowledge the LM has. The experience I had that was more profound was when I used a SQL editor that gave suggestions that blew me away. When I peeked at the code, what it was essentially doing was building a narrative from what it knew about my active connection and sending that to the ChatGPT engine, giving it WAY more context than I would have supplied myself.

So, where someone might write "rewrite this code EXAMPLE CODE to do something", the editor I saw was doing something more like: "<IDE>On an Oracle 6.5.1 system with parallelism 8 with table structures <the whole 9 yards> (including example data if you toggled that on)</IDE> rewrite this code EXAMPLE to do something." In the background, the code I saw (an open source project) even made a preliminary generative AI call asking "what convention of variable naming is mostly being used in (give all the variable names defined)" and then modified the main request to add "with variables in <CamelCase> style".
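To make that concrete, here is a minimal Python sketch of wrapping a request in IDE-supplied context. It is not the code from the editor I saw: the function names, the layout inside the <IDE> tags, and the argument names are all made up, and the naming convention is guessed locally with regular expressions instead of the extra generative AI call described above.

```python
import re
from collections import Counter


def _style_of(name: str):
    """Classify one identifier; returns None when no known style matches."""
    if "_" in name:
        return "snake_case"
    if re.fullmatch(r"[a-z]+(?:[A-Z][a-z0-9]*)+", name):
        return "camelCase"
    if re.fullmatch(r"(?:[A-Z][a-z0-9]+)+", name):
        return "PascalCase"
    return None


def dominant_naming_style(names: list[str]):
    """Most common naming style among the given identifiers, or None."""
    counts = Counter(s for s in map(_style_of, names) if s is not None)
    return counts.most_common(1)[0][0] if counts else None


def build_prompt(code: str, task: str, engine: str,
                 tables: dict[str, list[str]], names: list[str]) -> str:
    """Wrap the user's request in context only the IDE knows about."""
    table_desc = "; ".join(
        f"{table}({', '.join(columns)})" for table, columns in tables.items()
    )
    prompt = (
        f"<IDE>Database: {engine}; table structures: {table_desc}</IDE> "
        f"rewrite this code {code} to {task}"
    )
    style = dominant_naming_style(names)
    if style is not None:
        prompt += f", with variables in {style} style"
    return prompt
```

The resulting string is what would be sent to the model in place of the bare "rewrite this code" request.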

It did so many of these simple but smart things that only an integrated IDE could do, and that really impressed me. You'd be amazed at how much crisper the results are given more details and minutiae.

The other thing that struck me is that it used a scoring mechanism to determine in real time which sections were not maintainable and gave suggestions around that. If your query didn't seem offensive it said nothing, but if your query irked its maintainability score it would simply ask ChatGPT in the background "how to rewrite <CODE> to be more maintainable." When the IDE was taking these active measures to improve my code, that's when I found the experience more profound. There are statistical functions to score the complexity of code and other maintainability metrics.

The data interface suggestions, showing how to optimize data access, would require SAS to describe to a generative AI something that may not be apparent from a simple code analysis, or may require SAS to build or leverage an engine-specific generative AI. Maybe that is not in the current scope of work.

The post-run log analysis would be a different type of AI, one that gives users suggestions about how their code performance has improved or shows when performance is an anomaly. We often get tickets asking why something is so slow, and it turns out they're running on the 3rd of the month and their code is always slow on the 3rd of the month; they just never noticed it. Or they worked late at night when the system had no constraints, and that's why it was fast. It would help resolve a lot of the tickets we deal with.
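The "always slow on the 3rd" case can be caught with plain statistics before any AI gets involved. A minimal Python sketch, assuming run dates and durations have already been extracted from the logs; the function name slow_days_of_month and the 1.5x cutoff are arbitrary choices for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def slow_days_of_month(runs: list[tuple[date, float]],
                       factor: float = 1.5) -> list[int]:
    """Days of the month whose median run time exceeds factor x the overall median.

    runs is a list of (run date, minutes) pairs taken from earlier logs.
    """
    overall = median(minutes for _, minutes in runs)
    by_day = defaultdict(list)
    for run_date, minutes in runs:
        by_day[run_date.day].append(minutes)
    return sorted(
        day for day, values in by_day.items() if median(values) > factor * overall
    )
```

With three months of weekly runs where only the runs on the 3rd take about twice as long, this returns [3], which is the hint the user never noticed.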

I believe it's no mystery that s/he who conquers AI first will win this century.
