LLM Custom Step Generator in SAS Studio
- Article History
- RSS Feed
- Mark as New
- Mark as Read
- Bookmark
- Subscribe
- Printer Friendly Page
- Report Inappropriate Content
Imagine a custom step in a SAS Studio flow that generates another custom step to handle specific tasks. A custom step is a file, therefore why not use a large language model (LLM) like GPT-4o from Azure OpenAI to generate one? Today, we’ll explore how to leverage this technology to create custom steps for data processing, using Python or SAS logic.
Overview: The LLM Custom Step Generator
The LLM - Custom Step Generator is a custom step that uses Azure OpenAI's GPT-4o to create fully functional custom steps. These steps can be used in SAS Studio flow to perform specific tasks, such as data anonymization, merging tables, or generating detailed documentation.
The custom step is in the process of being published to the SAS Studio Custom Steps Public GitHub Repository. We will publish an update when it's done. Thank you for your patience.
The process is simple:
- Define the logic of the custom step through a detailed prompt.
- Provide the required environment and configuration files to establish a connection with your Azure OpenAI service and ground the model.
- Run the generator to output the custom step code.
The result? A fully functional custom step tailored to your specific requirements.
Watch the video demonstration to find out more.
- Chapters
- descriptions off, selected
- captions settings, opens captions settings dialog
- captions off, selected
This is a modal window.
Beginning of dialog window. Escape will cancel and close the window.
End of dialog window.
This is a modal window. This modal can be closed by pressing the Escape key or activating the close button.
Example 1: Custom Step Anonymizing Personal Data in a Text File
You have a text file containing sensitive personal information, such as names, email addresses, and physical addresses. Your goal is to anonymize this data using a custom step.
Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.
Steps to Generate the Custom Step
- Define the Inputs:
- Prompt: Write a detailed description of the custom step logic. For instance:
Create a custom step that reads an input file containing PIIs, identifies the data containing PIIs and anonymizes it. 1. input is a csv file input.csv. 2. output is output.csv. 3. the logic is written in Python. Provide the Prompt UI, the program and the full step file.
- .env File: This file contains the environment variables needed to access the Azure OpenAI API.
Example:
AZURE_OAI_ENDPOINT='https://my_endpoint.openai.azure.com/' # change my_endpoint
AZURE_OAI_KEY='my_api_key' # change my_api_key
AZURE_OAI_DEPLOYMENT='gpt-4o'
- System Message File: A file providing context to the LLM about what a custom step is, along with examples and guidelines. This is really the “secret sauce in the recipe”. In a nutshell it describes what a custom step is, provides a few Prompt UI, Programs written in SAS and Python and tweaks the output with important instructions.
- Define the Output Location: Specify where the generated step code should be saved.
- Run the Generator: Execute the LLM - Custom Step Generator. The process takes about 15–30 seconds, depending on the complexity of the prompt.
- Review the Output:
- The generator produces a file containing the custom step code.
- This file includes the Python logic for anonymizing personal data, along with a prompt UI for configuring inputs and outputs.
- Test the Custom Step:
- Save the generated code as a .step file.
-
- Add the custom step to a workflow, select the input text file, and specify the output file.
- Run the workflow and verify the results.
The custom step successfully anonymized the text file by replacing sensitive information with hashed values or other anonymized strings. The process demonstrated the accuracy of the LLM in generating Python logic based on the provided instructions.
Example 2: Classical Data Management with SAS Logic
You need to perform classic data management tasks, such as merging two tables, calculating the top-selling product per month, and creating a summary table of top products.
Steps to Generate the Custom Step
- Define the Inputs:
-
- Prompt: Provide a detailed description of the logic:
Create a custom step using SAS logic. The step has two table inputs, for example SASDM.PRDSAL2 and SASDM.PRDSAL3. The logic will merge the two tables. Then it will summarize the product sales by YEAR, MONTH, PRODUCT and sum up the ACTUAL sales. It will then create another data set NATIONAL_SALES in SASDM listing by YEAR, MONTH create a new column CHAMPION_PRODUCT equal with the top selling product.
-
- Use the same .env and system message files as in the first example.
- Run the Generator:
Execute the generator with the updated prompt.
- Review the Output:
-
- The generator produces a .step file containing the SAS logic for the specified tasks.
- The step includes input configurations for the source tables and an output configuration for the result table.
- Test the Custom Step:
- Add the generated step to a workflow.
- Configure the input and output tables.
- Run the workflow and verify the results.
- Results
The custom step successfully merged the tables, calculated the top-selling products, and created the summary table. The SAS logic was accurate and aligned with the prompt instructions.
Key Components
To replicate these examples, ensure you have the following:
- Azure OpenAI Resource: Create an Azure OpenAI resource and deploy a model like GPT-4o.
- Environment Configuration (.env File): Include the necessary API endpoint, key, and deployment name in a .env file.
- System Message File: Provide detailed instructions and examples to guide the LLM in generating the custom step.
- Python Dependencies: Install python-dotenv and requests to handle environment variables and API requests.
Why Use the LLM Custom Step Generator?
- Efficiency: Automates the creation of custom steps, saving time and effort.
- Flexibility: Supports multiple programming languages, including Python and SAS.
- Accuracy: Leverages the power of GPT-4o to generate precise and (in most of the instances) functional code.
- Scalability: Can be used to generate a wide range of custom steps for various tasks.
Where to Find the Custom Step
The custom step is in the process of being published to the SAS Studio Custom Steps Public GitHub Repository. We will publish an update when it's done. Thank you for your patience.
Conclusion
The ability to generate custom steps using Azure OpenAI GPT-4o opens up new possibilities for data management, automation, and innovation. Whether you’re anonymizing data, managing tables, or documenting workflows, the LLM - Custom Step Generator provides a powerful and flexible solution.
The examples presented here are just the beginning—what custom steps will you create next?
Thank you for exploring this exciting technology with me. Stay tuned for more innovations in Data Management with SAS Viya.
Thank you for your time reading this post. If you liked the post, give it a thumbs up! Please comment and tell us what you think about the LLM custom step generator. If you wish to get more information, please write me an email.
Find more articles from SAS Global Enablement and Learning here.