All data has a story to tell, but without understanding the data, it can be difficult to uncover the essence of that story. Imagine having a lakeful of data of which you have only limited knowledge, and which you would like to learn more about through data exploration.
If you have already been exposed to SAS Viya, you know that the SAS Viya platform offers a multitude of capabilities for data exploration and data governance.
The spearhead tool for data exploration is SAS Information Catalog, an intuitive tool for learning a lot more about your data at a quick glance. The Catalog plays a key role within the SAS Viya platform, helping users discover, understand, and manage data assets more efficiently. It sports features such as automated data discovery, AI-powered metadata enrichment, a centralized data catalog, and data profiling and quality insights, and it integrates with SAS and other data development tools.
While it’s crucial to gain both numerical and visual information about data, modern data users are also accustomed to getting a written, descriptive representation of it. SAS Information Catalog automatically formulates a textual description of the data and its main attributes in the summary presentation.
Fortunately, the SAS Viya platform is rich with APIs that expose platform functionality to authorized external calls. The APIs cover the Information Catalog and can be called over plain HTTP from a variety of programming languages, Python among them, and of course from SAS’ own PROC HTTP.
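As a rough Python illustration of such a call, here is a minimal sketch using only the standard library. The host name and token are hypothetical placeholders, and the endpoint path mirrors the /catalog/search endpoint discussed in this post; substitute your own environment's values.

```python
import json
import urllib.parse
import urllib.request


def build_search_url(base_uri, term, index="catalog", limit=50):
    """Build the /catalog/search URL, URL-encoding the search term."""
    query = urllib.parse.urlencode({"q": term, "index": index, "limit": limit})
    return f"{base_uri}/catalog/search?{query}"


def search_catalog(base_uri, access_token, term):
    """Call the Information Catalog search API and return the parsed JSON."""
    request = urllib.request.Request(
        build_search_url(base_uri, term),
        headers={"Authorization": f"Bearer {access_token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Usage (hypothetical host and token -- requires a live Viya environment):
# results = search_catalog("https://viya.example.com", "MY_TOKEN", "CARS")
```

Note that `urlencode` takes care of escaping characters in the search term, something the raw query string in a PROC HTTP URL does not do for you.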
Here’s an example of a PROC HTTP REST call to the Information Catalog’s search API, using the /catalog/search endpoint. First, let’s define the file locations for storing the REST API output:
filename respc "&location/get_cat_src.json";
filename resphdrc "&location/get_cat_src.txt";
%let BASE_URI = %sysfunc(getoption(servicesbaseurl));
%put NOTE: &=BASE_URI;
Then execute the actual PROC HTTP call, with the macro variable &SRC denoting the search term:
/* %nrstr masks the ampersands so they are not read as macro triggers */
proc http
  url="&BASE_URI/catalog/search?q=&SRC.%nrstr(&)index=catalog%nrstr(&)limit=50"
  method="GET"
  out=respc
  headerout=resphdrc
  headerout_overwrite
  verbose;
  headers "Authorization"="Bearer &ACCESS_TOKEN";
run;
Finally, assign a JSON libname to the respc file to access it like a regular SAS table:
libname cat1 json "%sysfunc(pathname(respc))";
Once the JSON file has been parsed, the result will look something like this:
The score denotes how well each result row matches the given search term.
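Since every result row carries such a score, the rows can also be ranked client-side. A small sketch in Python (the field name `score` and the sample rows are my own assumptions about the parsed output, not actual Catalog data):

```python
def rank_results(rows, score_field="score"):
    """Sort search-result rows by their match score, best match first.

    The key name "score" is an assumption about the parsed JSON;
    adjust it to whatever key your output actually uses.
    """
    return sorted(rows, key=lambda row: row.get(score_field, 0), reverse=True)


# Illustrative rows only -- not real Catalog output.
sample = [
    {"name": "CARS", "score": 7.1},
    {"name": "CARS_DEMO", "score": 12.4},
]
# rank_results(sample) puts the highest-scoring row first.
```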
For further reference, the Catalog APIs are documented here.
I believe that APIs are super powerful in making different tools speak to each other. But what if you weren’t limited to the exact query syntax that APIs require? Humans are ultimately more comfortable speaking in natural language, and many would appreciate a more flexible way to query data conversationally, as we would with each other. Bogdan Teleuca explains this in his blog Conversing with Data: Turning Queries into Conversations with SAS Viya, Azure OpenAI and LangChain.
Text-to-text interaction, powered by Large Language Models (LLMs), has shown tremendous capabilities in handling unstructured data. When it comes to structured data, LangChain enters the fray, making it a breeze to interact with data. The amalgamation of SAS Viya, Azure OpenAI, LangChain, and Python programming is opening new dimensions in data interaction. It's not just about querying anymore; it's about conversing, transforming the way we engage with our data.
But what does this have to do with data exploration and finding your data assets in a simple and reliable way? It is the application of LLMs to one or several agents that makes the AI magic happen. As it happens, Bogdan Teleuca has also written a blog on combining LLM agents to translate natural-language queries into API query syntax that SAS Viya, and especially the Information Catalog, can understand. In his blog post Using an LLM to Query Catalogued Assets, Bogdan describes the process and also compares the internal search capability of the Information Catalog to that of an intelligent LLM agent, specifically GPT-4 from Azure OpenAI.
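Bogdan’s blog covers the full agent setup; purely as an illustration of the translation step, one could prompt an LLM with a template along these lines. The prompt wording and helper name below are my own sketch, not taken from his implementation:

```python
# Illustrative prompt template for translating a natural-language question
# into Catalog search parameters; a real agent (e.g. built with LangChain)
# would send the filled-in prompt to the model.
PROMPT_TEMPLATE = """You translate natural-language data questions into
SAS Information Catalog search parameters.

Return only a query string of the form: q=<term>&index=catalog&limit=<n>

Question: {question}
"""


def build_translation_prompt(question):
    """Fill the template with the user's natural-language question."""
    return PROMPT_TEMPLATE.format(question=question)
```

The model’s reply would then become the query string fed to the /catalog/search call shown earlier.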
Modern LLMs help with this by examining the data and offering a conversational way to query it. The key question is how to securely combine the power of an LLM with a data exploration tool. This linkage would need to utilize the powerful cloud-based LLM models while remaining secure, so as not to divulge the data into the public domain. For this we could employ intelligent agents to handle our data conversation. Now I’m just letting my vision fly, but to cover this scenario there could be three types of agents:
The identification agent first decides whether the query is retrieval or conversational. Retrieval queries translate the human-language request into syntax understood by the Information Catalog search API. They can be used to parametrize the data searches that normally take place in the traditional query input of SAS Information Catalog. An example would be:
“Find me all resources that have the word “CARS” in the table name AND contain at least 428 rows of data”
Conversational queries interact with the user to answer further questions on specific data. For example:
“(Ok, you found the data…) Now tell me can the Model column be used as a unique identifier and is it missing any values?”
Looking to the near future, this is something that SAS Data Management R&D is currently investigating. A data exploration tool that allows conversational data exploration in a reliable and secure manner helps users understand and locate their data. Having this power at your fingertips would definitely help uncover the beautiful story hidden in your data!