
Download Metadata and Metrics with SAS Information Catalog APIs from Python

Started ‎02-25-2024 by
Modified ‎02-25-2024 by

Navigating through a sea of data assets can be a daunting task. SAS Information Catalog is your navigator in this journey, allowing you to discover, search, and manage your SAS Viya assets efficiently. When a data asset is discovered, hundreds of metrics are calculated. Imagine having the ability to access these metrics through a simple command, downloading them using a Catalog API. This opens up the possibility of using these rich metrics in custom reports or flows to answer pressing questions.

 

Metadata Levels

 

When you use the SAS Information Catalog REST API to download metadata for data assets, tables, or files, you choose one of three metadata levels:

 

  • dataDictionary: provides basic column-level metadata, such as name and type.
  • dataDictionaryAndProfile: offers extensive metadata that combines both data dictionary and profile metrics.
  • detailedMetrics: gives comprehensive column-level metadata, including patterns and frequency distribution values.

 

The middle level, dataDictionaryAndProfile, strikes an excellent balance between richness and complexity, making it the ideal choice if you need to identify private data, semantic type or classification at a column level.
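
As a rough illustration of where the level fits in a request, the sketch below assembles a query string for the /catalog/instances endpoint used later in this post. The helper build_query and its parameters are assumptions made for this example and are not part of the Catalog API or of the original program. Only the three level names and the filter, level, and limit parameters come from the URLs shown in this post.

```python
# Hypothetical helper for illustration only; not part of the original program.
VALID_LEVELS = ("dataDictionary", "dataDictionaryAndProfile", "detailedMetrics")


def build_query(level, name_prefix=None, limit=10):
    """Return a query string to append to <baseURL>/catalog/instances."""
    if level not in VALID_LEVELS:
        raise ValueError(f"Unknown metadata level: {level}")
    parts = []
    if name_prefix:
        # Same filter form as the startsWith(name,'...') examples in this post.
        parts.append(f"filter=startsWith(name,'{name_prefix}')")
    parts.append(f"level={level}")
    parts.append(f"limit={limit}")
    return "?" + "&".join(parts)


print(build_query("dataDictionaryAndProfile", name_prefix="WATER"))
```

The resulting string could be passed as the second command-line argument of the program described below.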

 

Download Metadata

 

Recorded Demonstration

 

For a more visual understanding of the process, you can watch the following recorded demonstration:

 


 

Get a SAS Viya Access Token

 

For information on obtaining a SAS Viya access token, refer to the previous post Discover Your Data with SAS Information Catalog APIs from Python – Access.

 

Python Program

 

The Python program download_metadata.py retrieves metadata from SAS Information Catalog through its REST API and saves it to a file. The program:

 

  • Imports necessary packages.
  • Constructs Catalog API URL for the request.
  • Retrieves a saved access token.
  • Sends the Catalog API request.
  • Saves the response to a CSV file.

 

Here’s the complete code:

 

# 1. Packages
import sys
import requests
import os

# 2. Arguments: base URL, query string, path to the PEM certificate
if len(sys.argv) != 4:
    sys.exit("Usage: python download_metadata.py <baseURL> <search_query> <pem_path>")
print("Number of arguments:", len(sys.argv), "arguments")
print("Argument List:", str(sys.argv) + '\n')
baseURL = sys.argv[1]
search_query = sys.argv[2]
pem_path = sys.argv[3]

# 3. Construct Variables
url = f'{baseURL}/catalog/instances{search_query}'

print("REST API Inputs:\n")
print('\nYour SAS Viya host is', baseURL)
print('\nDownload metadata specified in the URL: ' + url + '\n')

"""
# URL Examples: Select one of the following - passed as a parameter

# Download dataDictionaryAndProfile metrics, filtered by a name prefix
# url = f"{baseURL}/catalog/instances/?filter=startsWith(name,'WATER')&?filter=contains(type,cas)&level=dataDictionaryAndProfile&limit=10"

# Download detailedMetrics, filtered by type
# url = f"{baseURL}/catalog/instances/?filter=contains(type,cas)&level=detailedMetrics&limit=10"

# Download dataDictionary metadata, filtered by type
# url = f"{baseURL}/catalog/instances/?filter=contains(type,cas)&level=dataDictionary&prefix=simpledownload&limit=100"

"""

# 4. Get the Saved Access Token
directory = 'api'
print(f'\nRetrieving the saved token from {directory}/access_token.txt\n')
with open(f'{directory}/access_token.txt', 'r', encoding='utf-8') as f:
    token = f.read().strip()  # drop a trailing newline so the header stays valid
#print(token)

headers = {'Accept': 'text/csv', 'Authorization': 'Bearer ' + token}

# 5. Send the request and stream the CSV response to api/catalog_download.csv
os.makedirs(directory, exist_ok=True)
with requests.get(url, headers=headers, stream=True, verify=pem_path) as response:
    response.raise_for_status()  # stop on an HTTP error instead of saving the error body
    # Write the bytes as received (this assumes the server returns UTF-8 text)
    with open(f'{directory}/catalog_download.csv', 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)

print('\nThe Catalog Downloaded Metadata was stored for you as ' + directory + '/catalog_download.csv\n')

 

This program demonstrates how to use the SAS Viya REST API to perform a metadata download. It provides a basic example and can be used as a starting point for building more advanced functionality.

 

Run the Program

 

The program expects the following command-line arguments:

 

  • The base URL of the SAS Viya server, including the scheme (for example, https://sas_viya_url).
  • The query string, which sets the metadata level and any additional filters.
  • The path to the PEM file for TLS certificate verification.
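
If you prefer named arguments with built-in help text, the three positional arguments could be declared with the standard argparse module. This is a minimal sketch, not the way the original program reads its arguments: the names base_url, search_query, and pem_path and the function parse_args are chosen for this example, and /path/to/cert.pem is a placeholder.

```python
import argparse


def parse_args(argv=None):
    """Parse the three positional arguments the program expects."""
    parser = argparse.ArgumentParser(
        description="Download metadata from SAS Information Catalog."
    )
    parser.add_argument("base_url", help="Base URL of the SAS Viya server")
    parser.add_argument("search_query", help="Query string with level and filters")
    parser.add_argument("pem_path", help="Path to the PEM file for TLS verification")
    return parser.parse_args(argv)


# Example call with an explicit list; pass no list to read from sys.argv instead.
args = parse_args(
    ["https://sas_viya_url", "?level=dataDictionary&limit=10", "/path/to/cert.pem"]
)
print(args.base_url, args.search_query, args.pem_path)
```

With this approach, running the script with -h prints a usage summary, and a missing argument produces an error message rather than an IndexError.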

 

Metadata Level dataDictionaryAndProfile

 

In a Bash terminal on a Windows machine, you can run the program with command-line arguments:

 

# Certificate on a Windows machine and executable is python
python download_metadata.py https://sas_viya_url "?filter=startsWith(name,'detailed')&?filter=contains(type,cas)&level=dataDictionaryAndProfile&limit=10" "C:\\Users\\myuser\\Downloads\\gelenv_trustedcerts.pem"

 

Running the program with the provided parameters will download the metadata and save it in a CSV file.

 

The query string restricts the download to data assets that meet these criteria:

 

  • The name starts with 'detailed'.
  • The type contains 'cas'. This is broader than just a CAS table: it covers any data asset, whether a table or a file.
  • At most 10 matching data assets are returned (limit=10).
  • The metadata level is dataDictionaryAndProfile.

 

A SAS Viya certificate is used here, in the form of a PEM file. The PEM file was copied to the 'C:\Users\...\' folder.

 

In a Bash terminal on a Linux machine, the above statement would be:

 

# Certificate on a Linux machine and executable is python3
python3 download_metadata.py https://sas_viya_url "?filter=startsWith(name,'detailed')&?filter=contains(type,cas)&level=dataDictionaryAndProfile&limit=10" /home/cloud-user/.certs/gelenv_trustedcerts.pem

 

A SAS Viya certificate is used here, in the form of a PEM file. The PEM file is assumed to be present in the '/home/cloud-user/.certs/' folder.
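
Because the PEM path is handed straight to the verify argument of requests.get, a wrong path only surfaces as an error when the request is sent. A small check before the request can make the problem easier to spot. The function resolve_verify below is a hypothetical helper written for this example, and the temporary file merely stands in for a real certificate bundle.

```python
import os
import tempfile


def resolve_verify(pem_path):
    """Return the PEM path for use as verify=..., or fail early if it is missing."""
    expanded = os.path.expanduser(pem_path)  # allow paths such as ~/certs/ca.pem
    if not os.path.isfile(expanded):
        raise FileNotFoundError(f"PEM file not found: {expanded}")
    return expanded


# Demo with a throwaway file; a real run would pass the certificate path instead.
with tempfile.NamedTemporaryFile(suffix=".pem", delete=False) as tmp:
    demo_path = tmp.name
print(resolve_verify(demo_path) == demo_path)
os.remove(demo_path)
```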

 

Output

You will see many valuable characteristics at a column level, such as:

 

  • The calculated classification.
  • If private data was detected in the column.
  • Languages detected.
  • Sentiment detected.
  • Keywords extracted.
  • Classic profiling measures, etc.

 

02_BT_200_Catalog_APIS_Download_dataDictionaryAndProfile-1-1024x433.png

 

The table metadata, such as table keywords, tags, most important columns, privacy and so on, is repeated for each column.
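
Since the table metadata is repeated on every column row, it can help to group the rows per table before reporting on them. The sketch below does this with the standard csv module. The header names tableName, columnName, and classification and the sample rows are invented for this example; check the header row of your own catalog_download.csv and pass the real field names instead.

```python
import csv
import io
from collections import defaultdict


def group_columns_by_table(csv_text, table_field, column_field):
    """Map each table name to the list of column names found in the CSV text."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row[table_field]].append(row[column_field])
    return dict(grouped)


# Made-up sample; a real download has different headers and many more fields.
sample = (
    "tableName,columnName,classification\n"
    "WATER_CLUSTER,Address,address\n"
    "WATER_CLUSTER,Year,\n"
    "CUSTOMERS,Email,email\n"
)
print(group_columns_by_table(sample, "tableName", "columnName"))
```

For a real file, read its contents with open(..., encoding="utf-8") and pass the text to the function.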

 

Metadata Level dataDictionary

 

In a Bash terminal on a Windows machine, you can run the program with command-line arguments.

 

python download_metadata.py https://sas_viya_url "?filter=startsWith(name,'detailed')&?filter=contains(type,cas)&level=dataDictionary&limit=10" "C:\\Users\\myuser\\Downloads\\gelenv_trustedcerts.pem"

 

Output

You will see simple column metadata. The table metadata is repeated for each column.

 

03_BT_210_Catalog_APIS_Download_dataDictionary-1-1024x433.png

 

Metadata Level detailedMetrics

 

In a Bash terminal on a Windows machine, you can run the program with command-line arguments.

 

python download_metadata.py https://sas_viya_url "?filter=startsWith(name,'detailed')&?filter=contains(type,cas)&level=detailedMetrics&limit=10" "C:\\Users\\myuser\\Downloads\\gelenv_trustedcerts.pem"
 
Output

You will see many detailed values, organized at the column, metric, and value level.
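
A column, metric, and value layout is often easier to work with once it is pivoted into one record per column. The sketch below assumes a long format with one metric per row, which may not match the exact layout of your download. The header names columnName, metric, and value and the sample rows are invented for this example.

```python
import csv
import io


def pivot_metrics(csv_text, column_field, metric_field, value_field):
    """Turn long rows (column, metric, value) into {column: {metric: value}}."""
    pivoted = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        pivoted.setdefault(row[column_field], {})[row[metric_field]] = row[value_field]
    return pivoted


# Made-up sample; values stay strings because CSV carries no type information.
sample = (
    "columnName,metric,value\n"
    "Year,min,2010\n"
    "Year,max,2020\n"
    "Address,nullCount,3\n"
)
print(pivot_metrics(sample, "columnName", "metric", "value"))
```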

 

04_BT_220_Catalog_APIS_Download_detailedMetrics-1-1024x433.png

 

Conclusion

 

This program demonstrated how to use the SAS Viya REST API to download the metadata at column level. Stay tuned for more in-depth discussions and tutorials in this series.

 

Acknowledgements

 

  • Nancy Rausch, R&D.  
  • Lavanya Ganesh, R&D.  
  • @AchalPatel, early pioneer of the Download Catalog API. 

 


Thank you for your time reading this post. If you liked the post, give it a thumbs up! Please comment and tell us what you think about having conversations with your data. If you wish to get more information, please write me an email.

 

 

Find more articles from SAS Global Enablement and Learning here.

Version history
Last update:
‎02-25-2024 07:54 PM