
SAS Content Folders : Make a list (something I never do!)

Started ‎01-30-2023 by
Modified ‎01-30-2023 by

It's the end of January, when all the good resolutions have already been broken, but it's still fashionable to make new ones!  I, for one, have resolved to be more organised in the things I do, and a core habit is to make a list or create an inventory.

 

What better time to apply this habit than when you plan to migrate SAS Viya content, either to a repository or to another (maybe a higher version) Viya environment!

 

While good resources exist to help with transferring content (I recommend starting with Gerry Nelson's post), this post provides an example of how to take an initial preparatory step.  When negotiating the tough landscape of production-level SAS environments, some ugly truths you get to witness are:

 

  1. Litter & Sprawl!: Content is dispersed across user folders and shared folders.  Some developers are ultra-"Marie Kondo-ish" in their organisation, whereas others have the equivalent of a teenager's closet!
  2. Technical debt: The SAS Viya environment is filled with content from the early days when it was still a development environment.
  3. Many cooks: Everybody has their own version of a "<insert your business problem here> model", and, hey, guess what? They chose to give it the same name! 🙂

 

As organisations modernise and move to cloud-native environments, it's important to package content for seamless movement across environments.  To do this efficiently, it's equally important to prepare an inventory first.

 

 

An example utility: a SAS Content Folder Crawler

 

Although a user may experience it as an operating system's folder structure, SAS Content files are actually stored in a database called the Infrastructure Data Server, and are accessible through various endpoints of the files service in Viya. In the SAS Viya user interface, you can access content files from either SAS Drive or SAS Environment Manager, where they are given address paths. Users refer to these folder paths in SAS Viya applications, but rarely know the actual URI of the file within the files service.

 

To make a list of items in SAS Content folders, we present a Python script which takes a top-level folder as an argument, crawls through its subfolders recursively, and creates a list of the content within.  You can execute the following snippets of code in a Jupyter notebook or a Python editor.  While this script is written in Python, the same logic could be written in other languages such as shell script or JavaScript.

 

Note that this is only an example, and is not meant to be considered as an official approach regarding how to access and list out SAS Content.  Any future changes in the endpoints of the underlying SAS Viya services may necessitate changes to this script.

 

Import Packages and provide initial parameters.

 

 

 

 

 

import os
import requests
import csv

viya_url="https://provide-your-viya-environment-url-here.com"

access_token = "Bearer <provide an access token here>"

 

 

 

 

 

 

Note the variables viya_url and access_token - these need to be populated by you. Since there are a variety of approaches to obtain an access token, I didn't want to exclude any others by including one specific approach here.  Refer to this post in order to find out how to obtain an access token which is used to authenticate your REST API calls to SAS Viya.
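
As one hedged illustration of populating access_token without pasting the token into the notebook itself, the sketch below reads it from an environment variable and normalises it to the "Bearer <token>" form shown above. The variable name VIYA_ACCESS_TOKEN and the helper build_auth_header are assumptions made for this example; they are not part of SAS Viya, and how you obtain the token is still up to you.

```python
import os


def build_auth_header(token):
    """Return the token in the 'Bearer <token>' form used by access_token."""
    token = token.strip()
    if not token:
        raise ValueError("The access token is empty")
    # Accept either a raw token or one that already carries a 'Bearer ' prefix.
    if token.lower().startswith("bearer "):
        return "Bearer " + token[len("bearer "):].strip()
    return "Bearer " + token


# VIYA_ACCESS_TOKEN is a hypothetical variable name; use whatever you export.
raw_token = os.environ.get("VIYA_ACCESS_TOKEN", "")
access_token = build_auth_header(raw_token) if raw_token.strip() else None
```

If the environment variable is not set, access_token stays None, which makes a missing token obvious before any REST call is attempted.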

 

 

Provide a starting folder path

 

 

 

 

 

startFolderPath="/folders/folders/some-crazy-alphanumeric-sequence"

 

 

 

 

 

 

We all have to start somewhere 🙂.  To set off the crawler, let's start with one top-level folder in SAS Content. A popular example is the Public folder within SAS Content.  It's typically the place where a lot of shared projects get saved. To find the URI of the Public folder in your server:

 

1. Go to SAS Environment Manager (top left menu -> Manage Environment)

2. Click on the Content tab (from the menu on the left hand side, third from top)

3. Enter SAS Content

4. Click once on Public.  The folder URI appears on the right hand side, which you can copy and use.
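
A common slip at this point is to paste the display path (such as /Public) instead of the folder URI. The sketch below is a small sanity check, assuming the URI has the /folders/folders/<id> shape used for startFolderPath above; the pattern and the helper name is_folder_uri are illustrative only.

```python
import re

# Assumed shape, based on the example value above: /folders/folders/<id>
FOLDER_URI_PATTERN = re.compile(r"^/folders/folders/[^/\s?#]+$")


def is_folder_uri(value):
    """Return True if value looks like a folder URI rather than a display path."""
    return bool(FOLDER_URI_PATTERN.match(value.strip()))


# A display path is rejected, a URI copied from Environment Manager is accepted.
checks = {
    "/Public": is_folder_uri("/Public"),
    "/folders/folders/some-crazy-alphanumeric-sequence": is_folder_uri(
        "/folders/folders/some-crazy-alphanumeric-sequence"
    ),
}
```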

 

Here's a screenshot for reference.

 

[Screenshot: Screenshot 2023-01-30 at 8.14.30 PM.png]

 

Helper Functions

 

Only a brief explanation is given for each of the functions below, but hopefully they are easy to understand.

 

makeListOfPackages - main recursive function which will be called with a starting folder URI

 

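def makeListOfPackages(folderUri,accessToken,parentFolderUri):
    # parentFolderUri is not used in the logic; it is passed through unchanged.
    # Fetch the folder's own metadata (name, memberCount and so on).
    res=getResource(folderUri,accessToken)
    folderName=res["name"]
    if res["memberCount"] > 0:
        # Request all members in one call by using memberCount as the limit.
        resMembers=getMembers(folderUri,accessToken,0,res["memberCount"])
        for eachMember in resMembers:
            contentType=eachMember["contentType"]
            if contentType != "folder":
                # Non-folder item: record it in the global allMemberResults list.
                memberResult=writeToList(eachMember,folderName)
                allMemberResults.append(memberResult)
            else:
                # Sub-folder: recurse using the sub-folder's URI.
                makeListOfPackages(eachMember["uri"],accessToken,parentFolderUri)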

 

The above is known as a recursive function, i.e. it calls itself until a stopping condition is met (here, a folder with no sub-folders left to visit).  We first provide the top-level folder's URI as an input to this function (along with an access token). Having obtained details about the top-level folder, the function then checks whether there are any members (either sub-folders or other files) within this folder.  Each member is checked to see whether it is a folder or not. If it happens to be a file (i.e. not a folder), it is written to a list (more on that function later).  Otherwise, the recursion sets in and the function calls itself with the new folder's URI.
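
To see the recursion in isolation, here is a minimal offline sketch that walks a made-up, in-memory folder tree instead of calling SAS Viya. The FAKE_FOLDERS data, its URIs, and the crawl function are invented for illustration; only the folder / non-folder branching mirrors makeListOfPackages.

```python
# A made-up folder tree standing in for the responses of the folders service.
FAKE_FOLDERS = {
    "/folders/folders/top": {"name": "Public", "members": [
        {"name": "report-a", "contentType": "report", "uri": "/reports/reports/1"},
        {"name": "Models", "contentType": "folder", "uri": "/folders/folders/sub"},
    ]},
    "/folders/folders/sub": {"name": "Models", "members": [
        {"name": "churn", "contentType": "file", "uri": "/files/files/2"},
    ]},
}


def crawl(folder_uri, results):
    """Append (folder name, member name) pairs for every non-folder member."""
    folder = FAKE_FOLDERS[folder_uri]
    for member in folder["members"]:
        if member["contentType"] == "folder":
            # Recursive case: descend into the sub-folder.
            crawl(member["uri"], results)
        else:
            # Base case: a leaf item is recorded and the recursion stops here.
            results.append((folder["name"], member["name"]))
    return results


inventory = crawl("/folders/folders/top", [])
```

Passing the results list as an argument, rather than relying on a global list, is a design choice of this sketch; it keeps the function easy to test on its own.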

 

getResource - extracts all metadata for a given folder content

 

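def getResource(folderUri,accessToken):
    headers={'Authorization':accessToken}
    # verify points to the certificate bundle used to check the server's certificate.
    resource=requests.get(viya_url+folderUri, headers=headers, verify="a-path-to-your-SSL-certs.pem")
    # Fail early on HTTP errors (for example an expired token) instead of parsing an error body.
    resource.raise_for_status()
    resource=resource.json()
    return resource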

 

This function is a simple call which takes a folder URI as input and gets the resource associated with it.  The resource is returned in JSON format.

A note on the verify="a-path-to-your-SSL-certs.pem" parameter: it tells the requests library which certificate bundle to use when verifying the server's certificate over HTTPS.  SAS Viya 4 servers are stricter in terms of SSL certificate verification, so it is recommended that the Viya server certificates are known to the client (Python) calling the API.

 

getMembers - extracts all members (contents) within a given folder

 

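def getMembers(folderUri,accessToken,start,limit):
    headers={'Authorization':accessToken}
    # Let requests build the query string: ?start=<start>&limit=<limit>
    params={'start':start,'limit':limit}
    resource=requests.get(viya_url+folderUri+"/members", headers=headers, params=params, verify="a-path-to-your-SSL-certs.pem")
    resource.raise_for_status()
    # The members themselves sit under the "items" key of the response.
    resource=resource.json()["items"]
    return resource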

 

If a folder contains members, the above function extracts a list of these members by calling the members endpoint.  Note that for folders with a long list of items, only the first n items (10 or 20) are returned by default; the start and limit parameters are therefore provided so that all members are returned.  Keep in mind that this may result in a very long JSON response, so feel free to paginate the calls if you prefer.
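
For those who do want to paginate, the following is a rough sketch of one way to do it. The generator iter_members and its page_size parameter are assumptions for this example; fetch_page is any callable taking (start, limit), so a thin wrapper around getMembers could be plugged in. The demonstration at the bottom uses a fake in-memory list rather than a real server.

```python
def iter_members(fetch_page, member_count, page_size=100):
    """Yield every member by requesting page_size items at a time.

    fetch_page(start, limit) is expected to return a list of members,
    for example: lambda s, l: getMembers(folderUri, accessToken, s, l)
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    start = 0
    while start < member_count:
        page = fetch_page(start, page_size)
        if not page:
            # Stop early if the server returns fewer items than memberCount.
            break
        yield from page
        start += len(page)


# Offline demonstration with a fake 'server' holding 7 members.
fake_members = [{"name": "item-%d" % i} for i in range(7)]
calls = []


def fake_fetch(start, limit):
    calls.append((start, limit))
    return fake_members[start:start + limit]


pages = list(iter_members(fake_fetch, len(fake_members), page_size=3))
```

Because this is a generator, members can be processed as each page arrives instead of holding one very large response in memory.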

 

writeToList - writes extracted non-folder members to a dictionary

 

def writeToList(eachMember,folderName):
    # Build one dictionary per member; the key order matches the CSV header used later.
    memberResult={
        "contentType":eachMember["contentType"],
        "contentName":eachMember["name"],
        "folderName":folderName,
        "parentFolderUri":eachMember["parentFolderUri"],
        "contentUri":eachMember["uri"],
    }
    # print(memberResult)
    # above for debug purposes only
    return memberResult

 

The above function builds one dictionary for each member passed to it; the calling function appends these dictionaries to a list.  I have restricted the keys in the dictionary to what is necessary for my purpose, but you may choose to add more if you feel it necessary.
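
If you do want more columns, one possible sketch is shown below. The function build_row and the EXTRA_KEYS list are hypothetical: the extra attribute names are only examples and should be checked against the member JSON your own server returns. Missing attributes are written as empty strings so that every row keeps the same columns.

```python
BASE_KEYS = ["contentType", "contentName", "folderName", "parentFolderUri", "contentUri"]
# Hypothetical extra member attributes; confirm they exist in your responses.
EXTRA_KEYS = ["createdBy", "creationTimeStamp"]
# The matching CSV header row, in the same order as the dictionary keys.
FIELDNAMES = BASE_KEYS + EXTRA_KEYS


def build_row(member, folder_name, extra_keys=EXTRA_KEYS):
    """Build one inventory row, leaving missing optional attributes blank."""
    row = {
        "contentType": member["contentType"],
        "contentName": member["name"],
        "folderName": folder_name,
        "parentFolderUri": member.get("parentFolderUri", ""),
        "contentUri": member["uri"],
    }
    for key in extra_keys:
        row[key] = member.get(key, "")
    return row


# Made-up member used only to show the shape of the result.
sample_member = {"contentType": "report", "name": "Sales", "uri": "/reports/reports/1",
                 "parentFolderUri": "/folders/folders/top", "createdBy": "someone"}
sample_row = build_row(sample_member, "Public")
```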

 

 

Call the main function

 

allMemberResults=[]
makeListOfPackages(startFolderPath,access_token,"Public")

 

It's time to call the main function.  First, I create an empty list to hold all my members, and then call the makeListOfPackages function with the URI for Public as an input parameter.

 

Write Results to CSV file 

 

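with open("viya_public_content_example.csv","w",newline="",encoding="utf-8") as c:
    # newline="" stops the csv module from writing blank lines between rows on Windows.
    writer=csv.writer(c)
    # Header row; keep it in the same order as the keys built in writeToList.
    writer.writerow(['contentType','contentName','folderName','parentFolderUri','contentUri'])
    for eachMember in allMemberResults:
        writer.writerow(eachMember.values())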

 

I can now write all the rows in the allMemberResults list to a CSV file.  Note that you may need to change the header row if you have chosen to include more fields in your dictionary.

 

Final Results

 

Here's a screenshot of typical results (anonymised).  Now that you have your inventory, you can make better downstream decisions such as identifying pockets of content you may like to migrate as a combined package, filtering out unwanted content, or just obtaining inputs to help you organise content better in future!
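
As a small, hedged example of such a downstream step, the sketch below counts items per content type and flags names that appear in more than one folder (the "many cooks" problem mentioned at the start). It assumes rows shaped like the dictionaries produced by writeToList; the summarise function and the sample rows are made up for illustration.

```python
from collections import Counter, defaultdict


def summarise(rows):
    """Return (items per content type, names that occur in more than one place)."""
    per_type = Counter(row["contentType"] for row in rows)
    locations = defaultdict(list)
    for row in rows:
        # Group by type and name so that same-named items of one type stand out.
        locations[(row["contentType"], row["contentName"])].append(row["folderName"])
    duplicates = {key: folders for key, folders in locations.items() if len(folders) > 1}
    return per_type, duplicates


# Made-up rows in the same shape as the entries of allMemberResults.
sample_rows = [
    {"contentType": "report", "contentName": "Churn model", "folderName": "Public"},
    {"contentType": "report", "contentName": "Churn model", "folderName": "Team A"},
    {"contentType": "file", "contentName": "notes.txt", "folderName": "Public"},
]
per_type, duplicates = summarise(sample_rows)
```

With your own inventory, calling summarise(allMemberResults) would give the same two summaries for the crawled content.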

 

 

[Screenshot: Screenshot 2023-01-30 at 8.56.52 PM.png]

 

Enjoy browsing through the innards of your SAS Content folders (if that's your cup of tea) and feel free to send a message in case you have any questions!

Version history
Last update:
‎01-30-2023 11:13 PM