
More than model management with sasctl package for Python

Started ‎05-24-2023 by
Modified ‎05-24-2023 by

In my previous article, I explained how you can display PDF files inside a SAS Visual Analytics report. That request came from a customer who wanted to load more than 10,000 files into the SAS Content Server. They had specific requirements about securing the files, which is why they chose to upload the files and the folder structure into the SAS Content Server rather than use an alternative such as a standalone container serving the files. This article is not about the reasons for that choice, but about how to do it. The approach we chose is Python code that automates the upload process.

The business need

As mentioned, the customer has more than 10,000 files to upload, and these files are stored in a specific folder structure. They wanted to replicate the OS-level folder structure in the SAS Content Server. The customer has Python knowledge and chose that option because it is easy to navigate the file system and to call REST APIs in Python. Uploading the files to the SAS Content Server and creating the folder structure can both be achieved using the REST APIs surfaced by SAS Viya, which include APIs for files and folders.

 

The customer wanted a quick solution because the upload would be a one-time job, with no need to upload so many files in the future. This requirement for a fast solution influenced the decision to use the sasctl package for Python.

The Python package

If you are a data scientist developing models in Python and loading these models into SAS Model Manager, you may already know about the sasctl package for Python. The package was mainly designed to load open-source models into SAS Viya and to handle their deployment into production using the Python language and SAS Viya REST APIs. To serve that purpose, the developers had to create "services" which can be used for model management tasks, but also for core tasks like handling files and folders as well as authentication. This means that if you are a Python developer, this package reduces the time needed to develop a solution, because many of the SAS Viya REST API endpoints are already implemented. And if a specific endpoint is not available as a service, the package also provides a way to call the SAS Viya REST APIs directly.
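As a rough illustration of such a direct call, here is a minimal sketch. It assumes the session object behaves like a requests session that already knows the SAS Viya host and credentials, which is how the sasctl Session is understood to work. The helper name list_root_folder_names is made up for this example, and the host and user in the usage comment are placeholders.

```python
def list_root_folder_names(session):
    """Return the names of the root folders visible to the connected user.

    `session` is assumed to expose a requests-style get() that resolves
    relative URLs against the SAS Viya host (an assumption about sasctl).
    """
    # GET /folders/rootFolders belongs to the SAS Viya Folders REST API.
    response = session.get("/folders/rootFolders")
    response.raise_for_status()
    # Collections come back as JSON with the entries under "items".
    return [item["name"] for item in response.json().get("items", [])]


# Possible usage against a real environment (placeholders, not tested here):
#   from sasctl import Session
#   with Session("viya.example.com", "user", "password") as s:
#       print(list_root_folder_names(s))
```

Passing the session in as a parameter keeps the helper easy to try out with a stand-in object before pointing it at a real server.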

 

Here is a list of the available services:

  • CASManagement
  • Concepts
  • DataSources
  • Files
  • Folders
  • MicroAnalyticScore
  • ModelManagement
  • ModelPublish
  • ModelRepository
  • Projects
  • Relationships
  • ReportImages
  • Reports
  • SASLogon
  • SentimentAnalysis
  • TextCategorization
  • TextParsing
All these services implement the basic CRUD operations: Create, Read, Update, Delete. Each service also implements more specific methods, such as listing the CAS libraries, executing scoring in SAS Micro Analytic Service (MAS), or analyzing sentiment.

 

The solution

Using the sasctl package, we loaded the 10,000-plus files and their folder structure with just a few lines of code, which considerably reduced the development time.

 

xab_1_sasctlUpload.fullCode.png


The code details

A bit of explanation might be required about the code.

 

In the first few lines below, the different components are imported.

 

xab_2_sasctlUpload.imports.png

 

  • getpass: assists in the interaction with the end-user to get the password when executing the Python script from the command-line.
  • os: helps with interactions with the operating system and abstracts OS-specific details, such as the / and \ path separators.
  • Session: comes from sasctl and handles the connection to the SAS Viya REST APIs, including authentication.
  • files: provides the needed methods to interact with the files endpoint.
  • folders: implements the methods to interact with the folders endpoint.

The next lines are used to define the configuration properties to connect to the environment (lines 7 to 9).

  • os_path defines the location where the PDFs are located on the file system.
  • sas_path indicates the path into which the folder structure and files should be created.

 

xab_3_sasctlUpload.properties.png

 

The following code block defines a function which will parse the specified OS folder and create two lists: folders_list and files_list.

 

xab_4_sasctlUpload.buildStorageListFunction.png

 

For this specific code, the function will only retrieve the files if they end with ".pdf" as can be seen on line 39.

 

The d_list array contains the list of folders which should be created in the SAS Content Server.

 

The f_list array contains a list of mappings, one per file, each holding the source file path and the target folder location.
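Since the original function is only available as a screenshot, here is a minimal sketch of what such a function could look like. The function name build_storage_lists and the dictionary keys source, folder and name are assumptions made for this example, not the names used in the original script.

```python
import os


def build_storage_lists(os_path, sas_path):
    """Walk os_path and return (d_list, f_list) describing what to create.

    d_list: target folder paths in the SAS Content Server, parents first.
    f_list: one dict per PDF with its local path and its target folder.
    """
    d_list, f_list = [], []
    root = os.path.abspath(os_path)
    for current_dir, sub_dirs, file_names in os.walk(root):
        sub_dirs.sort()  # in-place sort makes the traversal order stable
        # Path of the current directory relative to the starting folder.
        relative = os.path.relpath(current_dir, root)
        parts = [] if relative == "." else relative.split(os.sep)
        # SAS folder paths use "/" whatever the local OS separator is.
        target = "/".join([sas_path.rstrip("/")] + parts)
        d_list.append(target)
        for name in sorted(file_names):
            if name.endswith(".pdf"):  # only PDFs are uploaded
                f_list.append({
                    "source": os.path.join(current_dir, name),
                    "folder": target,
                    "name": name,
                })
    return d_list, f_list
```

Because os.walk visits directories top-down, every parent folder appears in d_list before its children, which is the order needed when creating them on the server.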

 

So far, the code we have seen doesn't perform any action. It just defines a few parameters and a function. The following is the glue around the different components:

 

xab_5_sasctlUpload.callFunctions.png

 

In line 51, we create a Session for the user. In that session, we build the two lists for folders and files. Using the folder list, we call the folders service to create the folders in SAS Viya. When done, the files service is called to create the files in the SAS Content Server using the information from the file list.
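The flow described above could be sketched as follows. This is an illustration under assumptions, not the original code: upload_all is a made-up helper, each file entry is assumed to be a dict with source, folder and name keys, and the create_file keyword arguments (folder, filename) reflect my understanding of the sasctl files service and should be checked against the package documentation. The host and user in the usage comment are placeholders.

```python
def upload_all(d_list, f_list, folders_service, files_service):
    """Create every folder, then upload every file; return the file count."""
    for sas_folder in d_list:
        # create_path is expected to create missing parents, like `mkdir -p`.
        folders_service.create_path(sas_folder)
    for entry in f_list:
        # Keyword names are assumptions about the files service signature.
        files_service.create_file(
            entry["source"], folder=entry["folder"], filename=entry["name"]
        )
    return len(f_list)


# Possible wiring with sasctl (placeholders, not tested here):
#   import getpass
#   from sasctl import Session
#   from sasctl.services import files, folders
#   with Session("viya.example.com", "user", getpass.getpass()):
#       upload_all(d_list, f_list, folders, files)
```

Creating all folders before uploading any file means each upload can rely on its target folder already being there.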

The contribution

While the sasctl package contains a rich set of functionalities, you can extend it with your own contributions. The package is available on GitHub, so you can contribute to it and bring extra functionality, as I did with the create_path method. When I started helping this customer, there was no function to create a folder in the SAS Content Server if its parent folder did not already exist. The create_path method implements that functionality, a bit like the following command under Linux:

 

 

 

mkdir -p /build/complete/folder/path/newFolder

 

 

 

The code for the method is the following:

 

xab_6_sasctlUpload.newMethod.png
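To show the idea behind a `mkdir -p` style helper, here is a simplified sketch. It is not the code of the sasctl create_path method: ensure_path, get_folder and create_folder are hypothetical names, and the two callables stand in for whatever lookup and creation calls the folders service provides.

```python
def ensure_path(path, get_folder, create_folder):
    """Ensure every folder along `path` exists and return the last one.

    get_folder(path) returns a folder or None when it does not exist;
    create_folder(name, parent) creates `name` under `parent`
    (parent is None for a top-level folder).
    """
    parent = None
    current = ""
    for name in [part for part in path.split("/") if part]:
        current = current + "/" + name
        folder = get_folder(current)
        if folder is None:
            # Only the missing levels are created; existing ones are reused.
            folder = create_folder(name, parent)
        parent = folder
    return parent
```

Walking the path from the top down is what makes the call safe to repeat: running it a second time on the same path finds every level and creates nothing.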

 

If you want to contribute to the project, follow the instructions provided in the project.

Conclusion

Using packages like sasctl is advantageous because it reduces the time to results and saves you, as a developer, from reinventing the wheel. The package brings a lot of functionality that is relevant well beyond model management. The provided services ease development and also help you understand how to interact with the SAS Viya REST APIs. Wrappers like Session are an elegant way to authenticate the user. Session supports different authentication mechanisms, such as the username/password approach we have seen here, and it also implements authinfo-based authentication.

 

The fact that you can contribute to this project is another benefit. You should not be afraid to contribute for different reasons:

  1. If you added a missing functionality, there are likely more people in the SAS world who also need that functionality.
  2. If your code is not perfect, other developers will help you bring that functionality in as it may make sense for other users to have it.
  3. You don't need to be a full-time programmer to have great ideas.

Most importantly, the sasctl project is yours. You can make what you want of it!

 

Find more articles from SAS Global Enablement and Learning here.

Comments

Thanks for sharing, @XavierBizoux, excellent article, very informative. I wasn't aware how powerful and versatile the sasctl Python package can be for SAS Viya, as a complement to the viya CLI, ARK or pyviyatools.

