Using the SAS Configurator for Open Source to Add Packages to Python and R
In Using the SAS Configurator for Open Source to Build Python and R, I summarized the steps an administrator needs to follow to build open source software for use with the SAS Viya platform. The default configuration of the SAS Configurator for Open Source adds a number of packages we felt users would find most helpful to the core Python and R builds. While the initial list of additional packages provides a great starting point, users will inevitably ask their administrator to add more packages to the builds to support the needs of new projects.
I have fielded a few questions lately about how to add new packages to existing builds of Python and R, so let's work through how administrators can do that using the SAS Configurator for Open Source.
Key Steps
Assuming you have existing builds of Python and R you created using the SAS Configurator for Open Source, you should perform these steps to add new packages to Python and R.
- Modify $deploy/site-config/sas-pyconfig/change-configuration.yaml to add the desired new packages to the existing list of packages.
- Update the deployment, which modifies the sas-pyconfig cronjob definition to include the new packages.
- Execute the sas-pyconfig job.
  - If the deployment uses the Deployment Operator or the sas-orchestration deploy command, the sas-pyconfig job executes automatically.
  - If the deployment uses the Kubernetes commands method, the administrator needs to issue commands to re-run the sas-pyconfig job.
- Validate user access to the new packages.
Modify the configuration
The first step is to modify $deploy/site-config/sas-pyconfig/change-configuration.yaml and add any additional packages your users need to Python and R.
As an example, I am going to add the cx_Oracle package to the Python build and the caret package to the R build. Neither of these new packages has documented dependencies, but it is a good idea to read the installation documentation for each package you add in case there are dependent packages you need to include.
Note: SAS Viya platform releases 2023.08 and later include additional changes to change-configuration.yaml. If you are using release 2023.08 or later and your existing $deploy/site-config/sas-pyconfig/change-configuration.yaml was created in an earlier release, you should:
- Make a backup of your current $deploy/site-config/sas-pyconfig/change-configuration.yaml
- Copy $deploy/sas-bases/examples/sas-pyconfig/change-configuration.yaml to $deploy/site-config/sas-pyconfig
- Edit the new change-configuration.yaml to include your earlier modifications as well as any new changes you need to make.
Apply the configuration change
You next need to apply the configuration change you made to update the ConfigMap for the sas-pyconfig cronjob. Until you do this step, the sas-pyconfig cronjob will continue to build the packages listed before you made your change.
The process to follow depends on the deployment method you use.
See Modify Existing Customizations in a Deployment for guidance on this task.
If you deployed using the viya4-deployment GitHub project, you should consult the project documentation for guidance.
Execute the sas-pyconfig job
If your deployment uses the Deployment Operator or the sas-orchestration deploy method, the sas-pyconfig job will execute automatically and rebuild Python and R with your configured changes.
If your deployment is manually managed, you will need to execute the job yourself by running a command similar to this one after applying the updated configuration to your cluster:
kubectl create job sas-pyconfig-adhoc --namespace your-namespace --from cronjob/sas-pyconfig
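For a manually managed deployment, the create command above can be paired with a wait and a log check. The following is a minimal Python sketch, assuming kubectl is on the PATH and already configured for your cluster. The helper names, the default job name, and the two-hour timeout are illustrative choices for this example and are not prescribed by SAS.

```python
import subprocess


def build_pyconfig_commands(namespace, job_name="sas-pyconfig-adhoc", timeout="2h"):
    """Return kubectl argument lists that create the ad hoc job, wait for it, and print its log.

    The job name and timeout defaults are assumptions made for this sketch.
    """
    return [
        # Same command as in the article: create a one-off job from the cronjob.
        ["kubectl", "create", "job", job_name,
         "--namespace", namespace, "--from", "cronjob/sas-pyconfig"],
        # Block until the job reports the "complete" condition or the timeout expires.
        ["kubectl", "wait", "--for=condition=complete", f"job/{job_name}",
         "--namespace", namespace, f"--timeout={timeout}"],
        # Print the job log so build errors can be reviewed.
        ["kubectl", "logs", f"job/{job_name}", "--namespace", namespace],
    ]


def run_pyconfig_job(namespace, dry_run=True):
    """Print the commands (dry run) or execute them in order, stopping on the first failure."""
    for argv in build_pyconfig_commands(namespace):
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
```

Calling run_pyconfig_job("your-namespace") prints the three commands without touching the cluster; pass dry_run=False to execute them. Note that the wait step only returns early on success, so a failed job will hold it until the timeout. Check the job log directly in that case.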
What happens when the job runs
When the sas-pyconfig job executes, it detects that change-configuration.yaml has changed by comparing the hash of the current configuration with the hash of the previous configuration, which is stored in the md5sum file. The job then rebuilds Python and R, and the new packages are included in those new builds.
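The hash comparison described above is a common change-detection technique. The sketch below illustrates the general idea in Python. It is not the sas-pyconfig job's actual implementation, which this article does not show, and the function names and file layout are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_md5(path):
    """Return the MD5 hex digest of a file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def config_changed(config_path, stored_hash_path):
    """Return True if the configuration differs from the previously recorded hash.

    A missing or empty hash file is treated as "changed" so that a first run builds.
    """
    stored = Path(stored_hash_path)
    if not stored.exists():
        return True
    tokens = stored.read_text().split()
    previous = tokens[0] if tokens else ""
    return file_md5(config_path) != previous


def record_hash(config_path, stored_hash_path):
    """Store the current configuration hash so future changes can be detected."""
    Path(stored_hash_path).write_text(file_md5(config_path) + "\n")
```

In this sketch, a build step would run only when config_changed returns True, and record_hash would be called after a successful build. Recording the hash last means a failed build is retried on the next run.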
The image below compares the state of the sas-pyconfig persistent volume before sas-pyconfig runs with its state after the job completes. You can see from the lines marked in the 'after change' state that we have new builds for both Python and R, the default_py and default_r links have been updated to reference the new builds, and the md5sum file has been updated so it can detect future changes.
Validate the change
After the sas-pyconfig job completes, you should validate the addition of the new packages. I usually do this by examining the Python and R builds to make sure the new packages were installed and by making sure SAS Viya can reference them.
In my deployment, the Python packages I added will appear in the ./Python-3.8.15.1689776107/lib/python3.8/site-packages directory. The listing below only shows a subset of Python packages I have installed but I can see that the cx_Oracle package has been added.
Similarly, I can do the same for my R build by examining the ./R-4.2.2.1689776107/lib64/R/library directory to make sure the caret package has been added.
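The two directory checks above can also be scripted. Here is a minimal Python sketch, assuming you have file-system access to the sas-pyconfig persistent volume. The function name and the matching rule are my own illustrative choices: an entry counts as a match if it equals the package name or starts with the name followed by "-" or ".".

```python
from pathlib import Path


def is_installed(library_dir, name):
    """Return True if library_dir holds an entry that looks like the named package.

    Matching is case-insensitive. Requiring "-" or "." after the name avoids
    counting a different package that merely shares the same prefix.
    """
    target = name.lower()
    for entry in Path(library_dir).iterdir():
        entry_name = entry.name.lower()
        if entry_name == target or entry_name.startswith((target + "-", target + ".")):
            return True
    return False
```

With the build directories from my deployment, the calls would be is_installed("./Python-3.8.15.1689776107/lib/python3.8/site-packages", "cx_Oracle") and is_installed("./R-4.2.2.1689776107/lib64/R/library", "caret"). Your directory names will differ because they include the version and build timestamp. A directory match only shows that files are present, so it complements the SAS Studio test rather than replacing it.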
Finally, I usually use SAS Studio to validate that the SAS Viya platform can access the new packages. I always try to include in my test one package that was accessible before I made the change as well as any new packages I add.
Your test should not produce log messages stating ModuleNotFoundError from Python or 'there is no package' from R, as shown below. Either message indicates that the expected packages were not added properly. In that case, examine the log from your last sas-pyconfig job to see whether an error during the build process prevented the installation from completing successfully.
SAS Viya platform releases 2023.08 and later will emit more extensive error reporting in the sas-pyconfig job logs which should provide better notification of package installation issues for administrators.
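For the Python side of the SAS Studio test described above, a check like the following reports missing modules as a list, with no need to read through ModuleNotFoundError messages in the log. This is a minimal sketch using only the standard library. The function name is hypothetical, and how you submit the code (for example, from a Python step in SAS Studio) depends on your environment.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found by the current interpreter."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]
```

Following the habit of testing one package that was already available alongside the new one, a call might look like missing_modules(["numpy", "cx_Oracle"]), where numpy stands in for whichever package already existed in your build. An empty list means every module was found.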
A word about removing packages...
The steps above focus on adding packages to existing builds of Python and R, but the same steps can be used to remove unwanted packages. However, removing packages is a less common request because it has the potential to break existing code. Unless there is a significant reason to remove a package, such as a security risk or legal implications, administrators should work closely with their user community before removing it to safeguard existing work.
Find more articles from SAS Global Enablement and Learning here.