
Using the SAS Configurator for Open Source to Add Packages to Python and R

Started 08-03-2023. Modified 09-08-2023.

In Using the SAS Configurator for Open Source to Build Python and R, I summarized the steps an administrator needs to follow to build open source software for use with the SAS Viya platform. By default, the SAS Configurator for Open Source adds to the core Python and R builds a number of packages we felt users would find most helpful. While this initial list of additional packages provides a great starting point, users will inevitably ask their administrator to add more packages to the builds to support the needs of new projects.

 

I have fielded a few questions lately about how to add new packages to existing builds of Python and R, so let's work through how administrators can do that using the SAS Configurator for Open Source.

 

Key Steps

 

Assuming you have existing builds of Python and R you created using the SAS Configurator for Open Source, you should perform these steps to add new packages to Python and R.

 

  1. Modify $deploy/site-config/sas-pyconfig/change-configuration.yaml to add the desired new packages to the existing list of packages.
  2. Update the deployment, which modifies the sas-pyconfig cronjob definition to include the new packages.
  3. Execute the sas-pyconfig job.
    1. If the deployment uses the Deployment Operator or the sas-orchestration deploy command, the sas-pyconfig job executes automatically.
    2. If the deployment is managed with Kubernetes commands, the administrator must issue commands to re-run the sas-pyconfig job.
  4. Validate user access to the new packages.

 

Modify the configuration

 

The first step is to modify $deploy/site-config/sas-pyconfig/change-configuration.yaml and add any additional packages your users need to Python and R.

 

As an example, I am going to add the cx_Oracle package to the Python build and the caret package to the R build. Neither of these new packages has documented dependencies, but it is a good idea to read the installation documentation for each package you add in case there are dependent packages you need to include.

 

SMcC_01_addpack_changeConfig.png

 


 

Note: SAS Viya platform releases 2023.08 and later include additional changes to change-configuration.yaml. If you are using release 2023.08 or later and your existing $deploy/site-config/sas-pyconfig/change-configuration.yaml was created in an earlier release, you should:

 

  • Make a backup of your current $deploy/site-config/sas-pyconfig/change-configuration.yaml.
  • Copy $deploy/sas-bases/examples/sas-pyconfig/change-configuration.yaml to $deploy/site-config/sas-pyconfig.
  • Edit the new change-configuration.yaml to include your earlier modifications as well as any new changes you need to make.
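
The backup and copy steps above could be scripted along these lines. This is a minimal sketch rather than an official SAS procedure: the function name refresh_pyconfig_config and the timestamped .bak suffix are my own placeholders, and the deploy directory is passed in as an argument. Merging your earlier modifications into the new file remains a manual edit.

```shell
# Hypothetical helper (not part of SAS tooling): back up the current
# change-configuration.yaml, then replace it with the shipped example.
# Usage: refresh_pyconfig_config /path/to/deploy
refresh_pyconfig_config() {
  local deploy="$1"
  local site_cfg="$deploy/site-config/sas-pyconfig/change-configuration.yaml"
  local example="$deploy/sas-bases/examples/sas-pyconfig/change-configuration.yaml"
  local backup
  backup="$site_cfg.bak.$(date +%Y%m%d%H%M%S)"   # suffix format is an assumption

  # 1. Keep a copy of the current file so earlier edits can be merged back in.
  cp -p "$site_cfg" "$backup" || return 1

  # 2. Overwrite the site-config copy with the example from the newer release.
  cp "$example" "$site_cfg" || return 1

  # 3. Print the backup path so it can be compared with the new file.
  echo "$backup"
}
```

Running diff against the printed backup path is a quick way to see which of your earlier modifications still need to be carried over.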

 

Apply the configuration change

 

Next, apply the configuration change to update the ConfigMap for the sas-pyconfig cronjob. Until you do, the sas-pyconfig cronjob will continue to build the packages listed before you made your change.

 

The process to follow depends on the deployment method you use.

 

See Modify Existing Customizations in a Deployment for guidance on this task.

 

If you deployed using the viya4-deployment GitHub project, you should consult the project documentation for guidance.

 

Execute the sas-pyconfig job

 

If your deployment uses the Deployment Operator or the sas-orchestration deploy method, the sas-pyconfig job will execute automatically and rebuild Python and R with your configured changes.

 

If your deployment is manually managed, you will need to execute the job yourself by running a command similar to this after applying the updated configuration to your cluster.

 

kubectl create job sas-pyconfig-adhoc --namespace your-namespace --from cronjob/sas-pyconfig
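
If you want to wait for the ad hoc job and glance at its log in one go, the command above can be wrapped as in the sketch below. The function name run_pyconfig_job, the two-hour timeout, and the 50-line log tail are assumptions of mine; choose values that suit how long your builds take. Note that kubectl wait only returns early when the job completes, so a failed job will sit until the timeout expires.

```shell
# Hypothetical wrapper around the kubectl command shown above.
# Usage: run_pyconfig_job your-namespace [job-name]
run_pyconfig_job() {
  local ns="$1"
  local job="${2:-sas-pyconfig-adhoc}"

  # Create a one-off job from the sas-pyconfig cronjob definition.
  kubectl create job "$job" --namespace "$ns" --from=cronjob/sas-pyconfig || return 1

  # Block until the job reports completion (timeout value is an assumption).
  kubectl wait --for=condition=complete "job/$job" --namespace "$ns" --timeout=2h || return 1

  # Show the end of the job log for a quick sanity check.
  kubectl logs "job/$job" --namespace "$ns" --tail=50
}
```

Kubernetes will refuse to create a second job with the same name, so delete the earlier ad hoc job or pass a different job name as the second argument when you re-run it.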

 

What happens when the job runs

 

When the sas-pyconfig job executes, it detects that change-configuration.yaml has changed by comparing its hash with the hash of the previous configuration stored in the md5sum file. The job then rebuilds Python and R, and the new packages are included in the rebuilt versions.
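
To make the idea of hash-based change detection concrete, here is a small illustration. It is not the code the sas-pyconfig job actually runs: the function name config_changed and the layout of the stored hash file are assumptions made purely for the example.

```shell
# Illustration only: report whether a config file differs from the last
# recorded hash, and record the new hash when it does.
# Returns 0 when the file changed (a rebuild would be needed), 1 otherwise.
config_changed() {
  local config_file="$1"
  local stored_hash_file="$2"
  local new_hash old_hash=""

  # Hash of the configuration as it exists now.
  new_hash=$(md5sum "$config_file" | cut -d' ' -f1)

  # Hash recorded by the previous run, if there was one.
  if [ -f "$stored_hash_file" ]; then
    old_hash=$(cut -d' ' -f1 "$stored_hash_file")
  fi

  if [ "$new_hash" = "$old_hash" ]; then
    return 1
  fi

  # Remember the new hash so the next run can detect further changes.
  echo "$new_hash" > "$stored_hash_file"
  return 0
}
```

The first call with no stored hash reports a change, a second call on the same file does not, and editing the package list makes it report a change again.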

 

The image below compares the state of the sas-pyconfig persistent volume before sas-pyconfig runs with its state after the job completes. You can see from the lines marked in the 'after change' state that we have new builds for both Python and R, the default_py and default_r links have been updated to reference the new builds, and the md5sum file has been updated so it can detect future changes.  

 

SMcC_02_addpack_pvcChanges.png

 

Validate the change

 

After the sas-pyconfig job completes, you should validate the addition of the new packages. I usually do this by examining the Python and R builds to make sure the new packages were installed and by making sure SAS Viya can reference them.

 

In my deployment, the Python packages I added will appear in the ./Python-3.8.15.1689776107/lib/python3.8/site-packages directory. The listing below only shows a subset of Python packages I have installed but I can see that the cx_Oracle package has been added.

 

SMcC_03_addpack_pyPackages.png

 

Similarly, I can examine the ./R-4.2.2.1689776107/lib64/R/library directory of my R build to make sure the caret package has been added.

 

SMcC_04_addpack_rPackages.png
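
Rather than scanning the directory listings by eye, a check like the following could be run from wherever the sas-pyconfig persistent volume is mounted. It is a rough sketch: has_package is a hypothetical helper, and it only tests for a directory entry whose name starts with the package name, which is a heuristic and not proof that the package imports cleanly.

```shell
# Hypothetical helper: succeed if any entry in the given directory starts
# with the package name (for example cx_Oracle-<version>.dist-info or caret).
has_package() {
  local dir="$1"
  local name="$2"
  ls -d "$dir"/"$name"* >/dev/null 2>&1
}

# Example calls, run from the root of the mounted volume. The directory
# names are the ones from my deployment; yours will differ.
#   has_package ./Python-3.8.15.1689776107/lib/python3.8/site-packages cx_Oracle && echo "cx_Oracle found"
#   has_package ./R-4.2.2.1689776107/lib64/R/library caret && echo "caret found"
```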

 

Finally, I usually use SAS Studio to validate that the SAS Viya platform can access the new packages. I always try to include in my test one package that was accessible before I made the change as well as any new packages I add.

 

SMcC_05_addpack_testAfter.png

 

If the test instead produces log messages such as ModuleNotFoundError from Python or 'there is no package' from R, as shown below, the expected packages were not added properly. Examine the log from your last sas-pyconfig job to see whether an error during the build process prevented the installation from completing successfully.

 

SMcC_06_addpack_testBefore.png
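
One way to start that log review is to filter the job log for likely problem lines, as in the sketch below. The function name scan_pyconfig_log is hypothetical, it assumes the job was created with the name sas-pyconfig-adhoc as in the earlier command, and the search terms are guesses on my part, since the exact wording of build errors varies by package and release.

```shell
# Hypothetical helper: print numbered log lines that look like failures.
# Returns non-zero when no line matches.
# Usage: scan_pyconfig_log your-namespace [job-name]
scan_pyconfig_log() {
  local ns="$1"
  local job="${2:-sas-pyconfig-adhoc}"

  # The search terms are assumptions; adjust them to what your logs contain.
  kubectl logs "job/$job" --namespace "$ns" | grep -i -n -E 'error|failed'
}
```

An empty result does not guarantee a clean build, so read the full log if a package is still missing.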

 

SAS Viya platform releases 2023.08 and later emit more extensive error reporting in the sas-pyconfig job logs, which should give administrators better notification of package installation issues.

 

A word about removing packages...

 

The steps above focus on adding packages to existing builds of Python and R, but the same steps can be used to remove unwanted packages. Removing packages is a less common request, however, because it has the potential to break existing code. Unless there is a significant reason to get rid of an existing package, such as a security risk or legal implications, administrators should work closely with their user community before removing a package to safeguard existing work.

 

 


