One of the key features of SAS Viya is its integration with open-source languages such as Python and R. This open integration allows users to leverage their existing code and programming skills to speed their time to value with SAS Viya. The integration of SAS Viya with open-source software obviously depends on there being an installation of Python/R that the SAS Viya administrator configures SAS Viya to use.
To make installing and managing open-source installations easier for SAS Viya administrators, SAS provides the SAS Configurator for Open Source. The SAS Configurator for Open Source is a utility application that simplifies the download, configuration, building, and installation of Python and R from source. The results are a Python and R build that is located in a Persistent Volume Claim (PVC). The PVC and the builds that it contains can then be referenced by pods that require Python and R for their operations.
In this post, we will look at how an administrator uses the SAS Configurator for Open Source to build Python and R installs for use with SAS Viya.
Aside: Before I go on, I apologize up front for the length of this post. Even though it is long, I have not covered every aspect of using the SAS Configurator for Open Source. The best place for additional information is in $deploy/sas-bases/examples/sas-pyconfig/README.md which you can find in your deployment assets.
Once configured, the SAS Configurator for Open Source creates and executes a sas-pyconfig job that
The SAS Configurator for Open Source can build and install multiple Python and R builds in the same PVC. To handle multiple builds, it uses profiles, which can be used as references to different versions or builds of Python and R located in the PVC.
As you can imagine, downloading and building open-source software can be a resource intensive operation so after its initial execution, the sas-pyconfig job does not run again until a change is detected in the configuration settings for the job.
Let's follow a scenario in which I have an existing SAS Viya deployment and I want to now install Python and R for use with my deployment.
Following the instructions in the $deploy/sas-bases/examples/sas-pyconfig/README.md, I need to
The first step is to copy the example manifest files in $deploy/sas-bases/examples/sas-pyconfig to $deploy/site-config/sas-pyconfig.
export deploy=~/project/deploy/gelcorp
mkdir -p $deploy/site-config/sas-pyconfig
cp $deploy/sas-bases/examples/sas-pyconfig/* "$_"
chmod 755 $deploy/site-config/sas-pyconfig/*.yaml
The $deploy/site-config/sas-pyconfig directory will now contain these files.
$ ll $deploy/site-config/sas-pyconfig/
-rwxr-xr-x 1 cloud-user cloud-user 1319 Oct 27 17:25 change-configuration.yaml
-rwxr-xr-x 1 cloud-user cloud-user 1161 Oct 27 17:25 change-limits.yaml
-r--r--r-- 1 cloud-user cloud-user 20326 Oct 27 17:25 README.md
To enable the sas-pyconfig job itself and to enable the job to install Python and R, edit $deploy/site-config/sas-pyconfig/change-configuration.yaml and at the top of the file
You do not have to build both Python and R. If you only need one of the languages simply set the one you do not need to "false".
patch: |-
  - op: replace
    path: /data/global.enabled
    value: "true"
  - op: replace
    path: /data/global.python_enabled
    value: "true"
  - op: replace
    path: /data/global.r_enabled
    value: "true"
  - op: replace
    path: /data/global.pvc
    value: "/opt/sas/viya/home/sas-pyconfig"
  ...
The global.pvc value specifies the mount point within the SAS Configurator for Open Source job pod. This is the location of PVC in the job pod and is the installation location of Python and R profiles.
Lower in the same change-configuration.yaml file are the Python configuration settings, which can be used to create multiple Python profiles, each configured for specific user needs. The default profile is called "default_py", so the options defining it all include "default_py" in the path value. Here, the options install Python 3.8.13 for the default profile, but I could easily create a second profile named "python2" and add a second set of configuration settings to install Python 2 if that were needed for some reason. The pip_install_packages value lets me specify the set of Python packages I want installed for the default profile. If I need to add or remove libraries later, I simply modify the default_py.pip_install_packages value and the sas-pyconfig job will update my installation.
There are additional options in change-configuration.yaml, but these are the ones affecting the Python installation.
  - op: replace # Space delimited list of Python profiles to create
    path: /data/global.python_profiles
    value: "default_py"
  - op: replace # Python build config options
    path: /data/default_py.configure_opts
    value: "--enable-optimizations"
  - op: replace # Python build flags
    path: /data/default_py.cflags
    value: "-fPIC"
  - op: replace # Packages that wheel will build from scratch rather than use binary builds
    path: /data/default_py.pip_install_nobinary
    value: "Prophet sas_kernel"
  - op: replace # Packages that will be installed by PIP.
    path: /data/default_py.pip_install_packages
    value: "sas_kernel matplotlib sasoptpy sas-esppy NeuralProphet scipy rpy2 Flask XGBoost TensorFlow pybase64 scikit-learn statsmodels sympy mlxtend Skl2onnx nbeats-pytorch ESRNN onnxruntime opencv-python zipfile38 json2 pyenchant nltk spacy gensim"
  - op: replace # Used to verify the Python source download
    path: /data/default_py.python_signer
    value: https://keybase.io/ambv/pgp_keys.asc
  - op: replace # Used to verify the Python source download
    path: /data/default_py.python_signature
    value: https://www.python.org/ftp/python/3.8.13/Python-3.8.13.tgz.asc
  - op: replace # Python tarball to install
    path: /data/default_py.python_tarball
    value: https://www.python.org/ftp/python/3.8.13/Python-3.8.13.tgz
In the same $deploy/site-config/sas-pyconfig/change-configuration.yaml file you will find another set of options to configure the R install. As with Python, R can be configured with multiple profiles, each of which would require a separate set of configuration options.
  - op: replace # Space delimited list of R profiles to create
    path: /data/global.r_profiles
    value: "default_r"
  - op: replace # R build config options
    path: /data/default_r.configure_opts
    value: "--enable-memory-profiling --enable-R-shlib --enable-BLAS-shlib --with-blas --with-lapack --with-readline=no --with-x=no"
  - op: replace # R build flags
    path: /data/default_r.cflags
    value: "-fPIC"
  - op: replace # R tarball to install
    path: /data/default_r.r_tarball
    value: https://cran.r-project.org/src/base/R-4/R-4.2.0.tar.gz
  - op: replace # Packages that will be installed for R
    path: /data/default_r.packages
    value: "dplyr jsonlite httr tidyverse randomForest xgboost forecast"
Now that I have the Python and R installs configured, I need to edit $deploy/site-config/sas-pyconfig/change-limits.yaml to provide the sas-pyconfig job with enough resources to carry out the builds.
The default resource settings in the example manifest request 4 CPU cores and 3000Mi of memory without any upper limits. Unfortunately, those requests are out of range for the size of my research system nodes. For my deployment, I specified the resource requests and limits shown below, which allowed the sas-pyconfig job to be scheduled by Kubernetes and to complete successfully.
You may need to experiment with the requests and limits values to find the best fit for your deployment. The sas-pyconfig job will fail if you do not set CPU limits to at least 4 CPU cores (4000m). Reducing the resources for the sas-pyconfig job will, of course, affect the time it takes to complete. Because the sas-pyconfig job is typically an infrequent expense, you have some flexibility here, and you may be able to afford a longer-running job if your deployment is short on resources.
The duration of the sas-pyconfig job is heavily dependent on the resources you provide to the job pod, whether you build both languages or only one, and the number of additional packages you install.
---
apiVersion: builtin
kind: PatchTransformer
metadata:
  name: sas-pyconfig-limits
patch: |-
  - op: replace
    path: /spec/jobTemplate/spec/template/spec/containers/0/resources/requests/cpu
    value:
      500m
  - op: replace
    path: /spec/jobTemplate/spec/template/spec/containers/0/resources/requests/memory
    value:
      1000Mi
  - op: replace
    path: /spec/jobTemplate/spec/template/spec/containers/0/resources/limits/cpu
    value:
      4000m
  - op: replace
    path: /spec/jobTemplate/spec/template/spec/containers/0/resources/limits/memory
    value:
      3000Mi
target:
  group: batch
  kind: CronJob
  name: sas-pyconfig
  version: v1
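Once the deployment has been rebuilt and applied (covered later in this post), one way to confirm that the limits patch took effect is to read the resources back from the CronJob. The following is a minimal sketch, assuming the CronJob is named sas-pyconfig as in the patch target above. The function name and the namespace argument are hypothetical, and the JSONPath simply mirrors the path used in change-limits.yaml.

```shell
# Print the resource requests and limits currently set on the sas-pyconfig CronJob.
# Assumption: the CronJob is named "sas-pyconfig", as in the patch target.
# Usage: show_pyconfig_resources <namespace>   (function name is illustrative)
show_pyconfig_resources() {
  local namespace="$1"
  kubectl get cronjob sas-pyconfig -n "$namespace" \
    -o jsonpath='{.spec.jobTemplate.spec.template.spec.containers[0].resources}'
}
```

Calling the function with the namespace of your SAS Viya deployment should print the requests and limits that Kubernetes currently holds for the job template, which you can compare with the values in change-limits.yaml.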
The final step to enable the sas-pyconfig job is to add both change-configuration.yaml and change-limits.yaml to the transformers field of the base kustomization.yaml.
transformers:
...
- site-config/sas-pyconfig/change-configuration.yaml
- site-config/sas-pyconfig/change-limits.yaml
...
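Before rebuilding, it can be worth checking that both entries actually made it into kustomization.yaml. The sketch below is a simple text check with a hypothetical function name. It only confirms that each path appears as a list item somewhere in the file (written with a single space after the dash), not that it sits under the transformers field.

```shell
# Check that both sas-pyconfig transformer files are referenced in a kustomization.yaml.
# Prints each missing entry and returns non-zero if any entry is absent.
# Usage: check_pyconfig_transformers <path-to-kustomization.yaml>
check_pyconfig_transformers() {
  local kustomization="$1"
  local missing=0
  local entry
  for entry in \
    site-config/sas-pyconfig/change-configuration.yaml \
    site-config/sas-pyconfig/change-limits.yaml
  do
    # -F treats the pattern as a fixed string; -- protects the leading dash.
    if ! grep -qF -- "- ${entry}" "$kustomization"; then
      echo "missing: ${entry}"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, running the function against $deploy/kustomization.yaml prints nothing and returns 0 when both entries are present.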
Because you have made changes to kustomization.yaml, you will need to rebuild your SAS Viya deployment and apply the changes to your cluster. You will need to follow the process necessary for your particular situation depending on the deployment method you employ.
See Modify Existing Customizations in a Deployment for guidance on this task.
If you deployed using the viya4-deployment GitHub project, you should consult the project documentation for guidance.
If your deployment uses the Deployment Operator, the sas-pyconfig job will execute automatically with your configured changes.
If your deployment is manually managed, you will need to execute the job yourself by running a command similar to this after applying the updated configuration to your cluster.
kubectl create job sas-pyconfig-adhoc -n <namespace> --from cronjob/sas-pyconfig
When you apply the updates to your SAS Viya deployment the sas-pyconfig job will execute. It usually takes several minutes for the job to start but using OpenLens or kubectl, you can monitor the sas-pyconfig job pod. There will probably be an existing sas-pyconfig job pod so you will likely see the old job pod terminate. This will be immediately followed by the new job pod starting, running, and eventually succeeding.
$ kubectl get pod --selector app.kubernetes.io/name=sas-pyconfig --watch
NAME                           READY   STATUS              RESTARTS   AGE
sas-pyconfig-cjinitial-45r8h   0/1     Completed           0          108m
sas-pyconfig-cjinitial-45r8h   0/1     Terminating         0          111m
sas-pyconfig-cjinitial-l8xfz   0/1     Pending             0          0s
sas-pyconfig-cjinitial-l8xfz   0/1     ContainerCreating   0          0s
sas-pyconfig-cjinitial-l8xfz   1/1     Running             0          2s
sas-pyconfig-cjinitial-l8xfz   0/1     Completed           0          70m
You probably noticed that the job pod names include 'cjinitial'. The SAS Configurator for Open Source actually creates an unscheduled sas-pyconfig cronjob that is used as a template for actual executions of the work.
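While the job pod is running, its build output can also be followed from the command line. This is a minimal sketch, assuming the job pods carry the same app.kubernetes.io/name=sas-pyconfig label used in the kubectl get pod command above. The function name and the namespace argument are hypothetical.

```shell
# Stream the logs of the sas-pyconfig job pod(s), selected by label.
# --tail=-1 asks for the full log rather than only the most recent lines.
# Usage: follow_pyconfig_logs <namespace>   (function name is illustrative)
follow_pyconfig_logs() {
  local namespace="$1"
  kubectl logs --namespace "$namespace" \
    --selector app.kubernetes.io/name=sas-pyconfig \
    --follow --tail=-1
}
```

Because the selector may match the old, completed pod as well as the new one, the output can include log lines from both.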
I captured the CPU and Memory utilization from this run of the job and you can see the high-water marks for CPU and Memory that justify the values from step #3 above. The first hump is for the build of Python, the second is for R. Neither resource is maxed out for the duration of the job but you must specify limits high enough to account for the peak usage.
If we now look at the contents of the sas-pyconfig PVC, we will see that the following items have been created by the sas-pyconfig job.
$ ls -al
drwxrwxrwx 4 root root 113 Oct 27 21:48 .
drwxr-xr-x 31 nfsnobody nfsnobody 4096 Oct 27 20:41 ..
lrwxrwxrwx 1 sas sas 56 Oct 27 21:02 default_py -> /opt/sas/viya/home/sas-pyconfig/Python-3.8.13.1666917490
lrwxrwxrwx 1 sas sas 50 Oct 27 21:48 default_r -> /opt/sas/viya/home/sas-pyconfig/R-4.2.0.1666917490
-rwxr-xr-x 1 sas sas 32 Oct 27 21:48 md5sum
drwxr-xr-x 8 sas sas 83 Oct 27 20:50 Python-3.8.13.1666917490
drwxr-xr-x 5 sas sas 43 Oct 27 21:11 R-4.2.0.1666917490
The number 1666917490 at the end of the Python and R directories is a datetime value in Unix epoch format for when the sas-pyconfig job ran. This ensures that future updates have a unique directory name so the symlink can be updated properly.
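To read that suffix as a date, the epoch value can be converted on the command line. A small sketch, assuming GNU date (for the -d @epoch form); the function name is hypothetical.

```shell
# Extract the epoch suffix from a build directory name and print it as a UTC timestamp.
# Requires GNU date, which accepts "-d @<seconds-since-epoch>".
build_time_utc() {
  local dir_name="$1"
  local epoch="${dir_name##*.}"   # text after the last dot, e.g. 1666917490
  date -u -d "@${epoch}" '+%Y-%m-%dT%H:%M:%SZ'
}

build_time_utc "Python-3.8.13.1666917490"   # prints 2022-10-28T00:38:10Z
```

The same call works for the R directory, since both builds share the timestamp of the job run.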
...change the version of Python or R or add additional packages to either language?
You simply need to repeat steps #2 (edit change-configuration.yaml) and #4 (rebuild and apply). Using the md5sum file, the sas-pyconfig job will detect that you have modified something in your Python or R configurations and it will carry out the new builds.
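The job's internal logic is not shown in this post, so the following is only an illustration of the general checksum-comparison idea, not the actual sas-pyconfig implementation. The function name and both file arguments are hypothetical.

```shell
# Illustrative only: decide whether a rebuild is needed by comparing the MD5 of the
# current configuration with a previously stored checksum.
# Returns 0 (rebuild needed) when the checksums differ or no stored checksum exists,
# and 1 when the configuration is unchanged.
config_changed() {
  local config_file="$1"
  local stored_sum_file="$2"
  local current_sum
  current_sum="$(md5sum < "$config_file" | cut -d' ' -f1)"
  if [ -f "$stored_sum_file" ] && [ "$(cat "$stored_sum_file")" = "$current_sum" ]; then
    return 1   # unchanged: skip the build
  fi
  printf '%s' "$current_sum" > "$stored_sum_file"   # remember the new state (32 hex characters)
  return 0
}
```

A stored checksum written this way is exactly 32 bytes, the same size as the md5sum file in the PVC listing above, although that match says nothing definite about how the real job computes or uses its checksum.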
...only install Python, or only R?
In step #1 you can set global.r_enabled = "false" to prevent R from building or set global.python_enabled = "false" to prevent Python from building.
A second option is to edit the sas-pyconfig-parameters configmap to set the values to "false". Keep in mind that this approach is temporary and will be overwritten with the values from change-configuration.yaml the next time you rebuild your deployment and apply it to the cluster.
...prevent the sas-pyconfig job from accidentally updating my Python or R builds?
You can repeat step #1 and set global.enabled = "false", then rebuild and apply. You can also edit the sas-pyconfig-parameters configmap to effect the same change, but your edit will be reverted when you next rebuild your deployment and apply it to the cluster.
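To see which values the cluster currently holds, a single key can be read back from the configmap. This is a minimal sketch, assuming the configmap is reachable under the exact name sas-pyconfig-parameters in your namespace (if your deployment adds a generated suffix to the name, adjust it accordingly). The function name and arguments are hypothetical.

```shell
# Read one key from the sas-pyconfig-parameters configmap.
# Dots inside the key name must be escaped with a backslash in kubectl JSONPath.
# Usage: get_pyconfig_setting <namespace> <key>   e.g. get_pyconfig_setting <namespace> global.enabled
get_pyconfig_setting() {
  local namespace="$1"
  local key="$2"
  local escaped_key
  escaped_key="$(printf '%s' "$key" | sed 's/\./\\./g')"   # global.enabled -> global\.enabled
  kubectl get configmap sas-pyconfig-parameters -n "$namespace" \
    -o jsonpath="{.data.${escaped_key}}"
}
```

Reading global.enabled, global.python_enabled, and global.r_enabled this way shows whether a manual configmap edit is still in place or has been overwritten by a rebuild.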
The SAS Configurator for Open Source utility provides SAS Viya administrators with an easy way to manage builds of Python and R for integration with SAS Viya. Admittedly, this is only part of the overall story for configuring Python and R with SAS Viya. Subsequent posts will describe the process for making Python and R available to SAS Viya users.
Find more articles from SAS Global Enablement and Learning here.
Hey Scott,
Really helpful article. Recently, I installed the Python packages using this method and I can access Python modules.
FYI..
However, in my case I don't see below "default_py" soft-link.
"default_py -> /opt/sas/viya/home/sas-pyconfig/Python-3.8.13.1666917490"
So every time I run the "sas-pyconfig-adhoc" job I have to update PATH (/opt/sas/viya/home/sas-pyconfig/Python-3.8.13.1666917490/bin/python3) across all other files as this profile PATH (/opt/sas/viya/home/sas-pyconfig/default_py/bin/python3) is not available.
Also, I have noticed that it creates "saspyconfigvol" volume instead of "python-volume".
I'm using 2023.02 so not sure if something has changed in recent version.
Hey Scott,
I'm facing the same exact issue Mayankp has, I'm on stable 2023.03
I noticed 3 strange things:
I will check with the TCS
Maurizio
@mauriziopinzi For me, the Prophet and ESRNN Python packages failed to install. After removing them, the Python packages listed below installed successfully. However, Python does not work from SAS Studio because it fails to create the Python subprocess.
So, I'm still trying to figure it out.
- op: replace
path: /data/default_py.pip_install_nobinary
value: "Prophet sas_kernel"
- op: replace
path: /data/default_py.pip_install_packages
value: "pystan matplotlib sasoptpy sas-esppy NeuralProphet scipy rpy2 Flask XGBoost TensorFlow pybase64 scikit-learn statsmodels sympy mlxtend Skl2onnx nbeats-pytorch onnxruntime opencv-python zipfile38 json2 pyenchant nltk spacy gensim pandas
pandasql pysqlite3 numpy saspy torch pyreadstat pyarrow pyspark plotly scipy ramp-workflow"
@Mayankp what are you using as the RWX storage class? If it is Azure Files, I think your problem could be related to throughput; there should be a timeout parameter for Python.
It took 5 hours to configure python and I ended up with this error running python code
tcpSelectSelect returned an error in the tkpy extension in connect
which I solved by increasing the timeout, something like this:
proc python TIMEOUT=300;
submit;
var1 = "'python'"
var2 = 2
SAS.submit("data work.test; x={}; y={}; run;".format(var1,var2))
var3 = SAS.sasfnc("sha256hex","abc")
print("var3 = " + var3)
endsubmit;
run;
@mauriziopinzi I do use Azure Files with an RWX storage class. After adding TIMEOUT=300, Python started working. Is there any Python TIMEOUT configuration in Viya that can be applied at the system level rather than at the session level? Thanks.
@Mayankp I'm sorry I don't know if it is possible
hi
in our case, we added the
proc python TIMEOUT=3000; run;
in the autoexec context
Then it works for SAS proc python and also for .py programs (because you need it there too).
Hello
Wondering if the SAS Configurator for open source is a separate package
OR
something built into the deployment and only configuration changes as specified in the presentation need to be done ?
Hi, the SAS Configurator for Open Source is included in the SAS Viya deployment assets and is not available as a stand-alone download. To have SAS Viya build your Python and/or R installations you will need to perform the steps as described.
After this blog was published, the SAS Viya documentation was updated to include more information about integrating SAS Viya with external languages which you may find helpful.
Thank you for this wonderful and very helpful blog.
My python install is failing due to resource limits. Looks like this is not being read.
I can manually verify if the changes in $deploy/site-config/sas-pyconfig/change-configuration.yaml have been applied to Viya4 by looking at config maps. I see I can edit them too.
I have a question about change-limits.yaml.
Is there a way to use the kubectl utility to verify whether the changes in change-limits.yaml have been applied or not?