
Creating Multiple Python and R Profiles using the SAS Configurator for Open Source

Started ‎09-07-2023 by
Modified ‎09-08-2023 by

In earlier articles on integrating the SAS Viya platform with Python and R, I mentioned that it was possible to define additional profiles that give programmers the flexibility to use different versions or configurations of open-source software. Based on questions that I have had recently, there is interest in knowing exactly how to define additional profiles, especially after people watch Sundaresh Sankaran's Python à la carte: Configure repeatable & reusable Python environments video on YouTube.

 

The technique Sundaresh illustrates in his video assumes that the SAS Viya administrator has provided multiple Python profiles that can then be associated with different compute contexts to provide unique programming experiences.  This article will walk you through the steps needed to define those multiple Python and R profiles using the SAS Configurator for Open Source.

 

The setup

I am using a SAS Viya platform deployment of Stable 2023.08 so if you use an earlier release, you may see minor differences along the way.

 

By following the steps described in Using the SAS Configurator for Open Source to Build Python and R, my initial setup has a default_py profile that uses Python 3.9.16 and a default_r profile that uses R 4.2.3.

 

Let's suppose that my user community has asked me to create Python and R profiles that use the previous releases, Python 3.8.15 and R 4.2.2 respectively, so they can compare the results of existing programs.

 

The before state

Just so we can put the later changes in context, this is the state of my sas-pyconfig persistent volume before adding additional profiles.  Notice that we have a default_py symlink to the Python 3.9.16 build and a default_r symlink to the R 4.2.3 build.  The symlinks are created to match the names of the defined Python and R profiles.

 


 default_py -> /opt/sas/viya/home/sas-pyconfig/Python-3.9.16.1693336494
 default_r -> /opt/sas/viya/home/sas-pyconfig/R-4.2.3.1693336494
 extlang.xml
 md5sum
 Python-3.9.16.1693336494
 R-4.2.3.1693336494

 

Declare the new profile names

Assuming you followed the steps in Using the SAS Configurator for Open Source to Build Python and R to get started, the only file you need to edit to create additional profiles is $deploy/site-config/sas-pyconfig/change-configuration.yaml.  This file defines a PatchTransformer that patches the ConfigMap used by the sas-pyconfig job, so the order of the defined elements does not matter.

 

To make things easier to explain and maintain, I am going to rearrange some of the elements to keep each profile definition separate and explain each section in turn.  In actuality, the following sections are all parts of the one change-configuration.yaml file.

 

At the top of change-configuration.yaml I have the global settings that

  • enable the sas-pyconfig job
  • build both Python and R
  • and most importantly, declare the names of the Python and R profiles I am defining.

In addition to default_py and default_r, the new previous_py profile will be used with the older Python release and the new previous_r profile used with the older R release.

 


apiVersion: builtin
kind: PatchTransformer
metadata:
  name: sas-pyconfig-custom-parameters
patch: |-
  - op: replace 
    path: /data/global.enabled
    value: "true"
  - op: replace 
    path: /data/global.python_enabled
    value: "true"
  - op: replace 
    path: /data/global.r_enabled
    value: "true"
  - op: replace
    path: /data/global.pvc
    value: "/opt/sas/viya/home/sas-pyconfig"
  - op: replace
    path: /data/global.python_profiles
    value: "default_py previous_py"  # Python profile names to be defined
  - op: replace
    path: /data/global.r_profiles
    value: "default_r previous_r"    # R profile names to be defined
	

 

Define the default_py profile

Continuing in change-configuration.yaml, I next have the configuration that defines the default_py profile.  These are just the default values from my 2023.08 deployment.

 

  # default_py profile section 
  - op: replace
    path: /data/default_py.configure_opts
    value: "--enable-optimizations"
  - op: replace
    path: /data/default_py.pip_install_opts
    value: "--force-reinstall"
  - op: replace
    path: /data/default_py.cflags
    value: "-fPIC"
  - op: replace
    path: /data/default_py.pip_install_packages
    value: "Prophet sas_kernel matplotlib sasoptpy sas-esppy NeuralProphet scipy Flask XGBoost TensorFlow pybase64 scikit-learn statsmodels sympy mlxtend Skl2onnx nbeats-pytorch ESRNN onnxruntime opencv-python zipfile38 json2 pyenchant nltk spacy gensim pyarrow hnswlib==0.7.0 sas-ipc-queue great-expectations==0.16.8"
  - op: replace
    path: /data/default_py.pip_r_profile
    value: "default_r"   # profile of R to use with rpy2 package
  - op: replace
    path: /data/default_py.pip_r_packages
    value: "rpy2==3.5.12"
  - op: replace
    path: /data/default_py.python_signer
    value: https://keybase.io/ambv/pgp_keys.asc
  - op: replace
    path: /data/default_py.python_signature
    value: https://www.python.org/ftp/python/3.9.16/Python-3.9.16.tgz.asc
  - op: replace
    path: /data/default_py.python_tarball
    value: https://www.python.org/ftp/python/3.9.16/Python-3.9.16.tgz

 

Define the new previous_py profile

I want the new previous_py profile to be exactly like the default_py profile except I want it to use Python 3.8.15.  To create the new previous_py profile, I just repeated all of the default_py profile elements being sure to change all instances of default_py to previous_py, and modified the python_signature and python_tarball values.

 

Because I am including the rpy2 package, I want the older version of Python to use the older version of R so that my users can truly replicate their earlier experience.  Note in particular that pip_r_profile is set to the previous_r profile that we are adding.

 

  # previous_py profile section 
  - op: replace
    path: /data/previous_py.configure_opts
    value: "--enable-optimizations"
  - op: replace
    path: /data/previous_py.pip_install_opts
    value: "--force-reinstall"
  - op: replace
    path: /data/previous_py.cflags
    value: "-fPIC"
  - op: replace
    path: /data/previous_py.pip_install_packages
    value: "Prophet sas_kernel matplotlib sasoptpy sas-esppy NeuralProphet scipy Flask XGBoost TensorFlow pybase64 scikit-learn statsmodels sympy mlxtend Skl2onnx nbeats-pytorch ESRNN onnxruntime opencv-python zipfile38 json2 pyenchant nltk spacy gensim pyarrow hnswlib==0.7.0 sas-ipc-queue great-expectations==0.16.8"
  - op: replace
    path: /data/previous_py.pip_r_profile
    value: "previous_r"   # profile of R to use with rpy2 package
  - op: replace
    path: /data/previous_py.pip_r_packages
    value: "rpy2==3.5.12"
  - op: replace
    path: /data/previous_py.python_signer
    value: https://keybase.io/ambv/pgp_keys.asc
  - op: replace
    path: /data/previous_py.python_signature
    value: https://www.python.org/ftp/python/3.8.15/Python-3.8.15.tgz.asc
  - op: replace
    path: /data/previous_py.python_tarball
    value: https://www.python.org/ftp/python/3.8.15/Python-3.8.15.tgz
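Copying a profile by hand makes it easy to miss one of the default_py references.  As a minimal sketch of the same copy-and-rename idea, the hypothetical helper below (clone_profile is my own name, not part of the SAS Configurator for Open Source) works on patch operations represented as plain Python dictionaries, with only a few of the settings shown for brevity.

```python
from copy import deepcopy

def clone_profile(ops, source, target, overrides=None):
    """Copy every patch operation for `source` and retarget it to `target`.

    `overrides` maps a setting name (the part after the profile prefix)
    to the value the new profile should use instead.
    """
    overrides = overrides or {}
    prefix = f"/data/{source}."
    cloned = []
    for op in ops:
        if not op["path"].startswith(prefix):
            continue  # belongs to another profile or to global.*
        setting = op["path"][len(prefix):]
        new_op = deepcopy(op)
        new_op["path"] = f"/data/{target}.{setting}"
        if setting in overrides:
            new_op["value"] = overrides[setting]
        cloned.append(new_op)
    return cloned

# A shortened default_py section (illustration only, not the full list).
default_py = [
    {"op": "replace", "path": "/data/default_py.cflags", "value": "-fPIC"},
    {"op": "replace", "path": "/data/default_py.pip_r_profile", "value": "default_r"},
    {"op": "replace", "path": "/data/default_py.python_tarball",
     "value": "https://www.python.org/ftp/python/3.9.16/Python-3.9.16.tgz"},
]

previous_py = clone_profile(
    default_py, "default_py", "previous_py",
    overrides={
        "pip_r_profile": "previous_r",
        "python_tarball": "https://www.python.org/ftp/python/3.8.15/Python-3.8.15.tgz",
    },
)
for op in previous_py:
    print(op["path"], "=", op["value"])
```

The original list is left untouched, so the same call can be repeated for further profiles.  In a real file you would also override python_signature.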

 

Define the default_r profile

Next, we have the values that define the default_r profile.  These are unchanged from the initial configuration.

 

  # default_r profile section
  - op: replace
    path: /data/default_r.cflags
    value: "-fPIC"
  - op: replace
    path: /data/default_r.configure_opts
    value: "--enable-memory-profiling --enable-R-shlib --with-blas --with-lapack --with-readline=no --with-x=no --enable-BLAS-shlib"
  - op: replace
    path: /data/default_r.r_tarball
    value: https://cloud.r-project.org/src/base/R-4/R-4.2.3.tar.gz
  - op: replace
    path: /data/default_r.packages
    value: "dplyr jsonlite httr tidyverse randomForest xgboost forecast"

 

Define the new previous_r profile

Again, I modeled the new previous_r profile off of the default_r profile, changing only the path values to reference the previous_r profile and the version of R that it builds.

 

  # previous_r profile section
  - op: replace
    path: /data/previous_r.cflags
    value: "-fPIC"
  - op: replace
    path: /data/previous_r.configure_opts
    value: "--enable-memory-profiling --enable-R-shlib --with-blas --with-lapack --with-readline=no --with-x=no --enable-BLAS-shlib"
  - op: replace
    path: /data/previous_r.r_tarball
    value: https://cloud.r-project.org/src/base/R-4/R-4.2.2.tar.gz
  - op: replace
    path: /data/previous_r.packages
    value: "dplyr jsonlite httr tidyverse randomForest xgboost forecast"

 

Just for completeness, the rest of change-configuration.yaml has only the target definition.

 

target:
  version: v1
  kind: ConfigMap
  name: sas-pyconfig-parameters
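With four profiles in one file, a typo in a profile name is easy to make.  The sketch below is one possible sanity check, not something the sas-pyconfig job provides: it scans the patch text for path/value pairs and lists any name declared in global.python_profiles or global.r_profiles that has no settings of its own.  The function names are hypothetical, the regular expression assumes the simple two-line path/value layout used in this article, and I am assuming that a declared profile with no matching section is unintended.

```python
import re

# Matches a "path: /data/<key>" line followed by its "value:" line.
PAIR = re.compile(r'path:\s*/data/(\S+)\s+value:\s*(?:"([^"]*)"|(\S+))')

def read_settings(patch_text):
    """Return {key: value} for every /data/<key> entry in the patch text."""
    return {key: quoted or bare for key, quoted, bare in PAIR.findall(patch_text)}

def undefined_profiles(settings):
    """List declared profile names that have no '<name>.*' settings."""
    declared = (settings.get("global.python_profiles", "").split()
                + settings.get("global.r_profiles", "").split())
    defined = {key.split(".", 1)[0] for key in settings}
    return [name for name in declared if name not in defined]

# Shortened sample in which the previous_py section was forgotten.
sample = '''
  - op: replace
    path: /data/global.python_profiles
    value: "default_py previous_py"  # Python profile names to be defined
  - op: replace
    path: /data/global.r_profiles
    value: "default_r previous_r"
  - op: replace
    path: /data/default_py.cflags
    value: "-fPIC"
  - op: replace
    path: /data/default_r.cflags
    value: "-fPIC"
  - op: replace
    path: /data/previous_r.cflags
    value: "-fPIC"
'''
print(undefined_profiles(read_settings(sample)))  # prints ['previous_py']
```

To check a real file, read change-configuration.yaml into a string and pass it to read_settings in place of the sample.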

 

That completes the changes.  You could, of course, modify the package lists or make other changes to the profiles but for now, let's move on.

 

Apply the change

Save your changes.  With the configuration complete, you now need to rebuild the SAS deployment manifest and apply the changes to the cluster.  How you do this depends on the deployment method used for your SAS Viya deployment.

 

See Modify Existing Customizations in a Deployment for guidance on this task.

 

If you deployed using the viya4-deployment GitHub project, you should consult the project documentation for guidance.

 

Execute the sas-pyconfig job

If your deployment uses the Deployment Operator or the sas-orchestration deploy method, the sas-pyconfig job will execute automatically and rebuild Python and R with your configured changes.

 

If your deployment is manually managed, you will need to execute the job yourself by running a command similar to this after applying the updated configuration to your cluster.

 

kubectl create job sas-pyconfig-adhoc --namespace your-namespace --from cronjob/sas-pyconfig
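Kubernetes job names must be unique within a namespace, so rerunning the command above fails while an earlier sas-pyconfig-adhoc job still exists.  If you script this step, one option is to add a suffix to the job name.  The helper below is a hypothetical sketch that only builds the command line; the suffix scheme is my own assumption, and nothing is executed.

```python
import time

def adhoc_job_command(namespace, cronjob="sas-pyconfig", suffix=None):
    """Build the kubectl argv that creates a one-off job from the cron job."""
    # Default to the current epoch seconds so repeated runs get distinct names.
    suffix = suffix if suffix is not None else str(int(time.time()))
    job_name = f"{cronjob}-adhoc-{suffix}"
    return ["kubectl", "create", "job", job_name,
            "--namespace", namespace,
            f"--from=cronjob/{cronjob}"]

print(" ".join(adhoc_job_command("your-namespace", suffix="1")))
```

The returned list can be passed to subprocess.run(cmd, check=True) on a machine where kubectl is configured for your cluster.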

 

The after state

Let's take a look at the state of the sas-pyconfig volume at the completion of the sas-pyconfig job.

 

Notice that we now have two Python builds and two R builds, one for each release, and new symlinks whose names match our new profiles.

 

 default_py -> /opt/sas/viya/home/sas-pyconfig/Python-3.9.16.1693436460
 default_r -> /opt/sas/viya/home/sas-pyconfig/R-4.2.3.1693436460
 extlang.xml
 md5sum
 previous_py -> /opt/sas/viya/home/sas-pyconfig/Python-3.8.15.1693436460
 previous_r -> /opt/sas/viya/home/sas-pyconfig/R-4.2.2.1693436460
 Python-3.8.15.1693436460
 Python-3.9.16.1693436460
 R-4.2.2.1693436460
 R-4.2.3.1693436460
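If you want to confirm the profile-to-build mapping programmatically rather than by reading a directory listing, a short sketch such as the following can help.  It assumes you run it somewhere the sas-pyconfig volume is mounted; profile_builds is a hypothetical helper name, and the demonstration uses a temporary directory that mimics part of the listing above.

```python
import os
import tempfile
from pathlib import Path

def profile_builds(root):
    """Map each symlink in `root` to the name of the build directory it points to."""
    builds = {}
    for entry in sorted(Path(root).iterdir()):
        if entry.is_symlink():
            builds[entry.name] = Path(os.readlink(entry)).name
    return builds

# Demonstration against a throwaway directory, not a real volume.
with tempfile.TemporaryDirectory() as root:
    for build, profile in [("Python-3.9.16.1693436460", "default_py"),
                           ("Python-3.8.15.1693436460", "previous_py")]:
        os.mkdir(os.path.join(root, build))
        os.symlink(os.path.join(root, build), os.path.join(root, profile))
    demo = profile_builds(root)

print(demo)
```

On a real deployment you would call profile_builds("/opt/sas/viya/home/sas-pyconfig") instead.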

 

Validate Python

Let's make sure the programmers can access both releases.  This little SAS program can be run from SAS Studio and should report that the first PROC PYTHON step uses the default_py profile of Python 3.9.16 while the second step uses the new previous_py profile of Python 3.8.15.  The ERROR message is expected and can be ignored.

 

/* to use default_py */
options set=PROC_PYPATH='/opt/sas/viya/home/sas-pyconfig/default_py/bin/python3';
options set=SAS_EXT_LLP_PYTHON='/opt/sas/viya/home/sas-pyconfig/default_py/lib/python3.9/lib-dynload';

proc python;
submit;
import sys
print(sys.version)
exit(0)
endsubmit;
run;
quit;

/* to use previous_py */
options set=PROC_PYPATH='/opt/sas/viya/home/sas-pyconfig/previous_py/bin/python3';
options set=SAS_EXT_LLP_PYTHON='/opt/sas/viya/home/sas-pyconfig/previous_py/lib/python3.8/lib-dynload';

proc python;
submit;
import sys
print(sys.version)
exit(0)
endsubmit;
run;
quit;

 

You should see results such as this.

 

[Screenshot: SAS Studio log]

 

One special note about this testing approach: the compute server will reuse the first Python subprocess it starts unless you explicitly terminate it by calling exit().  Without that call in the first PROC PYTHON step, both steps will report the same version of Python.
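Note that the SAS_EXT_LLP_PYTHON path contains the Python major.minor version (python3.9 versus python3.8), so it has to change along with the profile name.  As a small sketch of how the two OPTIONS statements follow from a profile name and its Python version, here is a hypothetical helper that reproduces the paths used in the program above.

```python
def python_profile_options(profile, version, root="/opt/sas/viya/home/sas-pyconfig"):
    """Return the two OPTIONS statements that point PROC PYTHON at a profile."""
    major, minor = version.split(".")[:2]
    home = f"{root}/{profile}"
    return [
        f"options set=PROC_PYPATH='{home}/bin/python{major}';",
        f"options set=SAS_EXT_LLP_PYTHON='{home}/lib/python{major}.{minor}/lib-dynload';",
    ]

for line in python_profile_options("previous_py", "3.8.15"):
    print(line)
```

The output matches the two statements used for previous_py in the SAS program above.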

 

Validate R

Here is a similar program we can use to validate access to both R profiles.  Because the default_r profile is typically configured into the compute servers, you do not really need the first OPTIONS statement, but I am including it here, commented out, for symmetry.

 

/* use the default_r profile */
*options set=R_HOME='/opt/sas/viya/home/sas-pyconfig/default_r/lib64/R';

proc iml;
submit / R;
v <- R.Version() 
print(v)
endsubmit;
run;
quit;

/* use the previous_r profile */
options set=R_HOME='/opt/sas/viya/home/sas-pyconfig/previous_r/lib64/R';

proc iml;
submit / R;
v <- R.Version() 
print(v)
endsubmit;
run;
quit;

 

Output from the first step should include: $version.string [1] "R version 4.2.3 (2023-03-15)".

 

And output from the second step should include: $version.string [1] "R version 4.2.2 (2022-10-31)".
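If you capture the version string from each step, you can compare it against the r_tarball value of the corresponding profile instead of checking it by eye.  This is only a sketch with hypothetical function names; it assumes the tarball URL ends in R-<version>.tar.gz, as the two URLs in this article do.

```python
import re

def tarball_version(url):
    """Extract '4.2.2' from a URL ending in 'R-4.2.2.tar.gz'."""
    match = re.search(r"R-(\d+\.\d+\.\d+)\.tar\.gz$", url)
    if match is None:
        raise ValueError(f"no R version found in {url!r}")
    return match.group(1)

def reports_expected_version(version_string, tarball_url):
    """True when R's version.string names the release the profile was built from."""
    return version_string.startswith(f"R version {tarball_version(tarball_url)} ")

url = "https://cloud.r-project.org/src/base/R-4/R-4.2.2.tar.gz"
print(reports_expected_version("R version 4.2.2 (2022-10-31)", url))  # prints True
```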

 

Next steps

Now that you know how to create additional Python and R profiles, you can use the technique Sundaresh shows in his Python à la carte: Configure repeatable & reusable Python environments video to create separate Compute contexts for each profile to make it easier for programmers to select a given configuration.

 

You may also want to check out the SAS Studio Custom steps repository on SAS Software GitHub, which includes custom steps for working with Python and R.

 

 
