
A 'preflight check' macro for Python in a SAS session

Started 02-21-2024, modified 03-01-2024

I gave myself what I considered a simple task - to design a macro that checks whether Python is available to a given compute or batch session.  As things turned out, I learnt a bit about SAS system options and environment variables used inside SAS sessions. 

 

In a cloud-native world, most environments can be considered ephemeral.  Workloads may be intended for more than one predesignated compute environment, and those environments may be configured differently.

 

Preflight checks, run through SAS macros prior to the main execution code, are extremely useful in assessing an environment for necessary characteristics.

 

Also, I found this task useful because it allowed the program to fail gracefully if it couldn't find Python.  

 

Graceful failure, if accompanied by the right level of log messages, can help the developer quickly take remedial steps.  It also helps avoid 'cleanup' or rollback situations where part of a program has already run and datasets may have been modified or created.

 

 

[Image: Python Check Macro]

 

 

Access the Macro 

 

The macro can be accessed from a GitHub repository I maintain for various utility SAS programs.  I also use these programs in the underlying code of many SAS Studio Custom Steps (low-code SAS Studio components which promote ease of use, code reusability and automation).  Some of my custom steps contain proc python blocks, and in future you can expect to see this macro included in those steps.

 

  • Link to the Python check macro program:  click here
  • Link to the GitHub repository (of utility programs):  click here
  • Link to an example test code for the above macro: click here 

 

 

How does it work?

 

We'll not venture too deeply into the inner workings of the macro's code here (which, at the end of the day,  is pretty simple), but highlight key decision variables which help in determining access to Python.  These can be understood through the following questions:

 

Does the SAS session know where Python is located?

 

This is informed by an environment variable called PROC_PYPATH.  As Scott McCauley describes the process in his article on configuring Python in SAS Viya,  PROC_PYPATH is set when configuring SAS Viya to access open-source languages, and provides a path to a Python executable invoked whenever PROC PYTHON is run.
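
As a minimal sketch (not the macro's actual code), the value the session holds for PROC_PYPATH can be inspected as shown here.  The macro name show_pypath is hypothetical, and the SYSEXIST guard is an addition so that an undefined variable produces a plain note rather than a warning from %SYSGET.

```sas
/* Hypothetical helper: report what the session knows about PROC_PYPATH. */
%macro show_pypath;
   /* SYSEXIST returns 1 when the environment variable is defined, 0 otherwise. */
   %if %sysfunc(sysexist(PROC_PYPATH)) = 1 %then %do;
      %put NOTE: PROC_PYPATH is set to %sysget(PROC_PYPATH);
   %end;
   %else %do;
      %put NOTE: PROC_PYPATH is not defined in this session.;
   %end;
%mend show_pypath;

%show_pypath
```

A defined variable only shows where the session expects Python to be, not that anything exists at that location.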

 

 

Even if specified, does the Python executable really exist?

 

Environments can break, and it's possible that Python was never installed, was installed incorrectly, or was installed somewhere else.  The macro inspects the contents of the PROC_PYPATH variable to verify that the Python executable it names (e.g. python3) actually exists and is visible to the session.  Note that in batch sessions the path to Python has sometimes not been mounted as a volume, an error situation which this check can identify.
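
The existence check can be sketched in a short data step.  This is an illustration under assumptions rather than the macro itself: the variable name pypath and the wording of the messages are made up here, and FILEEXIST only confirms that the session can see something at that path, not that it is a working interpreter.

```sas
data _null_;
   length pypath $ 1024;

   /* Read PROC_PYPATH only when it is defined, so SYSGET does not log an invalid-argument note. */
   if sysexist('PROC_PYPATH') then pypath = sysget('PROC_PYPATH');

   if missing(pypath) then
      put 'NOTE: PROC_PYPATH is not set, so there is no file to check.';
   else if fileexist(strip(pypath)) then
      put 'NOTE: The session can see a file at ' pypath;
   else
      put 'WARNING: The session cannot see a file at ' pypath;
run;
```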

 

 

Is LOCKDOWN enabled or not?

 

LOCKDOWN is a security-centric state in SAS servers which disables certain operations and access methods to protect the system.  Access methods that reach external environments and languages such as Python are disabled by default and have to be enabled explicitly.  Certain environment variables control whether LOCKDOWN is enabled.  These include COMPUTESERVER_LOCKDOWN_ENABLE and BATCHSERVER_LOCKDOWN_ENABLE, which apply to compute and batch server sessions, respectively.  Note that when they are set to 0, LOCKDOWN is disabled for that SAS server!  This is not a desirable situation (even though it means that Python can run in that session) because it carries potential for compromise from a security perspective.
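
A rough sketch of how a session might report this switch is shown here.  The macro name show_lockdown_switch and its envvar parameter are hypothetical, and the sketch only distinguishes the value 0 described above from any other value.

```sas
/* Hypothetical helper: report the LOCKDOWN switch for a compute or batch session. */
%macro show_lockdown_switch(envvar=COMPUTESERVER_LOCKDOWN_ENABLE);
   %if %sysfunc(sysexist(&envvar)) = 0 %then %do;
      %put NOTE: &envvar is not defined in this session.;
   %end;
   %else %if "%sysget(&envvar)" = "0" %then %do;
      %put WARNING: &envvar is 0, so LOCKDOWN is disabled for this server.;
   %end;
   %else %do;
      %put NOTE: &envvar is set to %sysget(&envvar).;
   %end;
%mend show_lockdown_switch;

/* Compute server sessions */
%show_lockdown_switch()

/* Batch server sessions */
%show_lockdown_switch(envvar=BATCHSERVER_LOCKDOWN_ENABLE)
```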

 

 

Are methods required to run Python enabled?  

 

A final check is to ensure the following three access methods - PYTHON, PYTHON_EMBED and SOCKET - have been enabled.  This means they would form part of the values in an environment variable called VIYA_LOCKDOWN_USER_METHODS.   Although PYTHON_EMBED is specific to one way of running Python (using a submit block in PROC PYTHON), we include it as part of the check all the same.  You can edit PYTHON_EMBED  out if you don't want to perform this check.
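
One way to sketch this check is a word-based search, since a plain substring search for PYTHON would also match PYTHON_EMBED.  The data step below is an illustration, not the macro's code: it assumes the methods in VIYA_LOCKDOWN_USER_METHODS are separated by blanks or commas, and the variable names methods, method and not_listed are made up for the example.

```sas
data _null_;
   length methods $ 1024 method $ 32 not_listed $ 200;

   /* Read the list only when the variable is defined. */
   if sysexist('VIYA_LOCKDOWN_USER_METHODS') then
      methods = sysget('VIYA_LOCKDOWN_USER_METHODS');

   /* FINDW looks for whole words; the 'i' modifier ignores case.          */
   /* Assumption: entries are delimited by blanks or commas.               */
   do method = 'PYTHON', 'PYTHON_EMBED', 'SOCKET';
      if findw(methods, strip(method), ' ,', 'i') = 0 then
         not_listed = catx(' ', not_listed, method);
   end;

   if missing(not_listed) then
      put 'NOTE: PYTHON, PYTHON_EMBED and SOCKET are all listed.';
   else
      put 'WARNING: Not listed in VIYA_LOCKDOWN_USER_METHODS: ' not_listed;
run;
```

Dropping PYTHON_EMBED from the DO list gives the more lenient variant of the check.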

 

When all these checks pass, a set of macro variables, passed as arguments to the macro, are populated with values that indicate there is no error, and also a description that states that a path to Python is available in this compute session and that Python has been enabled.

 

Even with all these checks, the macro should not be considered foolproof.  One callout is when Python is accessed in the SAS Cloud Analytic Services (CAS) server through proc cas inside a compute server session.  Even though the compute server is used in this case, the Python in question refers to the environment available to CAS, and may be further governed by a SAS External Language Settings File (EXTLANG).  EXTLANG provides its own messages back to the calling program if a user without the necessary privileges attempts to run Python through a CAS action.

 

At the same time, it seems a safe assumption to say that remaining situations warranting further checks are rare and can be considered edge cases.  Do feel free to write in if you have additional suggestions and pointers which can improve this macro!

 

 

 

Specify the macro within your SAS programs

 

The simplest way to use this macro within a SAS program would be to copy and paste it directly into your SAS program and then call it.  However, as you may have noticed, the macro's code is pretty long (including comments :)).  Here's an alternative method which hopefully makes it easier to define the macro.  It uses the FILENAME statement to create a reference to the URL where this macro is located, and then "includes" it in the SAS program.  This inclusion causes the macro to be defined (but not yet executed) in your SAS session.

 

filename getsasf URL "https://raw.githubusercontent.com/SundareshSankaran/sas_utility_programs/main/code/Check_For_Python/macro_python_check.sas";
%include getsasf;
filename getsasf clear;

Of course, ensure you have a connection to the GitHub repository (you should, as long as your application's connected to the internet).  There's always copy-paste as your best friend should you find things difficult.

 

Where, within a SAS program, would you specify and call this macro?  While preferences and structures vary, I have found that it's useful to divide your code into "function code" and "execution code".  "Function code" is usually defined upfront and tends to consist of macros, any user-defined functions, or other modularised elements you would like to call in your "execution code".  You may like to define the macro within your "function code" and then call the macro (next section) in your execution code.  Of course, this is only a suggestion.  Use this wherever you like, as long as it works for you! :).

 

 

Call the macro

 

You typically call the macro at the start of your execution code.  First, define the following macro variables (you can name them whatever you like).

 

1. A macro variable for an error flag: Specify this variable as global so that it can be used downstream.  This macro variable represents a flag with a value of 0 indicating no errors from the check, and a value of 1 indicating some error.

2. A macro variable for an error message: Specify this as a global variable too.  This is meant to hold a description of the error (or the absence of an error) that may have occurred.

 

An example is shown below:

 

%global python_error_flag;
%global python_error_desc;

 

Next, call the macro.  You have a choice here.  The check depends on the type of SAS server - whether a compute server or a batch server - you happen to execute your workload from.   A compute server is the type of environment used when you open applications such as SAS Studio in the Viya platform.  A batch server is typically used for batch submissions made using the sas-viya command line interface (CLI) in batch mode.  Organisations may in some cases like to develop code using compute servers and then schedule them to run in batch.

 

If you neither know nor can control the target server,  the _env_check_python macro can be used.  It makes a determination about the server where the code runs and calls the relevant macro.

 

/* Note that the names of the error flag and error description macro variables are quoted when sent over as arguments - this is required.*/;

%_env_check_python("python_error_flag","python_error_desc");

An important note is that the names of the error flag and error description macro variables are quoted when provided as arguments.  The macro is designed to take their names as references to the variables that need to be used.  
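
Why the quotes matter can be seen in a small, self-contained sketch.  The names demo_flag and set_flag_by_name are hypothetical and stand in for the real macro: inside the data step, the parameter resolves to a quoted string, which is exactly the form CALL SYMPUTX expects for the name of the macro variable to populate.

```sas
%global demo_flag;

/* Hypothetical macro that writes to a macro variable whose name is passed in quotes. */
%macro set_flag_by_name(flagName);
   data _null_;
      /* &flagName. resolves to the literal "demo_flag", a valid character argument. */
      /* The 'G' argument asks for the global symbol table explicitly.               */
      call symputx(&flagName., 0, 'G');
   run;
%mend set_flag_by_name;

%set_flag_by_name("demo_flag");

%put NOTE: demo_flag is &demo_flag.;
```

Without the quotes, the data step would read demo_flag as a data step variable name rather than as a string holding the macro variable's name.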

 

If you do happen to know the type of target server, then either the _env_check_python_compute or the _env_check_python_batch macro can be called directly.  The syntax is the same.   For example, 

 

/* In case of a compute server */;

%_env_check_python_compute("python_error_flag","python_error_desc");


/* In case of a batch server */;

%_env_check_python_batch("python_error_flag","python_error_desc");

 

Here's an example reference to the macro variables and the result (for a successful identification of a path to Python).  Error descriptions differ according to the circumstances and the stage of the check at which the error was found.  A quick read of the macro will show you the different error messages.

 

 

[Screenshot: example reference to the macro variables and the result]
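
To act on the result, the flag can gate the rest of the program.  The sketch below is one possible approach, not part of the published macro: stop_if_python_missing is a hypothetical name, and it assumes the check was called with python_error_flag and python_error_desc as in the examples above.  %ABORT CANCEL cancels the remaining submitted statements, and in batch mode it ends the SAS program, so the main code never starts.

```sas
/* Hypothetical gate: stop the submission when the preflight check reported an error. */
%macro stop_if_python_missing;
   %if %symexist(python_error_flag) = 0 %then %do;
      %put ERROR: python_error_flag does not exist. Run the Python check macro first.;
      %abort cancel;
   %end;
   %else %if "&python_error_flag." ne "0" %then %do;
      /* SUPERQ prints the description without re-scanning any quotes it contains. */
      %put ERROR: Python preflight check failed. %superq(python_error_desc);
      %abort cancel;
   %end;
   %else %do;
      %put NOTE: Python preflight check passed. %superq(python_error_desc);
   %end;
%mend stop_if_python_missing;

%stop_if_python_missing
```

Placing this call immediately after the check and before any step that creates or modifies data keeps a failure from leaving partial results behind.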

 

 

Acknowledgements

 

Thanks go out to a number of people who helped me in learning more about the variables that define access to Python, either directly or as a sounding board.  Thanks especially to Wilbram Hazejager (@Wilbram-SAS) who identified potential improvements to an initial, lazy attempt and set me off on a path to find out what really goes on with all those options and environment variables.  Also many thanks to Quan Zhou, Bengt Pederson, Rob Collum, Edoardo Riva, Doug Haigh and others who helped me. 

Comments

@Sundaresh1  I wanted to try it out, so I put together the following, pls see below. But no output is generated, and we have Python in our environment. Am I not executing your code in the right way?

%global error_flag;
%global error_desc;

%macro _env_check_python_compute(errorFlagName, errorFlagDesc);

   data _null_;
      /* ----------------------------------------------------------------------------------------------* 
         Obtain system options and store them inside macro variables.
      *----------------------------------------------------------------------------------------------- */
      proc_pypath = sysget('PROC_PYPATH');
      viya_lockdown_user_methods = sysget('VIYA_LOCKDOWN_USER_METHODS');
      compute_enable = sysget('COMPUTESERVER_LOCKDOWN_ENABLE');
      does_file_at_pypath_exist=fileexist(proc_pypath);

      /* ----------------------------------------------------------------------------------------------* 
         Let's start from the end
         Check if PROC_PYPATH exists
      *----------------------------------------------------------------------------------------------- */

      if proc_pypath = "" then do;
         call symputx(&errorFlagName.,1);
         call symput(&errorFlagDesc., "PROC_PYPATH environment variable not populated, indicating that Python may not have been configured.");
      end;
      else do;
         /* -------------------------------------------------------------------------------------------* 
            Check if PROC_PYPATH points to a valid file
         *-------------------------------------------------------------------------------------------- */
         if does_file_at_pypath_exist = 0 then do;
            call symputx(&errorFlagName.,1);
            call symput(&errorFlagDesc., "The file referred by PROC_PYPATH does not exist, indicating path to Python may have been configured incorrectly.");             
         end;
         else do;
            /* -----------------------------------------------------------------------------------------* 
               Check if COMPUTESERVER_LOCKDOWN_ENABLE = 0, indicating a permissive (and potentially 
               insecure) environment.
            *------------------------------------------------------------------------------------------ */
            if compute_enable = '1' then do;
               /* --------------------------------------------------------------------------------------* 
                  Check if PYTHON and SOCKET appear in viya_lockdown_user_methods.
                  There's an additional PYTHON_EMBED option which is included as a strict check (enabling 
                  Python to run in a submit block).
               *--------------------------------------------------------------------------------------- */
               if index(lowcase(viya_lockdown_user_methods),"python") > 0 and index(lowcase(viya_lockdown_user_methods),"socket") > 0 and index(lowcase(viya_lockdown_user_methods),"python_embed") > 0 then do;
                  call symput("PROC_PYPATH", proc_pypath);
                  call symputx(&errorFlagName.,0);
                  call symput(&errorFlagDesc., "A path to Python is available in this compute session and Python use is part of Viya enabled methods.") ;
               end;
               else do;
                  call symputx(&errorFlagName.,1);
                  call symput(&errorFlagDesc., "Required access methods to run Python don't seem to form part of the user methods allowed in Viya. Please take steps to enable PYTHON, PYTHON_EMBED and SOCKET");             
               end;
            end;
            else do;
               call symput("PROC_PYPATH", proc_pypath);
               call symputx(&errorFlagName.,0);
               call symput(&errorFlagDesc., "A path to Python is available in this compute session and COMPUTESERVER_LOCKDOWN_ENABLE is disabled. While you can run Python, note that setting COMPUTESERVER_LOCKDOWN_ENABLE to 0 is not recommended.");
            end;
         end;
      end;
   run;
%mend _env_check_python_compute;

%_env_check_python_compute("error_flag","error_desc")

%put &error_flag.;
%put &error_desc.;

%put &PROC_PYPATH.;

Hi @touwen_k ,

 

Thank you for trying it out.  

I copied your code and got the following output.  At first, PROC_PYPATH didn't resolve since it wasn't declared as global.  I mentioned this as global in the docstring but didn't explicitly add a %global PROC_PYPATH statement, thank you for pointing it out.

 

Regardless of the above, the values of the two macro variables do show up in the log.  A suggestion would be to show them as notes, as below, in order to make them more prominent.

 

%put NOTE: The value of error_flag is &error_flag ;
%put NOTE:  &error_desc ;

If you do not observe anything in the log, I'd suggest 'breaking the macro' and running the statements in smaller blocks, starting with the top-level if block and then moving into the nested blocks.  Let me know if you need any help.  Sundaresh.sankaran@sas.com

 

Log when PROC_PYPATH was not declared as global:

 

[Screenshot of the log]

Log when PROC_PYPATH is declared as global:

[Screenshot of the log]
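
The scoping behaviour behind the two logs can be reproduced with a small sketch that needs neither Python nor the check macro.  The names scope_demo, dummy and scoped_var are hypothetical.  Because the macro has a parameter, its local symbol table is not empty, so CALL SYMPUTX without a scope argument should create the variable locally unless a global variable of that name already exists.

```sas
/* Start clean so the first call shows the undeclared case. */
%symdel scoped_var / nowarn;

%macro scope_demo(dummy);
   data _null_;
      call symputx('scoped_var', 'set inside the macro');
   run;
%mend scope_demo;

/* Case 1: no global declaration, so the value should not survive the macro call. */
%scope_demo(x)
%put NOTE: Without a global declaration, SYMEXIST returns %symexist(scoped_var);

/* Case 2: declare the variable as global first, then call the macro again. */
%global scoped_var;
%scope_demo(x)
%put NOTE: With a global declaration, scoped_var is &scoped_var.;
```

The first note should report 0 and the second should show the value assigned inside the macro, which mirrors why a %global PROC_PYPATH statement is needed before the check runs.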

 
