SAS Visual Data Mining and Machine Learning now lets you call SAS Cloud Analytic Services (CAS) actions from Python, Java, or Lua code! For some of us, this opens up a whole new world. This post provides a general introduction to text editors and Jupyter notebooks, for those who may be new to using these tools to organize and run programming code, such as Python code.
Programming code is often developed, stored, shared with others, and even executed via a text editor. There are many choices for text editors, such as Notepad++, Sublime Text 3, Eclipse, UltraEdit, Atom, and Kate. It appears that everybody has their personal favorite. Perhaps someday there will be an American Idol show for text editors.
Today’s text editors have features like syntax highlighting, code collapsing, floating tabs, autocompletion, minimap/document map, color coding, multicursors, default and customized hot keys, tab triggers, convenient document navigation, and the ability to execute code directly from the editor. Text editors are very useful for editing and storing large swaths of code.
Below are examples of Notepad++ and Sublime Text 3 with Python code to run a neural network in CAS. Notepad++ is a free text editor for Windows. Sublime Text 3 is available for Windows, OS X, and Linux; it can be evaluated for free, but continued use requires a license. The color scheme of a text editor can be easily changed, and the colors help to identify the role of each text string in the program. For example, see the screenshots below of Notepad++ and Sublime Text 3, each set to its respective Monokai color scheme:
To find the programming languages supported in Notepad++, go to Language and click on the first letter of the language you are interested in, such as P for Python, as shown below.
To find the languages supported in Sublime Text 3, go to View/Syntax.
You can run pure Python code directly from these text editors. However, we are not interested in running pure Python code on its own here. Remember, we want the Python code to call CAS actions, so that we can take advantage of the speed and parallelization that the CAS engine provides.
So why use a notebook?
Text editors let you see, well, text. Notebooks have live code, explanatory text, equations, hyperlinks, output, tables, images, graphs, and other rich media, all conveniently displayed in one notebook. Think of an old-fashioned notebook, such as Galileo’s below, where you would commonly find images and equations interspersed with text. FYI, Galileo discovered the four largest moons of Jupiter.
Using a Jupyter notebook, you can clearly annotate your code in easily readable Markdown. This helps to make your work understandable and usable not only by others, but by yourself when you come back to look at it in three months. See my Jupyter notebook example below.
Jupyter notebooks can be configured to let you conveniently run your code from a web browser. Also, keep in mind that we are not running pure Python code on its own: to execute code against CAS, we must first import the SAS Scripting Wrapper for Analytics Transfer (SWAT) package.
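As a rough illustration of what importing SWAT and calling a CAS action can look like, here is a minimal sketch. It assumes that the swat package is installed and that a CAS server is reachable from your machine. The host name, port, and credentials are placeholders, and check_cas_server is a hypothetical helper name used only for this example, not part of SWAT.

```python
def check_cas_server(host, port, username, password):
    """Connect to CAS, run the serverStatus action, and return its result.

    All four arguments are placeholders for your own site's values.
    """
    # Imported inside the function so that it can be defined even where
    # SWAT is not installed; in a notebook the import usually goes in
    # the first cell.
    import swat

    # swat.CAS opens a connection to the CAS server.
    conn = swat.CAS(host, port, username, password)
    try:
        # CAS actions are exposed as methods on the connection object.
        return conn.serverstatus()
    finally:
        # Close the connection when finished.
        conn.close()


# Example call (placeholder values, not a real server):
# status = check_cas_server("cas.example.com", 5570, "myuser", "mypassword")
# print(status)
```

Wrapping the call in try/finally is a design choice for the sketch: the connection is closed even if the action raises an error.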
More about Jupyter
Jupyter is an open-source project for interactive computing in multiple programming languages, including Julia, Python, and R. It evolved from the IPython project, a command shell for interactive and exploratory computing. IPython was created in 2001 by Dr. Fernando Pérez (University of California, Berkeley). Dr. Brian Granger of Tech-X Corporation joined the IPython project in 2004. Jupyter runs on Linux and other Unix-type operating systems, Apple OS X, and Microsoft Windows. It can be run on a local desktop or installed on a remote server and accessed over the internet.
Jupyter stores a session’s inputs and outputs into a pair of numbered tables called In and Out, as shown in a very simple example below.
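To make the idea of numbered In and Out tables concrete, here is a toy model in plain Python. It is only a sketch of the bookkeeping, not Jupyter's actual implementation, and the class name MiniSession is made up for this example. It assumes that each input is either a single expression (which produces a value) or a statement (which does not).

```python
class MiniSession:
    """Toy model of a numbered input/output history (not Jupyter's real code)."""

    def __init__(self):
        self.In = [""]        # index 0 is a placeholder so numbering starts at 1
        self.Out = {}         # maps input number -> result, only when there is one
        self._namespace = {}  # variables shared across inputs

    def run(self, source):
        """Record `source`, execute it, and store any resulting value."""
        self.In.append(source)
        number = len(self.In) - 1
        try:
            # Expressions such as "x * 2" produce a value.
            result = eval(source, self._namespace)
        except SyntaxError:
            # Statements such as "x = 21" produce no value.
            exec(source, self._namespace)
            result = None
        if result is not None:
            self.Out[number] = result
        return result


session = MiniSession()
session.run("x = 21")   # recorded as In[1]; no Out entry
session.run("x * 2")    # recorded as In[2]; Out[2] holds the value
print(session.In[2], "->", session.Out[2])
```

In a real notebook you get this for free: after running a few cells, typing In[2] or Out[2] into a new cell retrieves the earlier input text or result.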
An open Jupyter notebook has exactly one interactive session connected to a kernel, the computational engine that executes the code contained in the notebook and sends the results back. For example, the IPython kernel executes Python code, IRkernel executes R code, and IJulia executes Julia code.
The kernel remains active even if the web browser window is closed. If you reopen the same notebook, the web application reconnects to the same kernel. CAUTION: Jupyter Notebook is designed for a single user, even though other clients can connect to the same underlying kernel. If you have multiple users and want authentication, use JupyterHub to manage multiple instances of the single-user notebook.
You can save a session’s inputs and outputs to a log file. You can create aliases for common system tasks, navigate the file system with some of the common Linux commands such as cd and ls, and prefix any command with ! for direct execution by the underlying operating system.
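For comparison, the ! prefix behaves roughly like handing the command line to the operating system's shell. Outside a notebook, a plain-Python approximation might look like the sketch below. It uses the standard subprocess module to illustrate the idea and is not how Jupyter implements the feature; run_shell is a hypothetical helper name.

```python
import subprocess


def run_shell(command):
    """Run `command` through the system shell and return its output lines."""
    completed = subprocess.run(
        command,
        shell=True,            # let the OS shell interpret the command string
        capture_output=True,   # collect stdout/stderr instead of printing them
        text=True,             # decode bytes to str
        check=True,            # raise CalledProcessError on a non-zero exit
    )
    return completed.stdout.splitlines()


# Roughly what typing `!echo hello from the shell` does in a notebook cell.
print(run_shell("echo hello from the shell"))
```

Returning a list of lines mirrors what happens in IPython when you assign the result of a ! command to a variable.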
A Few Jupyter Tips
Jupyter also offers a set of control commands called magic commands that improve Python’s usability in an interactive context. Three examples of magic commands are:
You can include images in your Jupyter notebook, as I did with the photo of the planet Jupiter screen-captured earlier in this blog. To do that, your image must be in the same folder as your Jupyter notebook file, and then you can simply type into a Markdown cell:
You can view keyboard shortcuts in Jupyter by going to Help/Keyboard Shortcuts, as shown below:
Jupyter is an intuitive, interactive and exploratory computing tool that allows you to show step-by-step what your programming code is doing. If you are just entering the realm of using Python, Java, or Lua, I recommend that you familiarize yourself with the latest features of your favorite text editor and of Jupyter. A few resources to get you started are listed below.
Then you are on your way to running CAS actions via Jupyter using Python code! As demonstrated in Ryan Gillespie’s four-minute video, you can take advantage of highly parallelized processes in the CAS analytic engine to run advanced machine learning algorithms by running Python code from Jupyter. And you can use all of the features of Jupyter to easily annotate your code so that you can explain it to, and share it with, colleagues, managers, and customers. Not to mention, when you come back to look at your project in four months and have forgotten most of what you did (in my case, in just four hours), your annotations will explain your code to yourself!
FOR MORE INFO