jarno
SAS Employee

In SAS Studio on SAS Viya there are many ready-made steps at your disposal. There are times, however, when the existing steps don’t cover your needs. Starting from Viya 2021 you can create custom steps. If you’re a seasoned DI Studio user, think of this as a new way to create user-written transforms: steps that users build to cover a specific need. They are a great way to bring more power and capabilities to SAS Studio Flows. There is a great example of creating a new custom step here:

SAS Demo | Create a Custom Step in SAS Studio on SAS Viya 

 

I was thinking of a suitable example and figured that there is often a need to describe a table structure. In SAS there is the good old PROC CONTENTS that tells you many things about a table. In the Python pandas library, the df.info() function does something similar for a DataFrame, see here:

Pandas DataFrame: info() function 

 

So I got the idea of creating a custom step that lets the user connect a table and run either PROC CONTENTS or pandas df.info() to get information about the table structure. First I opened the dialog to create a new custom step:

01_custom_step.JPG

Selecting “Custom step quick start” opens the custom step designer. There are three different tabs:

  1. The Designer tab lets you create the UI layout for the new step and define names and labels for the UI components
  2. The Prompt UI tab shows the .json that is created in the background while you add components to the UI
  3. The Program tab is where you put the actual code that your custom step will run

I added two UI components to my designer page. The input table component lets you connect a source table to your custom step. Note that it will not show in the actual step when you use it; it only provides an input port to your custom step. The second component I added is a drop-down list. It is also self-explanatory: you can add as many selections as needed, and they will appear on the custom step’s options page when you use it.

 

I am adding two options to my drop-down list. The first selection is “PROC_Contents”, which will call the code that runs the SAS PROC. Note that I have used an underscore: it is best to use proper variable names, since these values are the actual ones used in the code. My second selection is “Pandas_info”. This will run the Python code that calls the df.info() function:

03_selectors.JPG

I’m happy with these two UI components; they’re all that I need to get my describe custom step done. Now I can take a look at the .json code that was created behind the scenes while I added my components. I omitted the beginning and the end, but you can see the references to the two UI components that we created in the .json code:

03b_json.JPG

The next step is to create the actual program code that matches the variables provided by the UI. In this example we only use two variables:

  • InputTable: this references the source table that is mapped to the input port
  • InputType: this references one of the two possible selections from the drop-down menu

The code below is simple but does all that we need for this example. It’s a macro called “describeData” that checks the selection in two %IF-%THEN blocks and runs the corresponding part of the code.

%MACRO describeData;

	/* Run PROC CONTENTS when the drop-down selection is PROC_Contents */
	%IF &inputType EQ PROC_Contents %THEN %DO;
		proc contents data=&inputTable;
		run;
	%END;

	/* Run the external Python script when the selection is Pandas_info */
	%IF &inputType EQ Pandas_info %THEN %DO;
		PROC PYTHON infile="/srv/nfs/kubedata/compute-landingzone/ssfjal/myscript.py";
		quit;
	%END;

%MEND;
%describeData;

  • The first %IF statement runs when &inputType equals PROC_Contents and uses &inputTable as data.
  • The second %IF statement runs when &inputType equals Pandas_info and uses the same &inputTable variable.
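
To make the branching easier to follow, here is a minimal Python sketch of the same decision logic. It is only an illustration, not part of the custom step: the function name describe_data, the script_path parameter and the example table name are hypothetical. It returns the SAS statements that the matching branch would generate, and it mirrors the fact that macro %IF comparisons are case-sensitive, so a value such as proc_contents matches neither branch and no code is generated.

```python
def describe_data(input_type: str, input_table: str, script_path: str) -> str:
    """Return the SAS statements the matching branch would generate.

    Illustrative stand-in for the describeData macro; not SAS code itself.
    """
    # First branch: exact, case-sensitive match on the drop-down value.
    if input_type == "PROC_Contents":
        return f"proc contents data={input_table};\nrun;"

    # Second branch: hand over to the external Python script.
    if input_type == "Pandas_info":
        return f'proc python infile="{script_path}";\nquit;'

    # Neither branch matched, so nothing is generated.
    return ""


# Hypothetical example values, used only for this illustration.
print(describe_data("PROC_Contents", "SASHELP.CLASS", "/path/to/myscript.py"))
print(repr(describe_data("proc_contents", "SASHELP.CLASS", "/path/to/myscript.py")))
```

The second call shows why the values in the drop-down list have to match the spelling in the program exactly.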

 

This uses the PYTHON procedure, where you can place Python code between submit; and endsubmit; statements. I of course tried to insert my code here, but got an error message saying that the submit block cannot be placed inside a SAS macro. So I resorted to the infile option and placed the contents of the submit block in the myscript.py file, which is accessible from a path in the file system. This worked exactly as I wanted. You can read more about this in the PROC PYTHON documentation:

PYTHON Procedure

 

Here’s the actual Python code submit block:

inputTable1 = SAS.symget('inputTable')
df=SAS.sd2df(inputTable1)
print(df.info(verbose=True))

I noticed that the macro variable is not directly visible to PROC PYTHON, so I needed to define the variable inputTable1 and use the SAS.symget function to transfer its value over to Python. The next line uses the SAS.sd2df function to copy the SAS table into a pandas DataFrame. Then I call the df.info function to provide a description of the table. Note that df.info() prints its report itself and returns None, so wrapping it in print() also prints a trailing None.
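
If you want to try the df.info() part outside SAS, the following minimal sketch may help. It assumes pandas is installed and uses a small made-up DataFrame as a stand-in for the frame that SAS.sd2df would return inside PROC PYTHON; the rows are purely illustrative. It shows that df.info() returns None, and that the report can be captured as a string through the buf parameter if you would rather handle the text yourself.

```python
import io

import pandas as pd

# Made-up stand-in for the DataFrame that SAS.sd2df() returns in PROC PYTHON.
df = pd.DataFrame({"Name": ["A", "B"], "Age": [14.0, 13.0]})

# df.info() writes its report to standard output and returns None.
result = df.info(verbose=True)
print(result)

# Capture the same report as text by passing a buffer.
buffer = io.StringIO()
df.info(verbose=True, buf=buffer)
report = buffer.getvalue()
print(report)
```

Calling df.info(verbose=True) on its own line, without print(), gives the same report without the extra None.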

 

Now that we have finished creating our new custom step, it’s time to save it as “Describe Data” and give it a spin! I insert the custom step on the workflow canvas and use the good old class dataset as the source:

04_flow.JPG

It’s very simple: just connect a source table, select either PROC_Contents or Pandas_info, and run!

 

The result from SAS PROC CONTENTS is exactly what you would expect. Here’s the output from the pandas df.info() function:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 19 entries, 0 to 18
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Name    19 non-null     object 
 1   Sex     19 non-null     object 
 2   Age     19 non-null     float64
 3   Height  19 non-null     float64
 4   Weight  19 non-null     float64
dtypes: float64(3), object(2)
memory usage: 888.0+ bytes
None
>>>

While this simple example may be limited in its applicability, I hope it shows how easy it is to create a custom step in SAS Studio! And the best thing is, if you place your custom step in the public folder, other users in your Viya platform can use and run it and even develop it further! So simple that Santa can do it while preparing for the big day! 🎅
