
Build your first Custom Transformation in SAS Viya for Learners

Started ‎05-10-2022 by
Modified ‎05-09-2022 by

In a recent article in the Free Data Friday series, I demonstrated a simple approach to identifying and handling outliers in “classic” SAS using SAS OnDemand for Academics. There we used SAS Procedures (SGPlot and SQL) to deal with some unrealistically large numbers. In this article I will be using SAS Viya for Learners to show how we can do the same thing with the SAS Cloud Analytic Services Language (CASL) during the Data Preparation phase of a project by creating a Custom Data Transformation.

 

What is SAS Viya for Learners?

 

SAS Viya for Learners is a free, cloud-based software offering which allows educators and students to explore the entire analytics life cycle, from data preparation through analysis and visualization (see the SAS Viya for Learners site for more information and getting started).

 

The method

 

We will be using the Bank Customers data set which can be found by selecting Prepare Data from the SAS Viya for Learners Main Menu as shown below:

 

Figure1.png

 

We need to use a Plan so choose New Plan:

 

Figure2.png

 

This takes us to a screen where we can choose which data set to use. Choose the Bank Customers data set from the left-hand panel and click OK:

 

Figure3.png

 

This returns us to the Prepare Data screen with the Bank Customers file loaded, ready for us to add Transforms that prepare the data for processing.

 

Figure4.png

 

There are a large number of pre-built transformations available from the Transforms menu. However, none of them handles outliers, so we will need to create our own. To do this, double-click the Code option under Custom Transforms. This adds a Custom Code transform to the plan, where we can choose Data Step or CASL from the drop-down menu. Choose CASL, and some boilerplate code will automatically be added to the code window.

 

Figure5.png

 

This text can now be replaced with the code to carry out our outlier detection and handling. This can be seen below.

 

Figure6.png

 

The important points to note are as follows:

 

Line 2    - Outlier handling is part of the Data Preprocess Action Set.

 

Line 3    - The input library and table names are both held in variables (_dp_InputCaslib and _dp_InputTable). This allows you to change the input table name without changing the code.

 

Line 4    - The variable hh_income is the one to be analysed for outliers.

 

Line 5    - There are a number of different outlier detection methods available. For the sake of simplicity I have chosen the Udflimits method. This allows me to specify minimum and maximum values outside of which values will be treated as outliers.

 

Line 6    - Any observations with values found to be outliers will be removed from the file.

 

Line 9    - The minimum and maximum values described at Line 5 are specified.

 

Line 10    - Outlier information can be written to a separate file.

 

Line 11    - As with the input library and table names, the output library and table names are held in variables.
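To make the logic of the points above concrete, here is a minimal sketch in plain Python of what a user-defined-limits trim does: rows whose value lies outside the chosen minimum and maximum are dropped from the output and collected separately. This is not the CASL code from the figure and does not call any SAS API. The function name trim_by_limits, the sample rows, and the limits of 0 and 500,000 are all hypothetical, and treating the limits as inclusive and keeping rows with a missing value are assumptions made for the illustration.

```python
def trim_by_limits(rows, column, lower, upper):
    """Split rows into those inside [lower, upper] and those outside.

    Assumptions for this sketch: limits are inclusive, and rows with a
    missing (None) value are kept rather than treated as outliers.
    """
    kept, outliers = [], []
    for row in rows:
        value = row.get(column)
        if value is None or lower <= value <= upper:
            kept.append(row)      # stays in the output table
        else:
            outliers.append(row)  # reported separately as outlier information
    return kept, outliers


# Hypothetical sample data standing in for the Bank Customers table.
customers = [
    {"id": 1, "hh_income": 52_000.0},
    {"id": 2, "hh_income": -100.0},
    {"id": 3, "hh_income": 9_999_999.0},
    {"id": 4, "hh_income": 87_000.0},
]

# Hypothetical limits; the article's actual values appear only in the figure.
kept, outliers = trim_by_limits(customers, "hh_income", 0.0, 500_000.0)
print("kept ids:", [r["id"] for r in kept])
print("outlier ids:", [r["id"] for r in outliers])
```

In the custom transform itself, the same split is carried out by the outlier action on the CAS server, with the kept rows going to the output table and the outlier details to the separate information table.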

 

Finally, when we run the plan, the transform will be executed and observations outside the specified minimum and maximum values will be removed from the output file.

 

This is only one example of what you can do with Custom Transforms in both Data Step and CASL code. The next time you need to use a process which isn't covered by one of the pre-built transforms, why not try creating one of your own?

