BookmarkSubscribeRSS Feed

Build your first Custom Transformation in SAS Viya for Learners

Started ‎05-10-2022 by
Modified ‎05-09-2022 by
Views 865

In a recent article in the Free Data Friday series, I demonstrated a simple approach to identifying and handling outliers in “classic” SAS using SAS OnDemand for Academics. There we used SAS Procedures (SGPlot and SQL) to deal with some unrealistically large numbers. In this article I will be using SAS Viya for Learners to show how we can do the same thing with the SAS Cloud Analytic Services Language (CASL) during the Data Preparation phase of a project by creating a Custom Data Transformation.

 

What is SAS Viya for Learners

 

SAS Viya for Learners is a free, cloud-based software offering  which allows educators and students to explore the entire analytics life cycle from data preparation through analysis and vizualization - see here for more information and getting started.

 

The method

 

We will be using the Bank Customers data set which can be found by selecting Prepare Data from the SAS Viya for Learners Main Menu as shown below:

 

Figure1.png

 

We need to use a Plan so choose New Plan:

 

Figure2.png

 

This takes us to a screen where we can choose which data set to use - choose the Bank Customers Data Set from the left hand panel and click OK:

 

Figure3.png

 

This returns us to the Prepare Data Screen with the Bank Customers file loaded ready to add Transforms to get our data ready for processing.

 

Figure4.png

 

There are a large number of pre-built transformations available from the Transforms menu, however, none of them facilitate outlier handling so we will need to create our own. To do this we can double-click the Code option under Custom Transforms. This adds a Custom Code transform to the plan where we can choose Data Step or CASL from the drop-down menu. We need to choose CASL from the drop down menu and some boilerplate code will automatic be added to the code window.

 

Figure5.png

 

This text can now be replaced with the code to carry out our outlier detection and handling. This can be seen below.

 

Figure6.png

 

The important points to note are as follows:

 

Line 2    - Outlier handling is part of the Data Preprocess Action Set.

 

Line 3    - The input library and table names are both held in variables (_dp_InputCaslib and _dp_InputTable). This allows you to change the input table name without changing the code.

 

Line 4    - The variable hh_income is the one to be analysed for outliers.

 

Line 5    - There are a number of different outlier detection methods available. For the sake of simplicity I have chosen the Udflimits method. This allows me to specify minimum and maximum values outside of which values will be treated as outliers.

 

Line 6    - Any observations with values found to be outliers will be removed from the file.

 

Line 9    - The minimum and maximum values described at Line 4 are specified.

 

Line 10    - Outlier information can be written to a separate file.

 

Line11   - As with the input library and table name the output library ad table names are held in variables.

 

Finally when we run the plan, the transform will be exexcuted and observations outside the specified minimum and maximum ranges will be removed from the output file.

 

This is only one example of what you can do with Custom Transforms in both data step and CASL code. The next time you need to use a process which isn't covered by one of the pre-built transforms why not try creating one of your own?

Version history
Last update:
‎05-09-2022 10:21 AM
Updated by:

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags