BookmarkSubscribeRSS Feed

SAS Viya: Set-based Data Monitoring Using PROC EEL in SAS Code and Custom Steps

Started ‎06-23-2023 by
Modified ‎07-06-2023 by
Views 317

In a previous blog, I reviewed using PROC EEL in SAS Studio powered by SAS Viya.  In this four-part blog series, I cover examples of the three data monitoring rule types (row, group, and set) and an example with multiple rules by using PROC EEL in both SAS code and Custom Steps in SAS Studio.

 

Set-based Monitoring Rule using PROC EEL

 

For this demonstration on using PROC EEL to mimic the behavior of a set-based business rule in DataFlux Business Rule Manager, I will use the example of checking the missing percentage for a column on an input table.  If that column as a missing percentage greater than 0%, then the rule is triggered.

 

Below is the Contact List with Email input table, I use for my rule check.

 

mk_1_1_SetInput.png

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

 

Here is the PROC EEL code to perform the check if the missing percentage is greater than 0% for the Email column.

 

proc eel data=SASDM.contact_list_with_email out=SASDM.MISSING_CHECK (keep=Rule_Result Missing_Pct);
  submit ;

  pre:
  hidden real total_records
  total_records=0

  hidden real total_missing
  total_missing=0

  //Set Target Percentage
  real i
  i = 0.00


  main:
  total_records=total_records+1
  if email==''
	total_missing=total_missing+1
  return false

  post:
  string(6) Rule_Result
  real Missing_Pct
  Missing_Pct=round(((total_missing/total_records) * 100),2)
  
  //write log messages
  logmessage("Total number of records " & total_records & " and total missing is " & total_missing)
  logmessage("Missing percentage is: " & Missing_Pct)

 //Evaluate Set
 if Missing_Pct > i 
	Rule_Result='FAILED'
 else
    Rule_Result='PASSED'
 pushrow()


endsubmit ;
run ;

 

Make note of the following in the code:

 

  • keep option on the proc eel statement to specify the column names to keep in the output
  • return false statement in the main section to prevent the individual rows from the input from being returned in the output
  • pushrow() statement in the post section to return a summary row to the output

 

Here are the results when I run this code against the input table.

 

mk_2_1_SetResults.png

 

The email column is missing values in 5.45% of the input data set so it failed the check.

 

Set-based Monitoring Rule in a Custom Step

 

Instead of hard-coding the input table, the column to check for blanks, the missing percentage to trigger the rule, and the output table, I decided to create a custom step where the user can select that information.  For more details on creating custom steps refer to the following:

 

 

Creating the Custom Step

 

On the main page of my custom step, I add controls for the selection of the following:

 

  • Input table
  • Column to check for blanks
  • Threshold for percentage missing
  • Output table.

mk_3_1_SetCS.png

 

Below is the JSON for the Prompt UI design for this custom step.

 

{
	"showPageContentOnly": true,
	"pages": [
		{
			"id": "page1",
			"type": "page",
			"label": "Set Rule Check for Percent Missing from Column",
			"children": [
				{
					"id": "inTable",
					"type": "inputtable",
					"label": "Select the input table",
					"required": true,
					"placeholder": "",
					"visible": ""
				},
				{
					"id": "ruleCheckCol",
					"type": "columnselector",
					"label": "Select the column to check for missing values",
					"order": false,
					"columntype": "a",
					"max": 1,
					"min": 1,
					"visible": "",
					"table": "inTable"
				},
				{
					"id": "pctMissing",
					"type": "numstepper",
					"label": "Select the acceptable percent missing for the specified column in the input table",
					"required": true,
					"integer": false,
					"min": 0,
					"max": 100,
					"stepsize": 1
				},
				{
					"id": "outTable",
					"type": "outputtable",
					"label": "Select the output table",
					"required": true,
					"placeholder": "",
					"visible": ""
				}
			]
		},
		{
			"id": "page2",
			"type": "page",
			"label": "About",
			"children": [
				{
					"id": "aboutText",
					"type": "text",
					"text": "This custom step performs a set-based rule check to monitor whether or not a column in the input table exceeds a specified percentage for missing values.\n\n\nVersion 1.0 (12APR2023)",
					"visible": ""
				}
			]
		}
	],
	"values": {
		"inTable": {
			"library": "",
			"table": ""
		},
		"ruleCheckCol": [],
		"pctMissing": 0,
		"outTable": {
			"library": "",
			"table": ""
		}
	}
}

 

Here is the Program code to perform the row rule check using PROC EEL.  Note: It is similar to the code in my above examples but uses the functions VarEval and GetVar from the expression language to retrieve the macro variable values for the selected columns.

 

/* Performa a Set Rule for missing percentage */
proc eel data=&inTable out=&outTable (keep=Rule_Result Missing_PCT);
  submit ;

  pre:
  hidden real total_records
  total_records=0

  hidden real total_missing
  total_missing=0

  //Set Target Percentage
  real i
  i = getvar("pctMissing")


  main:
  total_records=total_records+1
  if VarEval(getvar("ruleCheckCol_1_name"))==''
	total_missing=total_missing+1
  return false


  post:
  string(6) Rule_Result
  real Missing_Pct
  Missing_Pct=round(((total_missing/total_records) * 100),2)
 
  //write log messages
  logmessage("Total number of records " & total_records & " and total missing is " & total_missing)
  logmessage("Missing percentage is: " & Missing_Pct)

 //Evaluate Set
 if Missing_Pct > i 
	Rule_Result='FAILED'
 else
    Rule_Result='PASSED'
 pushrow()

endsubmit ;
run ;

Running the Custom Step

 

When I run the custom step, I use the same input table as my example above and map the Email column as the field to monitor for blanks and set the percentage missing threshold to 0.

 

mk_4_1_SetRunCS.png

 

My results are the same – the rule is violated, and the missing percentage is 5.45 for the Email column.

 

mk_5_2_SetRunCS.png

 

Since I have set this as a custom step where the inputs for the set rule check are generic, I can easily change the field to check for missing values to Phone and acceptable missing percentage to 10.

 

mk_6_3_SetRunCS.png

 

In this case, the set of data also fails the check with 61.82% of the Phone records missing values.

 

mk_7_4_SetRunCS.png

 
Summary

 

In conclusion, you can use PROC EEL combined with Custom Steps to perform set-based rule monitoring of your data where the inputs for the rule check are not hard-coded into the program.  For more information on Expression Engine Language coding, please refer to the following:

 

 

Acknowledgements

Special thanks to:

  • Mickey Schauf for providing the inspiration to create this blog series
  • Andrew Shakinovsky for taking the time to let me talk through my ideas for this blog series enabling me to confirm that the concept was possible!

 

Find more articles from SAS Global Enablement and Learning here.

Version history
Last update:
‎07-06-2023 10:33 AM
Updated by:
Contributors

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags