BookmarkSubscribeRSS Feed

SAS Viya: Using a SAS Quality Knowledge (QKB) Reference Table in a Custom Step

Started ‎07-24-2023 by
Modified ‎07-26-2023 by
Views 496

In my previous post, I reviewed how to use the Make control read-only and hide the control at runtime options for referencing table information to populate a drop-down list selector in a Custom Step. In this post, I will cover how to use those new options with the new SAS Quality Knowledge Base (QKB) reference tables in a Custom Step, so you can include the QKB locale and definition selections in your Custom Step.

 

  As of the 2023.06 release, there is a new SAS library called SASDQREF.  It contains data quality reference tables; the majority of which contain the metadata from the SAS Quality Knowledge Base (QKB).  The tables are located on the SAS Server at: /opt/sas/viya/home/commonfiles/home/share/DMData.

 

mk_1_DQREF.png

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

 

  • CONTACTS – listing of contact records that can be used to demonstrate data quality functions.  This is the same data as the CONTACTS table delivered with Data Management Studio.
  • DM_CADEF - listing of available casing definitions in the QKB
  • DM_DEF - listing of all definitions in the QKB
  • DM_EXDEF - listing of extraction definitions in the QKB
  • DM_GNDEF - listing of gender analysis definitions in the QKB
  • DM_IDDEF - listing of identification analysis definitions in the QKB
  • DM_LGDEF - listing of locale guessing definitions in the QKB
  • DM_LOC - listing of all locales in the QKB
  • DM_MADEF - listing of all match code definitions in the QKB
  • DM_MKDEF - listing of all data masking definitions in the QKB.  This is a subset of the standardization definitions in the QKB and are used to anonymize and mask data.
  • DM_PADEF - listing of all parsing definitions in the QKB
  • DM_PTDEF - listing of all pattern definitions in the QKB
  • DM_STDEF - listing of all standardization definitions in the QKB

 

To demonstrate how these tables can be used I am going to review how I can now update my Anonymize and Mask Data Custom Step from a prior post.  (Note:  The user interface for creating custom steps has had improvements since I wrote that post.) 

 

In that post when I built that version of the Anonymize and Mask Data Custom Step, I manually entered the data masking standardization definitions in the drop-down list control and hard-coded QKB locale to ENUSA (English – United States) in the Program (Template) code.  In this post, I will show how I can use the DM_MKDEF table that lists all the data masking definitions in the QKB to populate the drop-down lists for the data masking definitions as well as one for the QKB locale. 

 

Below is an excerpt from the table.  As part of my custom step, I will be referencing the columns highlighted in yellow.

 

mk_2_DQREF.png

 

  To start my revised Anonymize and Mask Data Custom Step, I add the controls for an input and output table.

 

mk_3_DQREF.png

 

Next, I add another input table control.  This one I use to hard-code its table to the DM_MKDEF table in the SASDQREF library.  I set its properties to read-only and hide control at runtime so the end user of the custom step will not see this control.

 

mk_4_DQREF.png

 

I add a column selector control and hard-code it to the Locale description column (DM_DESC) on the DM_MKDEF table. I set its properties to read-only and hide control at runtime so the end user of the custom step will not see this control.

 

mk_5_DQREF.png

 

I add a column selector control and hard-code it to the data masking definition column (DM_NAME) on the DM_MKDEF table. I set its properties to read-only and hide control at runtime so the end user of the custom step will not see this control.

 

mk_6_DQREF.png

 

Next, I add a column selector control to allow the user to select the column to mask from the input table they select.

 

mk_7_DQREF.png

 

I add a drop-down control for the QKB locale selection and make it a dynamic list based on the Locale column selector I added previously.

 

mk_8_DQREF.png

 

I add a drop-down control for the data masking definition selection and make it a dynamic list based on the Masking definition column selector I added previously.

 

mk_9_DQREF.png

 

Next, I create a prompt hierarchy between the two drop-down fields.  The available masking definitions to select are filtered by the QKB locale selection.  For more information on prompt hierarchies, refer to this post.

 

mk_10_DQREF.png

 

I had a new column control for the new masked column and give it a default name of maskedColumn; however, the end user can change that name if desired.

 

mk_11_DQREF.png

 

For the last part of my custom step design, I drop the column to be masked from the output port and add the new masked column.

 

mk_12_DQREF.png

 

To complete my custom step, I add the code to be executed on the Program tab.  As part of the code, I return the five-character QKB Locale that is associated the description of the locale that the end user selected.  Then, that locale is loaded and the dqStandardize function is used to apply the selected masking definition for that locale to return the masked column.

 


/* Create SASDQREF library in case it doesn't exist in the context where this custom step is being executed */
%macro checkSASDQREF ;

   %if %sysfunc(libref(SASDQREF)) %then %do ;
      %if %sysfunc(libname(SASDQREF,!sasroot/../commonfiles/home/share/DMData)) %then %put %sysfunc(sysmsg()) ;                                                                                                               
      %else %put NOTE: SASDQREF library assigned to %sysfunc(pathname(SASDQREF)). ;   
   %end ;
   %else %put NOTE: SASDQREF library exists. ;

%mend checkSASDQREF ;
%checkSASDQREF ;


/* Resolve selectedLocale to its 5-character QKBLocale */
proc sql ;
	select distinct DM_LOCALE into :QKBLocale
		from SASDQREF.DM_MKDEF
		where DM_DESC="&selectedLocale";
run ;
%put &=QKBLocale.;



/* Load Selected QKB Locale */
%DQLOAD(DQLOCALE=(&QKBlocale));
 

/* Apply Data Masking Standardization Definition */
data &outTable (drop=&incolumn_1_name); /* also specify the drop column in the output table metadata */
	set &inTable;
    /* also specify the new column name in the output table metadata */
	&maskedColumn_name=dqStandardize(&inColumn_1_name,"&selectedDefinition", "&QKBlocale");
run;

 

Now I can test my revised Anonymize and Mask Data custom step.  I select an input table and select to mask the CONTACT column.  I also select to use the English (United States) QKB locale and its Mask All Characters Except First and Last masking definition.  The maskedColumn  I rename to CONTACTmasked on the output table.

 

mk_13_DQREF.png

 

Note:  The controls that I hard-coded are hidden from the end user since I selected to hide those controls at runtime.

   

Summary

 

As of the 2023.06 release, there is a new SAS library called SASDQREF.  It contains data quality reference tables; the majority of which contain the metadata from the SAS Quality Knowledge Base (QKB).  These can help you build custom steps where you want to use definitions from the QKB.

 

The revised Anonymize and Mask Data custom step that I built as part of this post can be downloaded from here.     

 

Find more articles from SAS Global Enablement and Learning here.

Version history
Last update:
‎07-26-2023 12:21 PM
Updated by:
Contributors

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags