
What is New in the SAS Information Catalog

Started ‎04-21-2021 by
Modified ‎08-29-2021 by

The latest stable release of SAS Viya (2020.1.4) added the following to the SAS Information Catalog:

  • Information Privacy
  • Time Period
  • Area Covered for Assets
  • Locale selection for Discovery Agents

 

Information Privacy, Time, and Spatial Area for Assets

The overview of an information asset now includes information privacy, time period covered, and spatial area covered. These features require the SAS Information Governance license.

 

bt_1_410-Information-Privacy-Time-and-Spatial-Area-for-Assets-1024x646.png


 

 

 

Information Privacy

 

bt_2_420-Information-Privacy-Sensitive-Private-and-Candidate.png

 

This pop-up window breaks down the columns of the AUSCUST (Australian customers) table into Sensitive, Private, and Candidate semantic classifications. You can see tokens representing Delivery Address, City, Postal Code, Latitude, and Longitude in the Candidate classification. If no private data is detected, none is displayed. The pop-up window also shows the Quality Knowledge Base (QKB) locale that was used to classify the data, which we will discuss below.

 

 

Time Period Covered

 

 

bt_3_430-Time-Period.png

 

 

The Time Period Covered field displays a date range, for example, January 1, 2015 – December 31, 2015. The range is based on the dates that are found in the data. In this example it spans exactly one year because, when I saved the data, I added a WHERE clause on the year 2015.
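To make the idea concrete, here is a minimal Python sketch of how a covered time period could be derived from the dates in a column. It is only an illustration: the function name `time_period_covered` and the sample values are hypothetical, and the article does not describe how the catalog actually computes the range.

```python
from datetime import date

def time_period_covered(dates):
    """Return (earliest, latest) of the non-missing dates, or None if there are none.

    Hypothetical helper for illustration only; not the catalog's implementation.
    """
    present = [d for d in dates if d is not None]
    if not present:
        return None
    return min(present), max(present)

# Made-up sample: rows filtered to the year 2015, with one missing value.
sample = [date(2015, 3, 14), None, date(2015, 1, 1), date(2015, 12, 31)]
period = time_period_covered(sample)
print(period)
```

With a filter on a single year, as in the example above, the earliest and latest dates naturally fall inside that year.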

 

Area Covered

 

bt_4_440-Spatial-Area-for-Assets-1024x324.png

 

In the data asset MELBRE (Melbourne real estate transactions), the Area Covered pop-up window displays the top spatial values found in various fields, such as suburb and region. These are the values with the highest frequency in those fields. The Quality Knowledge Base (QKB) locale that was used, Australia (English), is also displayed.
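As a rough illustration of "top frequency distribution values", the following Python sketch counts how often each value occurs in a column and keeps the most frequent ones. The function name `top_spatial_values` and the suburb values are made up for this example; the catalog's own logic is not documented in this article.

```python
from collections import Counter

def top_spatial_values(values, n=3):
    """Return the n most frequent non-empty values with their counts.

    Hypothetical helper for illustration only.
    """
    counts = Counter(v for v in values if v)
    return counts.most_common(n)

# Made-up suburb column with one empty value.
suburbs = ["Richmond", "Carlton", "Richmond", "", "Fitzroy", "Richmond", "Carlton"]
top = top_spatial_values(suburbs, n=2)
print(top)
```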

 

 

bt_5_450-Spatial-Area-for-Assets-US.png

 

 

In a second example, the city Houston is displayed in the Area Covered field.

 

For the full description of these new attributes, see the Overview Tab in SAS Information Catalog: User’s Guide.

 

Locale for Discovery Agents

If you have licensed SAS Information Governance, discovery agents can analyze the names and content of columns in assets. You can now select the discovery locale whose country and language are appropriate for the assets that are discovered by the agent. The agent uses the discovery locale to perform identification analysis on the names and content of columns in assets. For example, if the assets contain names and addresses from the United States in English, select the United States (English) locale.

 

 

bt_6_460-Locale-for-Discovery-Agents.png

 

If you know that the assets in your library are from China, select China (Chinese). If they are from Belgium, select Belgium (Dutch) or Belgium (French), and so on.

 

bt_7_480-Semantic-Type-Column-Details-1024x644.png

 

When the discovery is finished, the Semantic Type indicates the most likely classification. These classifications are summarized on the Overview page.

 

Behind the semantic type calculation sits one of two methods: identification analysis (field content) or field name analysis.

 

 

bt_8_480-Semantic-Type-Sensitive-Private-data-1024x500.png

 

 

Identification analysis (field content) is the more precise of the two. It analyzes sample data and comes up with a list of candidates for the classification.

 

For example, a phone number can be classified as Phone with a score of 9 and Credit card with a score of 3. Why credit card? Well, for the software, a number is a number and if it starts with 4 and has 16 digits, it might look like a VISA card.
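To see why a plain number can be mistaken for a card number, here is a deliberately naive Python check that encodes only the rule of thumb mentioned above (16 digits, starting with 4). The function `looks_like_visa` is a hypothetical illustration; it is not how the Quality Knowledge Base classifies data.

```python
import re

def looks_like_visa(value):
    """Naive check: exactly 16 digits, the first of which is 4.

    Illustration only; real card detection uses far more than this.
    """
    digits = re.sub(r"\D", "", value)  # drop spaces, dashes, and other non-digits
    return len(digits) == 16 and digits.startswith("4")

print(looks_like_visa("4111 1111 1111 1111"))
print(looks_like_visa("919-555-0100"))
```

Any 16-digit value that starts with 4 passes this check, whatever it really represents, which is exactly how a false positive candidate can appear.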

 

At the end, the classifications are ranked by score: Phone 9, Credit card 3, and so on. The top one is chosen as the Semantic Type, in this case Phone. Keep in mind that the classification carries a degree of confidence rather than certainty. It is not perfect and there might be false positives, but it gives you a good idea without much effort.
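The ranking step can be sketched in a few lines of Python. The scores are the ones from the example above; the function name `rank_candidates` is hypothetical, and the sketch assumes that a higher score simply means a more likely classification.

```python
def rank_candidates(scores):
    """Sort candidate classifications from highest to lowest score.

    Hypothetical helper; assumes a higher score means a more likely match.
    """
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

candidates = {"Credit card": 3, "Phone": 9}
ranked = rank_candidates(candidates)
semantic_type = ranked[0][0] if ranked else None
print(ranked, semantic_type)
```

The lower-ranked candidates are kept in the list, which mirrors the point that the chosen type is the most likely one, not the only possible one.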

 

Field name analysis looks only at the column name. If the name matches certain keywords, such as phone, mobile, or GSM, the column is classified as Phone. Field name analysis does not look inside the column data, so if you have account numbers in a column named Phone, the software will still display Phone.
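A minimal Python sketch of name-based classification might look like the following. The keyword list comes from the example above; the function `classify_by_field_name` and the simple substring matching are assumptions made for illustration, not the product's actual rules.

```python
PHONE_KEYWORDS = ("phone", "mobile", "gsm")

def classify_by_field_name(column_name):
    """Return "Phone" if the column name contains a phone keyword, else None.

    Hypothetical helper; it never inspects the column's values.
    """
    name = column_name.lower()
    if any(keyword in name for keyword in PHONE_KEYWORDS):
        return "Phone"
    return None

print(classify_by_field_name("Customer_Phone"))
print(classify_by_field_name("ACCOUNT_ID"))
```

Because only the name is inspected, a column called Phone that actually holds account numbers would still come back as Phone.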

 

Identification analysis (field content) is more precise, but it is not available for all country / language pairs (called locales). To understand what is available for your country / language, see Understanding Content Analysis in SAS Information Catalog: Administrator’s Guide. For a very detailed list of Definitions by Locale, see SAS Quality Knowledge Base for Contact Information 32.

 

Conclusions

We looked at the new features available in the SAS Information Catalog in the latest stable SAS Viya release (2020.1.4): information privacy, time period covered, and area covered. We then explained the role of the Quality Knowledge Base and the influence of locale selection for discovery agents.

Read more about

SAS Documentation

 

Acknowledgements

Thank you Mary Kathryn Queen, Vincent Rejany and Kumar Thangamuthu.

 

Thank you for your time reading this article. If you liked the post, give it a thumbs up. Please comment and tell us what you think about the new SAS Information Catalog.

 

Find more articles from SAS Global Enablement and Learning here.
