You may have heard of SAS Information Catalog, the cool data discovery and profiling tool in SAS Viya that provides agent-based metadata crawling with an easy search interface. It is accessible right from the ‘hamburger menu’ under ‘Discover Information Assets’.
In this juletip I will focus on the search functionality, which is in many ways obvious but contains some hidden gems. All the examples here were tested with SAS Viya version Stable 2022.09. With the monthly release cycle of SAS Viya, there are probably even more features in your version if you are reading this later on! I promise to keep this one short, so you won’t be late for the holidays 😊
If you have not logged in to SAS Information Catalog before, I’m sure you’ll appreciate the asset dashboard that provides an overview of ALL the assets that have been discovered in your SAS Viya estate:
I, at least, love being able to see at a glance what is available in my SAS Viya environment. The indicators allow drill-through with a mouse click as well!
For a data nerd like myself, a summary across all catalogued data is so cool to have! And from here you can drill through further to see all the tables of a specific type category! Assuming that your SAS administrator has already created some data agents to crawl your data libraries, you will have an easy way to search the gathered metadata with the free-text search tool that SAS Information Catalog provides.
Temptingly it’s asking you… ‘What assets are you looking for?’. Go ahead… type in a search term! Note that since it’s an information catalog, you can search for any type of asset, not just your tables and datasets. For example, SAS Studio flows, analytical models, decision models, rule sets etc. are all included as searchable assets.
Free-text search is very simple: just type in any keyword. While it’s easy to browse the results and see whether any of the assets is what I was searching for, it helps to edit the columns in ‘Manage Columns’ to show the information that you’re most comfortable working with. It’s a small icon that is easy to miss, but it looks like this:
I’m an old-school, file-explorer-minded personality, so I always prefer to see at least the name, library, extension, asset type, size and rows for my files:
Just to show you an example of how well this worked out in my case, I got exactly the kind of output I was hoping for! (only the first few lines are shown):
Sorted by size descending, this gives me a good overview of the largest datasets that match my chosen search term. Apart from data sizes, it’s interesting to see that I have both In-Memory tables and good old sas7bdat files in my catalog.
While this is good, sometimes you might want to do a more specific search. In SAS Information Catalog this is called faceted search. Often you want to filter your search based on some attribute like file size or library, and faceted search is built just for that purpose. Just clicking on the search box brings up the basic facets:
There is, however, a long list of available search facets, and clicking on help will bring you to the SAS Information Catalog User Documentation. Here are simple examples of some facets that I have found useful:
Data size is always of interest. For example, to find tables over 20 GB, use (in bytes, sorry about the zeroes):
DatasetSize:>20000000000
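Counting zeroes by hand is error-prone, so here is a minimal Python sketch that builds the same search string from a size in gigabytes. It assumes decimal gigabytes (1 GB = 1,000,000,000 bytes), as in the example above. The helper name `dataset_size_over` is made up for illustration and is not part of SAS Information Catalog; the result is simply text to paste into the search box.

```python
BYTES_PER_GB = 10**9  # assumption: decimal gigabytes, matching the example above


def dataset_size_over(gigabytes: float) -> str:
    """Return a DatasetSize facet string for tables larger than the given size."""
    size_in_bytes = int(gigabytes * BYTES_PER_GB)
    return f"DatasetSize:>{size_in_bytes}"


print(dataset_size_over(20))  # DatasetSize:>20000000000
```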
If you need to know which tables contain fewer than 100 rows, use:
RowCount: <100
If you need to find a range between numbers, this is the way:
RowCount: [100<200]
If you already know what specific library contains your assets, you can search for a specific library like ‘sashelp’:
Library.name: sashelp
Data privacy regulation makes it important to know which tables contain personal data:
Column.informationPrivacy: private
The ‘private’ tag helps you find all the tables that have been tagged as sensitive:
You might want to know where all your sas7bdat datasets are:
FileExtension: sas7bdat
If you have been away for the last week and want to catch up on the latest stuff, use:
DateModified: "Previous week"
To find all of my own creations I would use:
CreatedBy: "Jarno Lindqvist"
And if you only want to see a certain resource type, for example all SAS Studio flows, use:
AssetType: dataflow
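If you keep a handful of favourite searches around, a tiny helper can make them easier to reuse. The Python sketch below only formats strings like the ones shown above for pasting into the search box; it does not call any SAS API. The function name `facet` and the rule of quoting values that contain spaces are assumptions based on the pattern in the examples, so check the documentation before relying on them.

```python
def facet(name: str, value: str) -> str:
    """Format a single facet search string such as 'Library.name: sashelp'."""
    # Assumption: values containing whitespace are wrapped in double quotes,
    # mirroring the "Previous week" and "Jarno Lindqvist" examples.
    if any(ch.isspace() for ch in value):
        value = f'"{value}"'
    return f"{name}: {value}"


# Rebuild a few of the searches from this post.
for name, value in [
    ("Library.name", "sashelp"),
    ("FileExtension", "sas7bdat"),
    ("DateModified", "Previous week"),
    ("CreatedBy", "Jarno Lindqvist"),
]:
    print(facet(name, value))
```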
Again, check the SAS Information Catalog User Documentation for all the facets and more examples on what keywords can be used. When you drill down to a specific table from your search results, you sometimes get the following message:
This means that the table has been updated after the latest analysis by a SAS Information Catalog agent. No need to sweat though: under the ‘Actions’ button you can run an on-demand analysis for that particular data:
Also note that there is a history for the analyses. By collecting history from recurring runs, SAS Information Catalog can build a trend on the data quality indicators. Under the ‘Actions’ button you of course also have the option of continuing to work with that data in other SAS Viya applications, for example entering the Lineage view, which provides an overview of relationships with other linked assets.
Not wanting to spoil anything, but after finding your data there are many other cool things that you can do with it using SAS Information Catalog. I’ll give one more tip though. Defining data ownership has value in any data governance endeavor. In SAS Information Catalog you can now easily do this by assigning responsible contacts for your data:
But I promised to keep this Juletip short(ish), so I’m saving further goodies for my next blog post. May your holidays be happy and your data always at your fingertips! 🎅