BookmarkSubscribeRSS Feed

Why SQL Dictionary Tables are a Game Changer for Data Scientists: Q&A, Slides, and On-Demand Recording

Started ‎11-05-2024 by
Modified ‎11-05-2024 by
Views 1,831

Watch this Ask the Expert session to discover why dictionary tables are a true game-changer for data scientists and learn the number 1 rule for data specialists: know the data. 

 

Watch the Webinar

 

You will learn how to:

  • Use PROC SQL’s dictionary tables to explore and resolve metadata issues easily.
  • Reorder variables using metadata without manual entry.
  • Analyze datasets to find extreme values, combinations and specific columns efficiently.

 

The questions from the Q&A segment held at the end of the webinar are listed below and the slides from the webinar are attached.

 

Q&A

Answered

Why did you mention that it is better to use dictionary.column vs sashelp.vcolumn when performing the query? Efficiency is better?

Yes, certainly when you use dictionary tables, SAS is going to be much more efficient with proc SQL. It is going to use the WHERE clause optimization. It's going to quickly go and grab that filter that you specified and get you what you need versus using the SASHELP library, where SAS launches a discovery process. You saw it in terms of timing. And if you haven't already, I would suggest going back and reviewing slides 27 & 28 (screen captures below). This slides below show you the time taken with the dictionary table in proc SQL versus the dictionary table in the SAS help library. Certainly, efficiency time wise, the SAS help library is going to take longer because of the discovery process.

 

sqlGoddess_0-1730825753612.png

 

 

sqlGoddess_1-1730825753618.png

 

 

 

What does "dbp" stand for in the var dbp--id line?

That's one of the measures for diabetes. If you're familiar with the world of diabetes, DBP stands for diastolic blood pressure. Diastolic blood pressure (DBP) is the force of the blood against the walls of the arteries when the heart is relaxing (diastole). Normal resting DBP is less than 85 mm Hg. Diastolic blood pressure (DBP) is connected to diabetes in several ways, e.g. A 10 mm Hg higher DBP is associated with a 52% higher risk of developing diabetes.

 

Does dictionary.columns also apply to SAS Studio?

I use SAS Enterprise Guide, which is my SAS interface to do this demonstration. You could certainly use SAS Studio as your interface. The exact same dictionary tables code is still applicable. You're still going to be able to go to the SAS help library, you're still going to be able to pop open the dictionary tables, and all of my code will still work.

 

Instead of alphabetic order, can we order variables in a more customized order? i.e., Only need put ID as the first variable. the rest of variable keep the same order.

Yes, it's all possible. The minute you ask for more tailoring, you have to look at more detailed approaches. It may just need a little bit more creativity for you to be able to pick up ID, store it in a macro variable, and then append the others.

 

All libraries must be predefined to SAS, right?

It depends on your organization. I think what you're alluding to is some shops have predefined libraries by the administrators. So long as you have access to the folder in which the data is stored, you can create your own libraries. See blow my screen capture from my Enterprise Guide session so you can see my library assignment.

 

sqlGoddess_2-1730825753627.png

 

 

So, I have access to this folder. I created a macro variable called PATH with the percent LET statement that says PATH should point to this location. In the LIBNAME statement, I've created a library called DIABETES pointing to this location. Because I have access to that location, I'm able to build a library called DIABETES. It all boils down to, does everyone in your organization have access to specific folders? Do you specifically have access to where the data is stored or does the administrator pre assign these libraries? So that's a good question to check back with the admin.

 

Why did the PIMA population have diabetes?

The Pimas of Arizona adapted to their desert homeland by directing water through an elaborate system of irrigation canals to support subsistence agriculture; they grew corn, beans, squash and cotton. Around 1900 the increased population of white settlers to the north led to the eventual diversion of the water that supported the Pimas' way of life (the damming of the gila river). The forced curtailment of farming led to significant impacts on their food intake, physical activity and economy.  Their life changed from farming sustained though physical labor to one of food scarcity and little labor. Famine, rarely experienced in the past, became chronic. As a result, this time period marked the end of a lifestyle to which the Pima were well-adapted and transitioned to a lifestyle for which they were poorly adapted. Their low-fat, high-carbohydrate diet changed to one that derived more than 40% of its energy from fat. At the same time the demand for physical labor subsided. The timing of this significant change in lifestyle and livelihood coincides with the development of diabetes among the Arizona Pimas. Link here to the source article

 

Will the ID within length of 4 in Length 8 table got truncated?

If the ID columns are the same type, any attempt to join columns with varying widths will result in truncation of data.

 

Will the macro table contain variables, programs, and lists?

The Macro Dictionary Table contains macros variable (system and user-defined) names, scope and their values. See screenshot below

 

sqlGoddess_3-1730825753633.png

 

 

Is the Diabetes Lib available in SAS Studio?

The DIABETES library is created by me, a SAS user, by supplying the following code

 

%let path = C:\Users\cancxs\OneDrive - SAS\Home\SAS Edu\user groups\2024 UG\ATE Fall 24\data;

libname DIABETES "&path";

 

This is very informative. Is it offered also as a class, with a way to mimic and write the code to see it working?

This has been offered as a hands-on workshop at a SAS conference. If you wish for it to be offered at your location, please reach out to Alice.Yuan@sas.com.

 

Thank you for the wonderful presentation. Could you please explain the difference between what is in SAS Help and Dictionary.dictionarires?

Dictionary tables accessed via the SASHELP library vs PROC SQL, both provide access to system information, but differ in scope and use. In SAS, querying DICTIONARY tables directly within PROC SQL is faster than using the equivalent SASHELP views (like SASHELP.VCOLUMN or SASHELP.VTABLE). DICTIONARY tables are optimized for SQL and retrieve metadata directly, whereas SASHELP views are based on DICTIONARY tables but require additional processing outside of SQL, making them slower in comparison.

 

What is the reason we can use the Title PROC SQL OUTPUT - RECALL ALL DATE COLUMNS?

This might reference the following code and is due to the TITLE statement

 

title 'Proc SQL Output - Recall All Date Columns';

proc sql;

select memname, name, type, length from sashelp.vcolumn

where  upcase(name) like '%DATE%';

quit;

 

Can you use dictionary tables against DB2?

Database dictionary tables cannot be accessed directly via the SASHELP library. The SASHELP library only contains SAS-specific views and sample data that SAS provides with each installation.

To access database dictionary tables (such as those in databases like Oracle, SQL Server, or MySQL), you would typically connect to the database using SAS/ACCESS and then query the database’s dictionary tables directly through PROC SQL, using SQL pass-through or LIBNAME engines configured for the database.

 

Does the dictionary library hold all tables/columns from the active library?

Yes, SAS DICTIONARY tables contain metadata for all tables, columns, and other elements in active libraries within the current SAS session. For instance:

  • DICTIONARY.TABLES holds information on all tables in active libraries, including table names, library names, and creation dates.
  • DICTIONARY.COLUMNS includes details on all columns across tables in active libraries, such as column names, data types, and formats.

These tables are dynamically updated, so they reflect any changes in active libraries immediately during the session. This makes DICTIONARY tables highly useful for obtaining up-to-date metadata about the current SAS environment.

 

 

Recommended Resources

“Shankar, Charu. “Know Thy Data: Techniques For Data Exploration”. Pharmasug 2018

SAS Tutorial | Step-By-Step PROC SQL – SAS YouTube Video

Ask the Expert Webinar - Why Choose Between SAS Data Step & PROC SQL When You Can Have Both

PROC SQL INTO Clause

SAS 9.4 PROC SQL User’s Guide

The Power Of SAS SQL – SAS YouTube Video

Libeg, Linda. ”The SAS® Magical Dictionary Tour”.

Go, Imelda C. “Reordering Variables in a SAS® Data Set”

Droogendyk, Harry. “QCYour SAS ® and RDBMS Data Using Dictionary Tables”

Proc Sql Syntax Order: Go Home On Time With These 5 PROC SQL Tips. Shankar, Charu

Kuligowski, Andrew T. & Shankar, Charu. “Know thy data: Techniques for Data Exploration”

Eberhardt, Peter & Brill, Irene. “How Do I Look it Up If I Cannot Spell It: An Introduction to SAS® ...

Eberhardt, Peter, “A Hands-on Introduction to SAS® Dictionary Tables”

Lavery, Russ, “Cool SQL Tricks”

Please see additional resources in the attached slide deck.

 

Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow up Q&A, slides and recordings from other SAS Ask the Expert webinars.

Version history
Last update:
‎11-05-2024 01:52 PM
Updated by:

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Article Tags