BookmarkSubscribeRSS Feed

SAS ACCESS: Universal SAS Methods to Access Just About Any Data, Anywhere

Started ‎09-18-2018 by
Modified ‎06-30-2020 by
Views 9,351

I often speak about SAS ACCESS to customers. One particular presentation, called SAS ACCESS: Universal SAS Methods to Access Just About Any Data, Anywhere, covered two universal SAS programming methods for reading and writing to/from many types of database tables: SQL pass-through and the SAS ACCESS LIBNAME method. To emphasize the commonality of this across database systems, I demonstrated these methods for several diverse data source and focused on the consistency of the SAS syntax that makes it easy for SAS users to apply these methods to any required database. General performance considerations and efficiency techniques were discussed.

 

SAS Access.PNG

 

 

Below I am including the Q&A for the presentation. Following the Q&A you will find attached slides and demonstration code that you can download as well.

 

Question: What is the maximum number of lines that the database can return to SAS? 

Answer: That will be determined by your hardware and operating system.

 

Q: I run SAS in a UNIX environment and want to read an .xlsx file from my local PC (example c:\temp). I wrote the libname statement but get an error saying I need to include the SERVER= option?

A: Because SAS resides on UNIX and the Excel workbook resides on Windows, you must use the SAS PC Files Server to connect. You must also use the SERVER= option to specify the name of the Windows machine where the .xlsx file is stored. You can use the IP address or the machine name.:

libname unxls pcfiles path="c:\temp\lwacc\mixed.xlsx"
                      server="L7621.na.sas.com"
                      port=9621;

* The default port is 9621, so the PORT= option is necessary only when you use a different port number.

 

Q: Can you run and return an Oracle Explain plan using execute ( )?

A:  Yes, you can use the SQL Pass-Through Facility and the EXECUTE statement to use native Oracle non-SELECT SQL statements such as the EXPLAIN PLAN statement:

proc sql;
  connect to oracle (path=localhost user=student pw=Metadata0);
    execute (explain plan for
             select last_name from employees)
    by oracle;
  disconnect from oracle;
quit;

 

Q: Can you use SAS Access to connect to data in Amazon S3?

A: Amazon S3 is a service for storing and retrieving data. You can use SAS Access to Amazon Red Shift to query S3 data. Amazon Red Shift provides the application layer, including an SQL language that SAS can communicate with to access the S3 data.

 

Q: How can I tell what SAS Access licenses I have?

A: You can execute a PROC SETINIT with the NOALIAS option to find out what is licensed. To determine which of the available licensed SAS components you have installed you should also run PROC PRODUCT_STATUS. The results will show in the SAS log.

 

Q: Is SQL Pass-through or the LIBNAME method most efficient for moving data from the database to SAS?

A: Both methods use the same connection and communication protocol. So for moving the equivalent amount of data both are equal in performance. It may take the LIBNAME a little longer initially to determine how to generate the native SQL code but this amount of time becomes negligible when compared to the time it takes to move a significant amount of data from the database back to SAS.

 

Q: Is there a way to avoid using database passwords in the code? This is a security risk we need to avoid.

A: There are various ways to do this, depending on your setup. This can involve storing the library connection information in the SAS metadata repository. This connection information is available to the SAS server in a secure way without it being visible to the SAS user. Another way to set this is up can be to design the application to prompt the user to type the userid and/or password in a prompt window. This works for interactive applications only. Other methods can also be employed.

 

Q: When I query Hive tables from Hadoop and create SAS datasets the datasets the file size is really large even for small amounts of data and the character variables are larger than they need to be. Is this preventable?

A: This is a know issue when returning STRING variables from Hive back to SAS. The STRING data type in Hive has no length information and SAS defaults to creating character variables with a length of 32767 bytes in the resulting SAS Dataset. There are several ways to prevent this. Either by converting the Hive variables to a CHAR or a VARCHAR type, or by using SAS functions to reduce the length of the CHARACTER variable. I can add links to the Q and A log that I post for this talk sometime early next week.

 

Q: What data types does SAS support with SAS/ACCESS?

A: SAS generally supports all the basic data types, and not the more complex ones. This is documented for each database in the SAS Access documentation and varies somewhat by database. But everything is converted by the ACCESS engine to the only two data types SAS supports (character and numeric). Database dates are converted to the numeric SAS date value and automatically assigned a default date format. Complex data types are not supported for the most part, but in some case may simply be converted to character values and treated as a literal character string.

 

Q: If you copy an indexed SAS dataset to a database, is there a way to re-create the index in the database table?

A: It does not happen automatically as part of the copy process has been my general experience and as far as I know this is generally true. What you can do though, is add some SQL pass-through to submit database code to add an index to the tables after you create them.

 

Q: How do I name columns from an excel spreadsheet in my SAS code. I've had trouble working with columns that have spaces in their name.

A: For the LIBNAME access method this depends depends on the system option called validvarname. If using value of V7, the access engine will put underscore characters in place of invalid characters such as spaces, so 'My Var' becomes My_Var. If validvarname is set to ANY then the oroginal name with spaces and other invalid characters are retained and instead you need to quote the varialbe name, as it is, and add the letter n after the quoted name (so 'My Var'n).

 

Q: What is the quickest best way to do a lookup for data from a small SAS dataset. I need to find the matching rows in a large database table and return some values for those rows in the database.

A: There are several ways to approach this. One is to simply copy the small dataset into the database if you have write access to the database, and perform an inner join in the database. Then you can return the result back to SAS. There is also an option you can add on the SAS ACCESS LIBNAME statement (multi_data_src=inclause). If you then do an inner join of the sas dataset with a database table that option causes SAS to first find the unique values for the key value in the SAS dataset and then generate a query to ONLY bring back the matching rows from the database (see documentation here). Even better, if the database is indexed for the key value for the lookup, then you can use a LIBNAME or dataset option in the join (dbindex=yes) which instructs the Access engine to use the index to retrieve each needed row from the database. Click here for documentation on dbindex.

 

Q: Which SAS procedures are converted into native database SQL when you use the SAS ACCESS LIBNAME method?

A: The table below lists the BASE SAS and SAS/STAT procedures that convert part of the processing (sorting/computing summary statistics) in database before returning results to SAS. The SAS/STAT  procedures only execute in-database with Teradata.

 

Base

SAS/STAT

FREQ

CORR

MEANS

CANCORR

RANK

DMDB

REPORT

DMINE

SORT

DMREG

SUMMARY

FACTOR

TABULATE

PRINCOMP

 

REG

 

SCORE

 

TIMESERIES

 

VARCLUS

Comments
Tom

Please post attached SAS program files using .txt as the extension so the forum will allow you to preview the files without downloading them.  (Or get the forum fixed to know that .sas files are just text files and allow them to be previewed without forcing the poster to change the file name).

Made that change, Tom. It's a .txt now. Thanks for the suggestion.

Contributors
Version history
Last update:
‎06-30-2020 01:39 PM
Updated by:

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Article Tags