
SAS Viya on Microsoft Azure Marketplace: Accessing your Data on ADLS Gen2

Started 10-25-2022
Modified 10-31-2022

SAS Viya was released on the Microsoft Azure Marketplace last month, during the SAS Explore virtual user conference. With the click of a button, on a pay-as-you-go basis, you can start working in SAS Viya within minutes. Getting access to SAS Viya this quickly is a huge step forward.

 

Now that you have SAS Viya up and running on Microsoft Azure, where do you start? How do you get data in there?

 

In this post, we will review how to access data stored in Microsoft Azure Data Lake Storage Gen2 (ADLS Gen2, or simply ADLS2).

 

 

What is ADLS Gen2?

 

ADLS Gen2 is a low-cost object storage solution for the cloud, used for building enterprise data lakes on Azure. Microsoft customers use ADLS2 for storing massive amounts of structured/unstructured data.

 

You are using ADLS Gen2 when you create a Storage Account and check “Enable hierarchical namespace”:


nir_post_81_01_gen2.png


 

Your data is then typically organized in blob containers and sub-folders, as shown in the Storage Browser view:

 

nir_post_81_02_storage_browser.png

 

 

What do you need from ADLS2 to set up a connection in SAS Viya?

 

You need:

 

  • An ADLS Gen2 Storage Account where you have the “Contributor” and “Storage Blob Data Contributor” roles
    • Name of the Storage Account
    • Name of the Blob container
    • Path to read data from

 

nir_post_81_03_sa_infos.png

 

 

  • An Azure Active Directory Application configured to access ADLS2
    • Tenant (or directory) ID
    • Application (or client) ID

nir_post_81_04_app_infos.png

 

More details on the ADLS2 configuration are available in this post.

 

 

How do you access ADLS2 from SAS Viya?

 

Here are the different ways to access ADLS2 from SAS Viya:

 

Capability                | Engine                            | File Type Support              | Access Type
ADLS FILENAME engine      | SAS Compute Server                | Any file type supported by SAS | Read/Write
ORC LIBNAME engine        | SAS Compute Server                | ORC                            | Read/Write
ADLS Data Source (CASLIB) | SAS Cloud Analytic Services (CAS) | CSV, ORC, Parquet              | Read/Write

 

 

First-time authentication

 

The first time you try to access an ADLS2 file, you are asked to authenticate. The SAS log shows a message like:

 

ERROR: To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code EVXXXXXW4 to authenticate.

 

Open the URL in a separate browser tab, paste the code shown in the SAS log, and follow the steps to log in with your Azure account. Once you are logged in, re-run the SAS code that raised the authentication error. It should now run successfully.

 

 

Examples

 

Importing a CSV file from ADLS2 into a SAS Data Set:

 

%let TENANTID=xxxxx ;
%let APPID=xxxxx ;

options azuretenantid="&TENANTID" ;

filename contacts adls "data/contact_list.csv"
	accountname="svomasa"
	applicationid="&APPID"
	filesystem="mydata" ;

proc import file=contacts out=contacts
			dbms=csv replace ;
run ;
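
Because the ADLS FILENAME engine is listed as read/write, the same kind of fileref can also be used to write a file. The following is a minimal sketch, not a tested recipe: it assumes the TENANTID and APPID macro variables defined above, reuses the storage account and container from the previous example, and writes to a hypothetical path (data/class_export.csv) chosen only for illustration.

```sas
/* Assumes &TENANTID and &APPID are defined as in the import example */
options azuretenantid="&TENANTID" ;

/* Hypothetical output file in the same container (illustrative name) */
filename outcsv adls "data/class_export.csv"
	accountname="svomasa"
	applicationid="&APPID"
	filesystem="mydata" ;

/* Write SASHELP.CLASS as comma-delimited text through the ADLS fileref */
data _null_ ;
	set sashelp.class ;
	file outcsv dlm="," ;
	/* Header row on the first iteration only */
	if _n_ = 1 then put "name,sex,age,height,weight" ;
	put name sex age height weight ;
run ;
```

If the write succeeds, the new file should appear next to contact_list.csv in the Storage Browser view.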

 

Reading an ORC file from ADLS2 and writing a new ORC file back to it:

 

options azuretenantid="&TENANTID" ;

libname orcadls orc "/data/orc"
	storage_account_name="svomasa"
	storage_application_id="&APPID"
	storage_file_system="mydata"
	;

data orcadls.extract_10years(drop=facilityage) ;
	set orcadls.megacorp (keep=facilitystate revenue facilityage where=(facilityage>=10)) ;
run ;

 

Loading a Parquet file from ADLS2 into CAS:

 

cas mysession sessopts=(azuretenantid="&TENANTID") ;

caslib adls datasource=
   (
      srctype="adls",
      accountname="svomasa",
      filesystem="mydata",
      applicationid="&APPID",
      resource="https://storage.azure.com/", 
      dnssuffix="dfs.core.windows.net"
   ) path="data/parquet/userdata.parquet/" subdirs libref=adls ;

proc casutil ;
	load casdata="userdata1.parquet" casout="userdata1" ;
	list tables ;
quit ;
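
The ADLS caslib is also listed as read/write, so a table loaded in CAS can be saved back to ADLS2. The sketch below assumes the adls caslib and the userdata1 table from the example above are still available in the same CAS session. The output file name (userdata1_copy.csv) is hypothetical, and the sketch relies on the file type being inferred from the file extension.

```sas
proc casutil ;
	/* Save the in-memory table back to the path behind the adls caslib */
	save casdata="userdata1" incaslib="adls"
	     casout="userdata1_copy.csv" outcaslib="adls" replace ;
	/* List the files visible through the caslib to confirm the save */
	list files incaslib="adls" ;
quit ;
```

The new file is written under the caslib path (data/parquet/userdata.parquet/ in this example), so a caslib pointing at a dedicated output folder may be a cleaner choice in practice.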

 

 

Next

 

The Parquet LIBNAME engine will soon be available and will enable reading and writing Parquet files from/to ADLS2 from a SAS Compute Server session.  

 

If you want to know more about SAS Viya on Microsoft Azure, check the documentation and the community forum.

 

Find more articles from SAS Global Enablement and Learning here.

Comments

Within our Azure subscription, the Azure Active Directory screen is hidden and I can't access it. What other ways are there to get the Azure AD details? Does this guide assume that OAuth tokens have already been configured?

