SAS Viya was released in the Microsoft Azure Marketplace last month, during the SAS Explore virtual user conference. With the click of a button, on a pay-as-you-go basis, you can start working on SAS Viya in minutes. Getting access to SAS Viya this quickly is a huge step forward.
Now that you have SAS Viya up and running on Microsoft Azure, where do you start? How do you get data in there?
In this post, we will review how to access data from Microsoft Azure Data Lake Storage Gen2 (ADLS Gen2, or simply ADLS2).
ADLS Gen2 is a low-cost object storage solution for the cloud, used for building enterprise data lakes on Azure. Microsoft customers use ADLS2 for storing massive amounts of structured/unstructured data.
You are using ADLS Gen2 when you create a Storage Account and check “Enable hierarchical namespace”.
Your data is then generally organized across blob containers and sub-folders, as shown in the Storage Browser view.
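You need the values that the code examples in this post rely on: an Azure tenant ID, the application ID of an application registered in Azure, the storage account name, and the file system (container) name.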
More details on the ADLS2 configuration are available in this post.
Here are the different ways to access ADLS2 from SAS Viya:
Capability | Engine | File Type Support | Access Type |
---|---|---|---|
ADLS FILENAME engine | SAS Compute Server | Any file type supported by SAS | Read/Write |
ORC LIBNAME engine | SAS Compute Server | ORC | Read/Write |
ADLS Data Source (CASLIB) | SAS Cloud Analytic Services (CAS) | CSV, ORC, Parquet | Read/Write |
The first time you try to access an ADLS2 file, you are asked to authenticate. The SAS log shows a message like:
ERROR: To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code EVXXXXXW4 to authenticate.
Open the URL in a separate browser tab, paste the code shown in the SAS log, and log in with your Azure account by following the steps. Once you are logged in, re-run the SAS code that threw the authentication error. It should now run successfully.
Importing a CSV file from ADLS2 into a SAS Data Set:
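/* Azure tenant ID and application ID: replace the placeholders with your own values */
%let TENANTID=xxxxx ;
%let APPID=xxxxx ;
options azuretenantid="&TENANTID" ;
/* Fileref to a CSV file in the "mydata" file system of the "svomasa" storage account */
filename contacts adls "data/contact_list.csv"
   accountname="svomasa"
   applicationid="&APPID"
   filesystem="mydata" ;
/* Import the CSV file into the WORK.CONTACTS data set */
proc import file=contacts out=contacts
   dbms=csv replace ;
run ;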
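Since the table above lists the ADLS FILENAME engine as Read/Write, the reverse direction can be sketched the same way. This is a minimal sketch, not a tested recipe: it assumes TENANTID and APPID are already set as in the import example, the target path data/class_export.csv and the fileref name outcsv are hypothetical, and SASHELP.CLASS is used only as convenient sample data.

```sas
/* Assumes TENANTID and APPID were set as in the import example. */
options azuretenantid="&TENANTID" ;

/* Hypothetical target file: data/class_export.csv is an illustrative name. */
filename outcsv adls "data/class_export.csv"
   accountname="svomasa"
   applicationid="&APPID"
   filesystem="mydata" ;

/* Write SASHELP.CLASS as comma-separated values through the fileref. */
/* DSD on the FILE statement makes list output comma-delimited.       */
data _null_ ;
   set sashelp.class ;
   file outcsv dsd ;
   put name sex age height weight ;
run ;
```

If the authentication prompt appears in the log on the first write, authenticate as described earlier and re-run the step.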
Read and create an ORC file from/to ADLS2:
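options azuretenantid="&TENANTID" ;
/* ORC library on the /data/orc folder of the "mydata" file system */
libname orcadls orc "/data/orc"
   storage_account_name="svomasa"
   storage_application_id="&APPID"
   storage_file_system="mydata"
   ;
/* Read MEGACORP, keep facilities aged 10 years or more, and write the result as a new ORC file */
data orcadls.extract_10years(drop=facilityage) ;
   set orcadls.megacorp (keep=facilitystate revenue facilityage where=(facilityage>=10)) ;
run ;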
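To check that the new ORC file was written and can be read back, a quick look at a few rows is usually enough. The sketch below assumes the ORCADLS library from the example above is still assigned in the same session. The row limit of 5 is arbitrary.

```sas
/* Assumes the ORCADLS libref from the example above is still assigned. */
/* Print the first 5 rows of the ORC file that was just created.        */
proc print data=orcadls.extract_10years(obs=5) ;
run ;
```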
Load a Parquet file from ADLS2 to CAS:
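/* Start a CAS session with the Azure tenant ID set as a session option */
cas mysession sessopts=(azuretenantid="&TENANTID") ;
/* ADLS caslib on a folder of the "mydata" file system */
caslib adls datasource=
   (
      srctype="adls",
      accountname="svomasa",
      filesystem="mydata",
      applicationid="&APPID",
      resource="https://storage.azure.com/",
      dnssuffix="dfs.core.windows.net"
   ) path="data/parquet/userdata.parquet/" subdirs libref=adls ;
/* Load the Parquet file into CAS, then list the in-memory tables */
proc casutil ;
   load casdata="userdata1.parquet" casout="userdata1" ;
   list tables ;
quit ;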
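The table above lists the ADLS caslib as Read/Write, so a loaded CAS table can in principle be saved back to ADLS2. The following is a minimal sketch under stated assumptions: the CAS session and the adls caslib from the example above are still active, the output name userdata1_copy.csv is hypothetical, and the file lands in the folder the caslib path points to. The output format is assumed to follow the file extension given in CASOUT=.

```sas
/* Assumes the CAS session and the "adls" caslib defined above are active. */
/* userdata1_copy.csv is an illustrative output name, not from the post.   */
proc casutil ;
   save casdata="userdata1" incaslib="adls"
        casout="userdata1_copy.csv" outcaslib="adls" replace ;
   list files incaslib="adls" ;   /* confirm the new file shows up */
quit ;
```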
The Parquet LIBNAME engine will soon be available and will enable reading and writing Parquet files from/to ADLS2 from a SAS Compute Server session.
If you want to know more about SAS Viya on Microsoft Azure, check the documentation and the community forum.
Find more articles from SAS Global Enablement and Learning here.
Within our Azure subscription, the Azure Active Directory screen is hidden and I can't access it. What other ways are there to get the Azure AD details? Does this guide assume that OAuth tokens have already been configured?