
Manage Azure Access Key with AZUREAUTHCACHELOC= (Part 1)

Started 12-16-2020
Modified 12-16-2020

Starting with the SAS Viya 3.5 release, SAS Viya users can read and write CSV and ORC data files in Azure ADLS2 Blob Storage, and Viya 4 adds Parquet. Access to ADLS2 Blob Storage is provided through the ADLS CASLIB. One of the challenges of working with an ADLS CASLIB is managing the Azure Access Key file (.json file) in the CAS environment. This is tedious in a multi-node CAS environment, and harder still in Viya 4, where the multi-node CAS environment runs as pods/containers on Kubernetes (K8S).

 

This blog post explains how to manage the Azure Access Key file for an ADLS CASLIB. It comes in two parts: the first covers the overall issue and its resolution in Viya 3.5, and the second extends the approach to Viya 4.

 

Data file types that CAS supports in Azure ADLS2 Blob Storage:

  • CSV and ORC data files (with Viya 3.5)
  • Parquet data files (with Viya 4)

 

With the default CAS settings, the very first access to an ADLS CASLIB generates an error message containing a device code and a Microsoft link for authenticating the user. Completing the device code registration authorizes the user and generates an Azure Access Key file (.json file) under the home directory of the ‘cas’ user.

 

Error message from the very first access to an ADLS CASLIB:

 

NOTE: Executing action 'table.fileInfo'.
ERROR: To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code AMUPDJ8N7 to
       authenticate. 
ERROR: Pending end-user authorization.
ERROR: The action stopped due to errors.
NOTE: Action 'table.fileInfo' used (Total process time):

 

The device code displayed in the SAS log comes from the CAS controller only, not from the CAS worker nodes. If you register the device using the information in the SAS log, the Azure Access Key file (.json) is generated on the CAS controller only. With the key file on the controller alone, loading an ORC data file succeeds because CAS loads it serially. Loading a CSV or Parquet data file, however, fails with the error message “Pending end-user authorization” and no device code in the log, because CAS loads those files in parallel and the worker nodes have no key file.

 

Error message from the very first access to an ADLS CASLIB when loading a CSV/Parquet data file:

 

NOTE:       bytes moved             114.62K
NOTE: The Cloud Analytic Services server processed the request in 0.479026 seconds.
85     load casdata="sas_orsales.csv" casout="sas_orsales_CSV" replace ;
NOTE: Executing action 'table.loadTable'.
ERROR: Pending end-user authorization. 
ERROR: The action stopped due to errors.
NOTE: Action 'table.loadTable' used (Total process time):

 

Loading CAS in parallel from ADLS2 Blob Storage requires an Azure Access Key file on every CAS worker node. Generating the file on each worker node means logging in to the nodes one by one and viewing the .json file to register the device code. That is very tedious when CAS has many worker nodes, and even more difficult with Viya 4 running on K8S.

 

Instead of generating the Azure Access Key file on each CAS worker node, you can use the cas.AZUREAUTHCACHELOC= system parameter to share the key file from a central location. The parameter sets the location where the Azure Access Key file is stored and read. Mount a secured shared network drive on every CAS node, with access for the cas user, and assign that same location to the cas.AZUREAUTHCACHELOC= parameter. The very first access to the ADLS CASLIB generates the Azure Access Key file at the central location once the device code is registered. Subsequent accesses share that same key file across all CAS worker nodes.
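Before pointing cas.AZUREAUTHCACHELOC= at a shared location, it can help to confirm that the directory exists and is writable on a given node. The following is a minimal sketch, not part of the SAS tooling: check_cache_dir is a hypothetical helper name, and it assumes you run it as the cas user on each CAS node.

```shell
# Hypothetical helper: verify that a directory intended for
# cas.AZUREAUTHCACHELOC= exists and is writable by the current user.
check_cache_dir() {
    local dir="$1"
    if [ ! -d "$dir" ]; then
        echo "MISSING: $dir"
        return 1
    fi
    # Create and remove a throwaway file to prove write access.
    local probe
    probe="$(mktemp "$dir/.probe.XXXXXX" 2>/dev/null)" || {
        echo "NOT WRITABLE: $dir"
        return 1
    }
    rm -f "$probe"
    echo "OK: $dir"
}

# Example, using the path configured later in this post:
# check_cache_dir /mnt/aksshare/AzureAccessKeys/cas
```

A non-zero return code on any node suggests that parallel loads from that node would not find the shared key file.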

 

The following diagram shows how the cas.AZUREAUTHCACHELOC= parameter stores the Azure Access Key file at a central location shared by the CAS controller and the CAS worker nodes.

 

ut_ADLS_CASLIB_and_AzureAccess_Key_1.png

 

 

Update the AZUREAUTHCACHELOC= system parameter in Viya 3.5

 

Mount a shared network drive / Azure File Share on each CAS node

 

The following example mounts an Azure File Share on each CAS node. The statement to mount the Azure File Share is available on the “connect VM” tab, which provides detailed syntax for each OS environment. Run the statement for your OS, and a new file system becomes available to the CAS servers.

 

An Azure File Share and “connect VM” option.

 

uk_ADLS_CASLIB_and_AzureAccess_Key_2.png

 

Clicking the “connect VM” tab opens a pop-up with the mount statement. You can run the statement on each CAS node manually or with a tool such as Ansible.
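If Ansible is not available, a plain shell loop over the node names is one way to run the same statement everywhere. The sketch below is an illustration only: run_on_nodes and the RUNNER variable are hypothetical names, the host names cas01 and cas02 are placeholders, and it assumes passwordless SSH to each node.

```shell
# Hypothetical helper: run one command on every listed host.
# RUNNER defaults to ssh; set RUNNER=echo for a dry run that only
# prints the host and command instead of executing anything.
run_on_nodes() {
    local cmd="$1"; shift
    local host rc=0
    for host in "$@"; do
        "${RUNNER:-ssh}" "$host" "$cmd" || { echo "FAILED on $host" >&2; rc=1; }
    done
    return "$rc"
}

# Dry run with placeholder host names:
# RUNNER=echo run_on_nodes 'mountpoint /mnt/aksshare' cas01 cas02
```

Starting with the dry run makes it easy to review the exact command before it touches any node.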

 

uk_ADLS_CASLIB_and_AzureAccess_Key_3.png

 

 

An example entry from “/etc/fstab” on a CAS server:

 

//utkumaviya4.file.core.windows.net/aksshare /mnt/aksshare cifs nofail,vers=3.0,credentials=/etc/smbcredentials/utkumaviya4.cred,dir_mode=0777,file_mode=0777,serverino
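To keep the entry identical on every node, the line can be generated from its three variable parts rather than typed by hand. This is a sketch under stated assumptions: azure_fstab_line and add_fstab_entry are hypothetical helpers, the CIFS options are copied from the entry above, and the credentials file is assumed to be named after the storage account as in that entry.

```shell
# Hypothetical helper: print an /etc/fstab line for an Azure File Share,
# using the same CIFS options as the example entry in this post.
azure_fstab_line() {
    local account="$1" share="$2" mountpoint="$3"
    printf '//%s.file.core.windows.net/%s %s cifs nofail,vers=3.0,credentials=/etc/smbcredentials/%s.cred,dir_mode=0777,file_mode=0777,serverino\n' \
        "$account" "$share" "$mountpoint" "$account"
}

# Hypothetical helper: append the line to the given fstab file only if
# the mount point is not already listed there as a cifs mount.
add_fstab_entry() {
    local fstab="$1" account="$2" share="$3" mountpoint="$4"
    if grep -qsF " $mountpoint cifs " "$fstab"; then
        echo "already present"
    else
        azure_fstab_line "$account" "$share" "$mountpoint" >> "$fstab"
        echo "added"
    fi
}

# Example (writing to /etc/fstab itself requires root):
# add_fstab_entry /etc/fstab utkumaviya4 aksshare /mnt/aksshare
```

Passing the fstab path as an argument lets you try the helper against a scratch file before changing the real one.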

 

Apply cas.AZUREAUTHCACHELOC= parameter

 

Once the shared network drive or File Share is mounted on the CAS servers, you can configure the AZUREAUTHCACHELOC parameter. Set "cas.AZUREAUTHCACHELOC=" by adding an entry to the "casconfig_usermod.lua" file. You can also use vars.yml to assign the CAS system variable as a deployment step. The following example shows the AZUREAUTHCACHELOC= assignment with an Azure File Share mounted on each CAS node.

 

An example entry from casconfig_usermod.lua:

 

cas.AZUREAUTHCACHELOC="/mnt/aksshare/AzureAccessKeys/cas/"
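If you script this change, it is worth making it idempotent so that repeated runs leave exactly one assignment in the file. The sketch below assumes a hypothetical helper named set_azure_auth_cache_loc that takes the path to your casconfig_usermod.lua as its first argument. The location of that file depends on your deployment and is not stated in this post.

```shell
# Hypothetical helper: set cas.AZUREAUTHCACHELOC in a casconfig_usermod.lua
# file. Replaces an existing uncommented assignment, otherwise appends one.
# Assumes the cache directory path contains no '|' or '&' characters,
# because both are special in the sed expression below.
set_azure_auth_cache_loc() {
    local lua_file="$1" cache_dir="$2"
    local entry="cas.AZUREAUTHCACHELOC=\"$cache_dir\""
    if grep -qs '^cas\.AZUREAUTHCACHELOC' "$lua_file"; then
        # Rewrite through a temporary file rather than editing in place.
        sed "s|^cas\.AZUREAUTHCACHELOC.*|$entry|" "$lua_file" > "$lua_file.tmp" &&
            mv "$lua_file.tmp" "$lua_file"
    else
        printf '%s\n' "$entry" >> "$lua_file"
    fi
}

# Example (the first argument is a placeholder for your own file path):
# set_azure_auth_cache_loc ./casconfig_usermod.lua "/mnt/aksshare/AzureAccessKeys/cas/"
```

The change only takes effect after the CAS server is restarted.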

 

 

The CAS server after the AZUREAUTHCACHELOC parameter is changed and the server is restarted:

 

uk_ADLS_CASLIB_and_AzureAccess_Key_4.png

 

With the AZUREAUTHCACHELOC parameter set to a central location, access to the ADLS CASLIB reads the Azure Access Key file from that location. The data load for the ADLS2 CSV data file no longer fails, because each CAS worker node finds the access key file at the central location.

 

/* Azure storage account, file system, tenant ID, and application ID (values masked) */
%let MYSTRGACC="utkuma5adls2strg";
%let MYSTRGFS="fsutkuma5adls2strg";
%let MYTNTID="b1c14d5c-XXXXXXXX-xxxxxxxxxxxx" ;
%let MYAPPID="a2e7cfdc-XXXXXXXXXXXX-xxxxxxxxxxxx";


CAS mySession  SESSOPTS=(CASLIB=casuser TIMEOUT=99 LOCALE="en_US" metrics=true);

caslib ADLS2 datasource=(
      srctype="adls"
      accountname=&MYSTRGACC
      filesystem=&MYSTRGFS
      dnsSuffix=dfs.core.windows.net
      timeout=50000
      tenantid=&MYTNTID
      applicationId=&MYAPPID
   )
   path="/sample_data"
   subdirs;

proc casutil incaslib="ADLS2";
   list files ;
run;
quit;

/* CAS load from ADLS2 storage data file */
proc casutil  incaslib="ADLS2"  outcaslib="ADLS2";
  load casdata="sas_orsales.csv" casout="sas_orsales_CSV" replace ;
  list tables ;
run;
quit;

cas mysession terminate;

 

 

The Azure Access Key file (.json file) created on the Azure File Share:

 

uk_ADLS_CASLIB_and_AzureAccess_Key_5.png

 

 

The Azure Access Key File.

 

[viyadep@cas02 etc]$ cat /mnt/aksshare/AzureAccessKeys/cas/.sasadls_10002.json
{"refresh_token":"0.AAAAXEXXXXXXXXXXXXXXXXXXXX-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
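Because the file holds a refresh token, you may prefer to confirm its contents without echoing the token to a terminal or log. The following is a small sketch of that check. has_refresh_token is a hypothetical helper, and the only assumption about the file format is the "refresh_token" key visible in the output above.

```shell
# Hypothetical helper: report whether a key file contains a
# "refresh_token" entry, without printing the token value.
has_refresh_token() {
    local key_file="$1"
    [ -r "$key_file" ] || { echo "unreadable: $key_file"; return 1; }
    if grep -q '"refresh_token"[[:space:]]*:' "$key_file"; then
        echo "refresh_token present"
    else
        echo "refresh_token missing"
        return 1
    fi
}

# Example, using the file shown above:
# has_refresh_token /mnt/aksshare/AzureAccessKeys/cas/.sasadls_10002.json
```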

 

Summary:

  • The Azure Access Key file (.json file) can be generated once, saved at a central location, and shared with the CAS nodes through the cas.AZUREAUTHCACHELOC parameter.
  • When the CAS environment/pod/container is recycled or rebuilt, the existing Azure Access Key at the central location can be reused.
  • The Azure Access Key file contains a refresh_token entry, which is used to refresh the access key when it expires after a certain time.
  • The Azure Access Key generated on the CAS server is not tied to the physical server hardware. It is tied to the Azure user ID, Azure application ID, and Azure resources.
  • The Azure Access Key remains valid as long as the password of the Azure user who generated the key has not changed or expired.

Stay tuned for the second part, which covers managing the Azure Access Key in Viya 4.

 

Important Links:

cas.AZUREAUTHCACHELOC

AZUREAUTHCACHELOC System Option

Azure Data Lake Storage Data Source    

