Do not try and bend the spoon, that's impossible. Instead, only try to realize the truth... there is no spoon. Then you'll see that it is not the spoon that bends, it is only yourself. ―Spoon Boy to Neo
When you import a local file from your PC into CAS, do you sometimes feel you are being asked to bend the spoon just by looking at it? Here are six simple ways to import a local file into SAS Cloud Analytic Services (CAS) and make it available in SAS® Viya™.
Simply put: no data, no analytics. It all starts with importing your files into CAS, and many files still sit on someone's computer.
We will focus on a client-side load: importing a file stored on your PC and making it available in CAS. What is a client? A browser, a “SAS client” (SPRE or SAS 9.4M5), or your local storage.
The focus is not a server-side load, where the file is already on the CAS controller or in a location available to the CAS controller or workers (mounted drive, Hadoop distribution, DNFS, etc.). The next question is: how?
Six ways to import a file into CAS (and at least three ways to upload it): the focus of this post is on the cases marked in green. In the next posts, you will learn about file import with Python and how to import local SAS data sets.
One bottleneck is the time it takes to upload the file to the SAS platform, which depends on where you are, where the SAS servers are located, your internet speed, etc.
A simple test shows PROC CAS and PROC CASUTIL performing much faster than PROC IMPORT (in my environment). The first two run in CAS, the last one in SPRE.
Assume you need to import a single file, prdsale.csv, into the table prdsale in the casuser caslib.
Visual Interface = SAS Data Explorer, SAS Environment Manager (Data tab), SAS Visual Analytics (Add data). There might be more interfaces. If I forgot one, please leave me a comment.
Pluses (+): easiest, most convenient and ubiquitous in Viya.
Attention points:
SAS Data Explorer example:
Pick the file from your PC.
...and off it goes into your CASLIB.
As an alternative: Firstly, load your files in the SAS Drive folders.
Secondly, import with (your favorite) Visual Interface = SAS Data Explorer, SAS Environment Manager (Data), SAS Visual Analytics (Add data).
For example: use SAS Environment Manager: (file stored in SAS Content e.g. SAS Content / Users / <user> / My Folder)
Attention points:
This is a two-step process:
Attention point: There is a 100 MB file upload limit.
Option 1: reuse the file uploaded in SAS Content / Users / <user> / My Folder
Option 2: upload to intviya01 > Home
You can upload the file somewhere else, for example, on the SAS Viya Service Layer Server in intviya01 > Home
Option 3: file transfer (FTP):
The file is stored on the SAS Viya Service Layer Server in /home/user/. This is the same location as intviya01 > Home in SAS Studio V.
Pluses (+): You can upload files bigger than CAS Management service’s maxFileUploadSize.
Attention points:
Q: Which upload method is slower:
A: Note that, if the file is sitting on your local hard drive, it takes no longer to upload the file via the web UIs than it does to copy the file to somewhere that SAS can see it and then run SAS code that loads the data into SAS and sends it across the wire to CAS. In fact, it’s probably faster in most cases.*
Q: Can you import only 4 GB through the browser? I read that the CAS Management service option maxFileUploadSize is set to 4 GB by default.
A: If the customer has frequent need to upload files larger than 4GB, changing the CAS Management service maxFileUploadSize is a good option. I’ve uploaded files as large as 20GB through the service and see no reason why you couldn’t go much larger.*
Q: Can you use all the available import options?
A: No, certain import options are not available (see Summary).
*reviewed 2020 Feb 14. Thanks to David H.
When should you code instead of using the interface?
Remember: when you code, drop your target CAS table first, or specify replace or append in the load options.
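As a minimal sketch of the second choice, the PROC CASUTIL load below uses the replace option instead of a separate droptable step. It assumes the macro variables csvdata and gateuserid from the Code Wrapper at the end of this post. It also leaves out promote, so the resulting table is session-scoped; treat it as an illustration rather than a drop-in replacement for the examples that follow.

```sas
/* Sketch: reload the CSV file without dropping the target table first.   */
/* Assumes &csvdata and &gateuserid are defined as in the Code Wrapper.   */
proc casutil;
   load file="&csvdata"
        outcaslib="casuser"
        casout="&gateuserid._CSV_prdsale"
        importoptions=(filetype="csv" getnames=true)
        replace;   /* overwrite an existing session table of the same name */
quit;
```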
Pluses (+):
* Drop in-memory CAS table;
proc casutil;
   droptable casdata="&gateuserid._CSV_prdsale" incaslib="casuser" quiet;
quit;
* table.upload cas action;
proc cas;
   upload result=r status=rc /
      path="&csvdata"
      casOut={
         caslib="casuser",
         label="CSV file from client",
         lifetime=999,
         name="&gateuserid._CSV_prdsale",
         promote=TRUE,
         replication=0
      }
      importOptions={
         fileType="CSV",
         vars={{name="ACTUAL", format="DOLLAR8.2"}}
      }
   ;
quit;
For macro definitions, see the Code Wrapper at the end. The code imports the same file, but from /home/user/, which is the same location as intviya01 > Home in SAS Studio V.
The log shows that the file is uploaded as "binary data" and handled by CAS:
NOTE: The table SBXBOT_CSV_PRDSALE has been created in caslib CASUSER(sbxbot)
from binary data uploaded to Cloud Analytic Services.
NOTE: Action 'table.upload' used (Total process time):
NOTE: real time 0.102572 seconds
NOTE: cpu time 0.117076 seconds (114.14%)
NOTE: total nodes 5 (20 cores)
NOTE: total memory 156.32G
NOTE: memory 36.41M (0.02%)
NOTE: PROCEDURE CAS used (Total process time):
real time 0.11 seconds
cpu time 0.00 seconds
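To confirm that the import options took effect, you could inspect the uploaded table. The sketch below assumes the table name and caslib used in the upload example above, and the gateuserid macro variable from the Code Wrapper. It uses the table.columnInfo action to list the columns (where the DOLLAR8.2 format on ACTUAL should appear) and table.tableInfo to show row and column counts.

```sas
/* Sketch: check the result of the upload.                          */
/* Assumes the table created by the upload example still exists.    */
proc cas;
   /* Column names, types and formats */
   table.columnInfo /
      table={caslib="casuser", name="&gateuserid._CSV_prdsale"};
   /* Row count, column count and other table-level details */
   table.tableInfo /
      caslib="casuser", name="&gateuserid._CSV_prdsale";
quit;
```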
Pluses (+):
* Drop in-memory CAS table;
proc casutil;
   droptable casdata="&gateuserid._CSV_prdsale" incaslib="casuser" quiet;
quit;
* Proc casutil - file load .csv from client machine to CAS library;
proc casutil;
   load file="&csvdata"
        outcaslib='casuser' casout="&gateuserid._CSV_prdsale" copies=0 promote;
quit;
The Log:
NOTE: The table SBXBOT_CSV_PRDSALE has been created in caslib CASUSER(sbxbot) from binary data uploaded to Cloud Analytic Services.
NOTE: Action 'table.upload' used (Total process time):
NOTE: real time 0.104739 seconds
NOTE: cpu time 0.120821 seconds (115.35%)
NOTE: total nodes 5 (20 cores)
NOTE: total memory 156.32G
NOTE: memory 36.69M (0.02%)
One of my colleagues used to say: "there is a CAS Action for everything". Apparently, there is a SAS Studio task for everything as well...
Pluses (+): easy, generates the code for you.
Attention point:
New > Import data. Choose the file location from SAS Drive: SAS Content / Users / <user> / My Folder
The task generates PROC IMPORT / PROC CONTENTS code that runs in SPRE (the SAS engine in Viya).
And the log proves it:
NOTE: The INFILE statement is not supported with DATA step in Cloud Analytic Services.
NOTE: The INPUT statement is not supported with DATA step in Cloud Analytic Services.
NOTE: Could not execute DATA step code in Cloud Analytic Services. Running DATA step in the SAS client.
...
NOTE: The data set CASUSER.SBXBOT_CSV_PRDSALE has 1440 observations and 10 variables.
NOTE: DATA statement used (Total process time):
real time 0.34 seconds
cpu time 0.10 seconds
You just saw an example above. For your convenience, the code:
FILENAME REFFILE FILESRVC FOLDERPATH='/Users/sbxbot/My Folder' FILENAME='prdsale.csv';
PROC IMPORT DATAFILE=REFFILE
DBMS=CSV
OUT=CASUSER.SBXBOT_CSV_PRDSALE;
GETNAMES=YES;
RUN;
PROC CONTENTS DATA=CASUSER.SBXBOT_CSV_PRDSALE; RUN;
You have read about six ways to load your local files into CAS (and at least three ways to upload them). Use the following rules of thumb:
1-2 Convenience and some options: use the visual interfaces
3 Speed and many options: use PROC CAS upload action
4 Speed and some options: use PROC CASUTIL
5-6 Less speed, medium options: use PROC IMPORT or the SAS Studio V task.
In a next post, you will learn more about:
Stay tuned for more stories. And please comment, share and help others.
I would recommend the following resources:
Stephen Foerster, Mary Kathryn Queen, Nicolas Robert, Uttam Kumar, David H.
The following code simulates a file being uploaded in /home/&gateuserid/
* Wrapper code;
CAS mySession SESSOPTS=( CASLIB="casuser" TIMEOUT=999 LOCALE="en_US" metrics=true);
%let gateuserid=&sysuserid ;
%put My Userid is: &gateuserid ;
options msglevel=i ;
caslib _all_ assign;
* Define the files loaded;
%let csvdata=/home/&gateuserid/prdsale.csv;
%let dsdata=/home/&gateuserid/prdsale.sas7bdat;
%let folder=/home/&gateuserid./;
* Upload the csv files to the /home/&gateuserid. folder;
* or simulate the upload: save a csv file under the home directory of the user;
proc export data=sashelp.prdsale
outfile="&csvdata" REPLACE dbms=dlm;
putnames=yes; delimiter=',';
run;
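* Save a SAS data set copy of prdsale in the same folder;
* Assumption: the DMS libref is not defined anywhere in this post, so sashelp is used as the source;
libname indata "&folder";
proc copy in=sashelp out=indata memtype=data;
select prdsale;
run;
proc contents data=indata.prdsale;
run;
libname indata clear;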
*The code from each proc goes in here;
* list files and in-memory tables;
proc casutil incaslib="casuser" ;
list files; list tables;
quit;
* When you are finished, clean up;
* Drop in-memory CAS table;
proc casutil ;
droptable casdata="&gateuserid._CSV_prdsale" incaslib="casuser" quiet;
droptable casdata="&gateuserid._DATA_prdsale" incaslib="casuser" quiet;
quit ;
CAS mySession TERMINATE;
Thank you for your time reading this post. Please comment and share your experience with the Local File Import in CAS and help others.