Hello folks, my organization uses the SAS/CONNECT procedures (PROC DOWNLOAD and PROC UPLOAD) to move a dataset from one server to another. I have been told that it takes ages and was asked to see whether there are better ways to optimize the process. Any pointers or directions would be of great help.
And of course, many thanks in advance.
You could probably just FTP (or SFTP) the data set from one server to another. If your FTP/SFTP/FTPS/SSH method uses compression, it will probably be faster than using SAS. You might get better results if you gzip before sending (or maybe not - you might lose more time zipping and unzipping than you save by sending fewer data packets).
It might be possible to do it entirely in SAS using the ZIP and sockets engines. That would make an interesting project, but I wouldn't want to have to support it for the next decade.
Two other methods can also be considered apart from FTP/SFTP:
1. Rsync (Remote Sync): A commonly used Unix/Linux utility for copying and synchronizing files and folders. It can use less bandwidth because it sends only the parts of a file that have changed, and it can compress the data on the sending end and decompress it on the receiving end.
2. NDM (Network Data Mover): Needs a licence. Good for large datasets and has several advantages such as encryption, compression, and a CLI.
I would suggest you calibrate the transfer speed by just doing a simple transfer of a largish data volume between the two servers, using a simple option like FTP. If it takes about the same time as your SAS process, you just have a lot of data and/or slow communication speeds.
If the FTP is MUCH faster than SAS, there's something wrong in how your environment is set up. Some digging might result in a large improvement in your SAS environment, maybe carrying over to other users/applications.
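To get the SAS side of that comparison, one option is to time a single PROC DOWNLOAD from the local session. This is only a rough sketch: the session name `myhost` and the table `srclib.big_table` are hypothetical placeholders, and it assumes you are already signed on to `myhost` and that RSUBMIT runs synchronously (the default).

```sas
/* Hypothetical names: myhost (an already signed-on session), srclib.big_table */
%let t0 = %sysfunc(datetime());            /* local start time, in seconds */

rsubmit myhost;
  /* pull the table from the remote session into local WORK */
  proc download data=srclib.big_table out=work.big_table;
  run;
endrsubmit;

/* runs locally after the synchronous RSUBMIT block has finished */
%let elapsed = %sysevalf(%sysfunc(datetime()) - &t0);
%put NOTE: PROC DOWNLOAD took about &elapsed seconds.;
```

Dividing the file size by the elapsed time gives a throughput figure that you can set next to the one from the plain FTP test.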
Tom
Really depends on the topology of your network. If the two servers live near each other perhaps you can just have them both access the files from the same storage device and totally eliminate the need to move anything.
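If shared storage is an option, the copy could run entirely on one server with no SAS/CONNECT transfer at all. A minimal sketch, assuming both paths are reachable from the server that runs the code; the librefs `src` and `shared`, both paths, and the table name `big_table` are made up for illustration.

```sas
/* Hypothetical paths and names; both must be visible to this SAS session */
libname src    "/data/source_area";
libname shared "/mnt/shared_storage/target_area";

/* copy one table straight into the shared location */
proc copy in=src out=shared memtype=data;
  select big_table;
run;
```

The other server then just assigns its own libref to the same shared location and reads the table in place.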
Thank you all for your responses. I will ask my boss to take a look at this thread and give his thoughts.
So here is the process they (we) have been working with. Does the below make sense to you?
signoff bpthost;
signoff sashost;
options compress=yes obs = max validvarname = any;
%let sashost=blahblahblah;
%let usr = %str(xxxx);
%let pwd = %str(xxxx);
signon sashost username="&usr" password="&pwd";
rsubmit sashost;
options compress=yes obs = max validvarname = any;
LIBNAME TDRRO META LIBRARY="SPDS - TDR ReadOnly" METAOUT=DATA;
endrsubmit;
options symbolgen;
rsubmit sashost;
proc download data = TDRRO.tdr_database; run;
endrsubmit;
signoff sashost;
/*** data period is now in your local work folder ***/
%let usr = %str(xxxx);
%let pwd = %str(xxxx);
/*** signon to the bpt server ***/
options comamid=tcp;
filename RLINK "\\wapprib00001040\sas_install\SAS_Config\tcpwin.scr";
%let APPSVR=blahblahblah1;
%let bpthost=blahblahblah2;
/*%let riskhost=blahblahblah3;*/
%let NOOVLPIO=YES;
/* SIGNON TO BPT SERVER */
options FORMCHAR='|----|+|---+=|-/\<>*' ;
options remote=bpthost ps=60 msglevel=i errors=1 obs=max compress=no replace=yes no$syntaxcheck nosymbolgen validvarname=any;
signon bpthost username="&usr." password="&pwd.";
libname bptwork server=bpthost slibref=work;
rsubmit bpthost;
proc upload data = tdr_database; run;
libname tdrrofl "\\corp\sites\RIB1001\HLSCreditRisk\SAShostData\tdrro";
data tdrrofl.tdr_database;
set work.tdr_database;
run;
proc datasets lib=work noprint;
delete
tdr_database
;
quit;
run;
endrsubmit;
Thank you, Sir @Tom. Kindly bear with me and my silly questions, as I have never personally used SAS/CONNECT and I am not clear on how it works. So you are right. With that common-sense approach, can the code be as simple as
1. TDRRO to WORK (PROC DOWNLOAD)
2. WORK to TDRROFL (PROC UPLOAD)
?
rsubmit sashost;
options compress=yes ;
proc download data = TDRRO.tdr_database; run;
endrsubmit;
rsubmit bpthost;
libname tdrrofl "\\corp\sites\RIB1001\HLSCreditRisk\SAShostData\tdrro";
proc upload data = tdr_database out=tdrrofl.tdr_database; run;
proc datasets lib=work noprint;
delete
tdr_database
;
quit;
run;
endrsubmit;
You may ask, "Why don't I test it?" I do not have all my access set up yet, as I am new to the team, so in the meantime I am just reviewing the existing code.
Again, it is easier to explain without the details of your case (we don't know your data/hosts/etc.).
So you are running on a node, let's call it MASTER, as that is where you are actually submitting the code.
You want to connect to another SAS server that has the data to be moved. Let's call that node SOURCE.
You want to move the data to a third SAS server. Let's call that node TARGET.
On SOURCE the data lives in a library we are accessing with a libref named FROM. On TARGET the data is written into a library we are accessing with a libref named TO.
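signon source;
rsubmit source;
  /* everything in this block runs on SOURCE */
  libname from ..... ;
  signon target;
  rsubmit target;
    /* everything in this block runs on TARGET */
    libname to .... ;
    /* DATA= names the table in the SOURCE session, OUT= the table on TARGET */
    proc upload data=FROM.table1 out=TO.table1;
    run;
  endrsubmit;
endrsubmit;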
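The example above leaves both sessions open. A small sketch of the cleanup, using the same SOURCE and TARGET session names: the TARGET session was opened from SOURCE, so it has to be closed from there before MASTER signs off from SOURCE.

```sas
rsubmit source;
  /* TARGET was signed on from SOURCE, so sign it off from SOURCE */
  signoff target;
endrsubmit;

/* then close the MASTER-to-SOURCE session */
signoff source;
```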
Looks slick and neat. Let me get my access, test, and come back here. Of course, with a smile, I believe you nailed it. However, I will keep the thread open until Mon/Tue before marking the solution as answered, just to allow time in case some clarifications are needed. Thanks a tonne!