Hi everyone,
I hope you're all doing well.
I wanted to discuss a recent suggestion from my boss regarding a potential change in our workflow. Currently, we typically execute SAS code on the server and save all related files, including code, datasets, and outputs, directly on that server.
However, the new proposal involves saving all of these files on an FTP server while still running the SAS code on our existing server. Despite consulting with both our SAS after-sale engineer and the IT department, we haven't received a definitive answer on whether this approach would be effective.
Therefore, I'm reaching out to the community for any insights or experiences you may have had with a similar setup. Specifically, I'm interested in understanding if running SAS code on the server while storing related files on an FTP server is a viable approach.
Your input on this matter would be greatly appreciated.
Thank you,
***************************************************************************
1. The connection between the FTP server and the SAS server is OK; I can use FileZilla to upload and download files.
2. Can I build the libraries directly using libname xx "FTP path ....."? I am not sure.
I'm not sure this would work. What goes in the quotes should simply be a path. If you need to store the data on a different machine I think you'd need to look at using a shared drive (Windows) or a Network Filesystem (UNIX/Linux). In a mixed environment, the two OS types can be persuaded to use each other's drive share protocols.
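For example, once the remote storage is mounted on the SAS server's OS (the paths below are hypothetical), a library can be assigned the usual way:

```sas
/* Hypothetical example: /mnt/shared is an NFS mount (or a mapped
   Windows share) that the SAS server's OS already sees as a local path */
libname remdata "/mnt/shared/sasdata";

data remdata.test;    /* writes test.sas7bdat to the mounted storage */
  x = 1;
run;
```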
I'm very open to being corrected...!
Hi @blueskyxyz
Not sure how technical your manager is, because his/her suggestion of separating the data files from the processing server goes against the technical architecture designs of the past two decades, and even future designs!
Here is how:
- Back in the early 2000s, there was a rise of Data Warehouse Appliances (Netezza, Exadata, Greenplum, Teradata): these servers provided disks, CPUs, and RAM all in one box to minimize data movement across the network and get the data closer to the processing.
- In the 2010s, Hadoop/Spark and HDFS were on the rise, promoting cheaper distributed computing and distributed data replication to ensure the compute nodes always have local access to a portion of the data.
- AWS has introduced Amazon S3 Express One Zone, which is a high-performance, single-zone Amazon S3 storage class that is purpose-built to deliver consistent, single-digit millisecond data access for your most latency-sensitive applications.
All these design/architecture trends were/are trying to get the data closer to the compute, not away from it!
Just my two cents,
Ahmed
If it were even possible to use a SAS library on top of FTP, the performance would be like watching paint dry. On a very cold, wet day.
And we have had multiple discussions here where libraries defined on network shares also had a negative impact on performance.
You can use FTP to import data from a remote server, and create files there (e.g. reports through ODS).
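As a sketch of that approach, SAS's FILENAME FTP access method can read a remote file directly into a WORK dataset (the host, credentials, and file names below are placeholders):

```sas
/* Sketch using the FILENAME FTP access method; host, user, password,
   and file names are hypothetical */
filename src ftp 'sales.csv'
         host='ftp.example.com'
         cd='/exports'
         user='myuser' pass='mypass';

data work.sales;          /* read the remote CSV over FTP */
  infile src dsd firstobs=2;
  input region :$20. amount;
run;
```

Note this transfers the file on each read; the SAS library itself still lives on storage the SAS server can address as a file system.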
What business or operational benefit do you hope to gain by copying your SAS content to a different server? This proposal adds a layer of complexity and another point of failure to your SAS processing. In any case there are better ways to handle data movement between servers than FTP.
Hello @blueskyxyz
"the new proposal involves saving all of these files on an FTP server while still running the SAS code on our existing server" is not a practical solution.
An appropriate approach would be to mount the remote location onto your SAS server and save files there.
As a reminder, the temp/WORK folder needs to be on local storage, and accessing a remote network location for reads/writes will add latency and slowness.
SAS has an S3 FILENAME engine and a PROC S3 procedure for interacting with S3, so there is no need to use a third-party client-side FTP tool.
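As a minimal sketch (bucket names, paths, and credentials below are placeholders), PROC S3 can copy objects between S3 and the SAS server:

```sas
/* Sketch only: region, keys, bucket, and file paths are hypothetical */
proc s3 region="useast"
        keyid="MYACCESSKEY"
        secret="MYSECRETKEY";
  get "/my-bucket/input.csv" "/sasdata/input.csv";    /* download */
  put "/sasdata/report.pdf" "/my-bucket/report.pdf";  /* upload   */
run;
```

In practice you would keep the credentials in a config file rather than in the program.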
The first questions I'd be asking:
1. What business problem are you trying to solve?
2. What benefit should a solution provide (e.g., what's the desired cost reduction)?
3. What's your company's storage strategy?
4. What does your current SAS environment look like?
Given you're already in contact with SAS, I believe that's where you should continue the discussion and get guidance.
Hello @blueskyxyz
Please have a look at the following. They are relevant to your question.
The first one is an interesting forum discussion.
The last one shows the use of the AWS CLI for data transfer.
SAS and AWS S3 - Page 2 - SAS Support Communities
57091 - File transfers using the Amazon Simple Storage Service (Amazon S3) (sas.com)
Command Line Interface - AWS CLI - AWS (amazon.com)
Hi Guys, thanks for all your replies.🌻
Here is a post that shows how to download data from AWS S3:
The SAS Users Group for Administrators (SUGA) is open to all SAS administrators and architects who install, update, manage or maintain a SAS deployment.