Hi everyone,
I hope you're all doing well.
I wanted to discuss a recent suggestion from my boss regarding a potential change in our workflow. Currently, we typically execute SAS code on the server and save all related files, including code, datasets, and outputs, directly on that server.
However, the new proposal involves saving all of these files on an FTP server while still running the SAS code on our existing server. Despite consulting with both our SAS after-sale engineer and the IT department, we haven't received a definitive answer on whether this approach would be effective.
Therefore, I'm reaching out to the community for any insights or experiences you may have had with a similar setup. Specifically, I'm interested in understanding if running SAS code on the server while storing related files on an FTP server is a viable approach.
Your input on this matter would be greatly appreciated.
Thank you,
***************************************************************************
1. The connection between the FTP server and the SAS server is OK; I can use FileZilla to upload and download files.
2. Can I build the libraries directly using libname xx "FTP path ....."? I am not sure.
I'm not sure this would work. What goes in the quotes should simply be a path. If you need to store the data on a different machine I think you'd need to look at using a shared drive (Windows) or a Network Filesystem (UNIX/Linux). In a mixed environment, the two OS types can be persuaded to use each other's drive share protocols.
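For example, once the remote storage is mounted on the SAS server's OS (the paths below are hypothetical), a library can be assigned the usual way:

```sas
/* Hypothetical example: /mnt/shared is an NFS mount (or a mapped
   Windows share) that the SAS server's OS already sees as a local path */
libname remdata "/mnt/shared/sasdata";

data remdata.test;    /* writes test.sas7bdat to the mounted storage */
  x = 1;
run;
```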
I'm very open to being corrected...!
Hi @blueskyxyz
Not sure how technical your manager is, because his/her suggestion of separating the data files from the processing server goes against the technical architecture designs of the past two decades, and even future designs!
Here is how:
- Back in the early 2000s, there was a rise of Data Warehouse Appliances (Netezza, Exadata, Greenplum, Teradata): these servers provided disks, CPUs, and RAM all in one box to minimize data movement across the network and get the data closer to the processing.
- In the 2010s, Hadoop/Spark and HDFS were on the rise, promoting cheaper distributed computing and distributed data replication to ensure the compute nodes always have local access to a portion of the data.
- AWS has introduced Amazon S3 Express One Zone, which is a high-performance, single-zone Amazon S3 storage class that is purpose-built to deliver consistent, single-digit millisecond data access for your most latency-sensitive applications.
All these design/architecture trends were/are trying to get the data closer to the compute, not away from it!
Just my two cents,
Ahmed
If it were even possible to use a SAS library on top of FTP, the performance would be like watching paint dry. On a very cold, wet day.
And we have had multiple discussions here where libraries defined on network shares also had a negative impact on performance.
You can use FTP to import data from a remote server, and create files there (e.g. reports through ODS).
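As a sketch of that approach, SAS's FILENAME FTP access method can read a remote file directly into a WORK dataset (the host, credentials, and file names below are placeholders):

```sas
/* Sketch using the FILENAME FTP access method; host, user, password,
   and file names are hypothetical */
filename src ftp 'sales.csv'
         host='ftp.example.com'
         cd='/exports'
         user='myuser' pass='mypass';

data work.sales;          /* read the remote CSV over FTP */
  infile src dsd firstobs=2;
  input region :$20. amount;
run;
```

Note this transfers the file on each read; the SAS library itself still lives on storage the SAS server can address as a file system.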
What business or operational benefit do you hope to gain by copying your SAS content to a different server? This proposal adds a layer of complexity and another point of failure to your SAS processing. In any case there are better ways to handle data movement between servers than FTP.
Hello @blueskyxyz
"the new proposal involves saving all of these files on an FTP server while still running the SAS code on our existing server" is not a practical solution.
An appropriate approach would be to mount the remote location onto your SAS server and save files there.
As a reminder, the temp/WORK folder needs to be on local storage, and accessing a remote network location for reads/writes will add latency and slowness.
SAS has an S3 FILENAME engine and a PROC S3 procedure for interacting with S3, so there is no need to use a third-party client-side FTP tool.
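As a minimal sketch (bucket names, paths, and credentials below are placeholders), PROC S3 can copy objects between S3 and the SAS server:

```sas
/* Sketch only: region, keys, bucket, and file paths are hypothetical */
proc s3 region="useast"
        keyid="MYACCESSKEY"
        secret="MYSECRETKEY";
  get "/my-bucket/input.csv" "/sasdata/input.csv";    /* download */
  put "/sasdata/report.pdf" "/my-bucket/report.pdf";  /* upload   */
run;
```

In practice you would keep the credentials in a config file rather than in the program.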
The first questions I'd be asking:
1. What business problem are you trying to solve?
2. What benefit should a solution provide (e.g., what's the desired cost reduction)?
3. What's your company's storage strategy?
4. What does your current SAS environment look like?
Given you're already in contact with SAS, I believe that's where you should continue the discussion and get guidance.
Hello @blueskyxyz
Please have a look at the following. They are relevant to your question.
The first one is an interesting forum discussion.
The last one shows the use of the AWS CLI for data transfer.
SAS and AWS S3 - Page 2 - SAS Support Communities
57091 - File transfers using the Amazon Simple Storage Service (Amazon S3) (sas.com)
Command Line Interface - AWS CLI - AWS (amazon.com)
Hi Guys, thanks for all your replies.🌻
Here is a post that shows how to download data from AWS S3:
The SAS Users Group for Administrators (SUGA) is open to all SAS administrators and architects who install, update, manage or maintain a SAS deployment.