Hi,
Is SPDS legacy? If we have a GRID environment, is it silly to opt for SPDS?
@nayakig wrote:
Can you please explain the line below? Is it storage that we can save data in? I thought it was a component.
"there are sites that manages hugs amount of data in SPDS successfully."
It is a SAS component that deals primarily with storage:
Scalable Performance Data Server
It is usually set up with its own pool of physically separate disks to enhance parallel I/O. In times of heavily virtualized SAN storage, it has lost much of its usefulness, at least IMO.
I assume that you got confused by my "hugs" typo, which should read "huge"...
Hi,
Comparing a GRID environment with SPDS is like comparing apples and oranges! You'll be much better off with both working together 🙂
- GRID environment (be it SAS or Hadoop) typically provides
  - A shared, centrally managed analytic computing environment
  - High availability
  - Workload management to optimally process multiple applications and workloads to maximize overall throughput
  - Flexibility to incrementally grow the computing infrastructure as the number of users and the size of data increase over time
  - The ability to do rolling maintenance and upgrades without any disruption to the user community
- Scalable Performance Data Server (SPD Server)
  - A client/server, multi-user data server designed to optimize storage and to speed the processing of large SAS data sets
  - Parallelizes many of the SAS I/O functions such as WHERE processing and INDEX creation over multiple data partitions
  - Extends parallel capabilities to include GROUP BY processing and SQL pass-through
  - Requires an SMP machine and is designed to use all resources available on the machine to achieve maximum scalability
  - The maximum benefit with SPD Server is gained when it is run on a machine with:
    - multiple CPUs
    - multiple I/O channels
    - multiple disks
    - a large amount of data to be partitioned
  - Provides a high-performance data store for very large SAS data sets. Therefore, it is particularly suited as part of a data warehousing solution where the SAS system is being used to construct, manage and analyze enterprise-wide datamarts.
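To make the idea of WHERE processing over multiple data partitions more concrete, here is a minimal conceptual sketch in Python. It is not how SPD Server is implemented and does not use any SAS API; the function names (`split_into_partitions`, `scan_partition`, `parallel_where`) and the sample data are hypothetical. It only shows the general shape: the table is split into partitions, each partition is scanned independently, and the matching rows are merged.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, n_partitions):
    """Distribute rows round-robin into n_partitions lists.

    Hypothetical stand-in for data partitions spread over separate disks.
    """
    return [rows[i::n_partitions] for i in range(n_partitions)]


def scan_partition(partition, predicate):
    """Apply the WHERE-style predicate to a single partition."""
    return [row for row in partition if predicate(row)]


def parallel_where(partitions, predicate, max_workers=4):
    """Scan every partition in its own worker and merge the matches."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_partition = list(
            pool.map(lambda part: scan_partition(part, predicate), partitions)
        )
    return [row for part in per_partition for row in part]


# Illustrative data only: 1000 rows, split over 4 partitions.
rows = [{"id": i, "amount": i * 10} for i in range(1000)]
partitions = split_into_partitions(rows, 4)

# Equivalent in spirit to: WHERE amount >= 9900
matches = parallel_where(partitions, lambda r: r["amount"] >= 9900)
matched_ids = sorted(r["id"] for r in matches)
```

Python threads are used here only to show the structure; the real gain described above comes from multiple CPUs, I/O channels and disks working on separate partitions at the same time.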
So to conclude:
Grid --> More Compute, Memory and local storage resources
While
SPDS --> Better way to handle and process LARGE SAS Tables stored on file systems
Hope this clarifies your understanding and perceptions of these two technologies and how they can complement rather than replace each other.
Ahmed
Hi @nayakig,
I look at SAS data set sizes in terms of file size (GB) rather than record count (thousands or millions).
I would consider a big data set to be anything larger than 7 GB, or in some cases larger than 5 GB. It all depends on your storage and network throughput.
If you have a very wide data set with long textual columns, it does not need millions of records to accumulate 5+ GB in storage, if you see what I mean.
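To put rough numbers on this, here is a back-of-the-envelope sketch in Python. The column counts and widths are made-up assumptions for illustration, and the estimate ignores compression as well as page and header overhead; it relies only on the fact that uncompressed SAS character columns are stored at their full declared length.

```python
def estimated_size_gb(rows: int, row_bytes: int) -> float:
    """Rough uncompressed size in GB (10**9 bytes), ignoring file overhead."""
    return rows * row_bytes / 1e9


# Hypothetical wide table: 20 character columns of 500 bytes each
# plus 10 numeric columns of 8 bytes each.
wide_row_bytes = 20 * 500 + 10 * 8   # 10,080 bytes per record

# Hypothetical narrow table: 10 numeric columns of 8 bytes each.
narrow_row_bytes = 10 * 8            # 80 bytes per record

record_count = 500_000               # half a million records in both cases

wide_gb = estimated_size_gb(record_count, wide_row_bytes)
narrow_gb = estimated_size_gb(record_count, narrow_row_bytes)

print(f"wide table:   {wide_gb:.2f} GB")
print(f"narrow table: {narrow_gb:.2f} GB")
```

Under these assumptions the wide table passes 5 GB with only half a million records, while the narrow table with the same record count stays far below 1 GB.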
Ahmed
Hello @Go,
I agree with the comment provided by @AhmedAl_Attar. SPDS is a mature product but not legacy at all; it is actually highly recommended in many environments.
One of my customers is actually using SPDE tables to manage big loads of data in a GRID environment. However, this model is already too small, and a top-notch SAS employee recently advised moving to SPDS. I think @AhmedAl_Attar explains the reasoning behind this quite well.
So your question is really about the difference between HA (High Availability) and performance. These are two different stories that are hard to compare with each other.