Go
Quartz | Level 8

 Hi,

 

Is SPDS legacy? If we have a GRID environment, is it silly to opt for SPDS?

LinusH
Tourmaline | Level 20
Technically, there are use cases for this setup.
And there are sites that manages hugs amount of data in SPDS successfully.
That said, this is almost a political question. But from my PoV, SAS hasn't made any substantial updates for some time (even if experimenting with HDFS compatibility is interesting), and the product is never mentioned at the conferences.
I guess you want to talk to your SAS representatives and then decide for yourself how you value your requirements vs. product lifecycle.
Data never sleeps
nayakig
Obsidian | Level 7
Can you please explain the line below? Is it storage that we can save data in? I thought this was a component.



"there are sites that manages hugs amount of data in SPDS successfully."


Kurt_Bremser
Super User

@nayakig wrote:
Can you please explain the line below? Is it storage that we can save data in? I thought this was a component.



"there are sites that manages hugs amount of data in SPDS successfully."




It is a SAS component that deals primarily with storage:

Scalable Performance Data Server

 

It is usually set up with its own pool of physically separate disks to enhance parallel I/O. In times of heavily virtualized SAN storage, it has lost much of its usefulness, at least IMO.

LinusH
Tourmaline | Level 20

I assume that you got confused by my hugs typo that should read "huge"...

Data never sleeps
AhmedAl_Attar
Ammonite | Level 13

Hi,

 

Comparing a GRID environment with SPDS is like comparing apples and oranges! You'll be much better off with both working together 🙂

 

- A GRID environment (be it SAS or Hadoop) typically provides

  - A shared, centrally managed analytic computing environment

  - High availability

  - Workload management to optimally process multiple applications and workloads to maximize overall throughput.

  - Flexibility to incrementally grow the computing infrastructure as the number of users and the size of data increase over time

  - The ability to do rolling maintenance and upgrades without any disruption to the user community

 

- Scalable Performance Data Server (SPD Server)

  - A client/server, multi-user data server designed to optimize storage and to speed the processing of large SAS data sets.

  - Parallelizes many of the SAS I/O functions such as WHERE processing and INDEX creation over multiple data partitions

  - Extends parallel capabilities to include GROUP BY processing and SQL pass-through.

  - Requires an SMP machine and is designed to use all resources available on the machine to achieve maximum scalability.

  - The maximum benefit with SPD Server is gained when it is run on a machine with:
       multiple CPUs
       multiple I/O channels
       multiple disks
       a large amount of data to be partitioned

  - Provides a high-performance data store for very large SAS data sets. It is therefore particularly suited as part of a data warehousing solution where the SAS system is being used to construct, manage and analyze enterprise-wide datamarts.
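
To make the idea of parallel WHERE processing over multiple data partitions a bit more concrete, here is a minimal conceptual sketch in Python. It is not how SPD Server is implemented and uses no SAS API; the function names (`partition`, `scan_partition`, `parallel_where`) and the sample rows are made up for illustration. The table is split into partitions, each partition is filtered by its own worker, and the partial results are merged.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split rows round-robin into n_parts lists (stand-ins for data partitions)."""
    return [rows[i::n_parts] for i in range(n_parts)]


def scan_partition(part, predicate):
    """Apply the WHERE-style predicate to a single partition."""
    return [row for row in part if predicate(row)]


def parallel_where(rows, predicate, n_parts=4):
    """Filter every partition with its own worker, then merge the partial results."""
    parts = partition(rows, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        chunks = list(pool.map(lambda p: scan_partition(p, predicate), parts))
    return [row for chunk in chunks for row in chunk]


# Hypothetical sample table: 20 rows with an id and an amount column.
rows = [{"id": i, "amount": i * 10} for i in range(1, 21)]
hits = parallel_where(rows, lambda r: r["amount"] > 150)
print(sorted(r["id"] for r in hits))  # [16, 17, 18, 19, 20]
```

In this toy version the threads only illustrate the split/scan/merge pattern. The real benefit described above comes from the partitions sitting on separate disks and I/O channels, so that the scans do not compete for the same hardware.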

 

So to conclude:

Grid --> More Compute, Memory and local storage resources

While

SPDS --> Better way to handle and process LARGE SAS Tables stored on file systems

 

Hope this clarifies your understanding of these two technologies and how they could complement rather than replace each other.

 

Ahmed

nayakig
Obsidian | Level 7
Wow, thank you Ahmed, that's a very clear explanation. One last query: from an SPDS perspective, what size is considered big? For example, is SPDS useful for a dataset with 5 million rows, or only with even more?


AhmedAl_Attar
Ammonite | Level 13

Hi @nayakig,

I look at SAS data set sizes in terms of their file size (GB), rather than record count (Ks, Ms).

I would consider a big data set to be anything larger than 7 GB or, in some cases, larger than 5 GB. It all depends on your storage and network throughput.

 

If you have a very wide data set with long textual columns, it does not need millions of records to accumulate 5+ GB in storage, if you see what I mean.
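
As a rough illustration of why file size matters more than row count, the back-of-the-envelope arithmetic can be sketched in Python. The row widths (200 and 2,000 bytes) are made-up examples, the 5 GB cut-off is the lower of the two thresholds mentioned above, and the estimate ignores compression, indexes and page overhead.

```python
def estimated_size_gb(n_rows, row_bytes):
    """Approximate uncompressed size in GB (1 GB = 1e9 bytes)."""
    return n_rows * row_bytes / 1e9


def is_large(size_gb, threshold_gb=5.0):
    """True when the estimate exceeds the assumed 'big data set' threshold."""
    return size_gb > threshold_gb


# The same 5 million rows with two hypothetical row widths.
narrow = estimated_size_gb(5_000_000, 200)    # 1.0 GB
wide = estimated_size_gb(5_000_000, 2_000)    # 10.0 GB

print(narrow, is_large(narrow))  # 1.0 False
print(wide, is_large(wide))      # 10.0 True
```

With these assumed widths, the same 5 million rows land either well below or well above the threshold, which is why record count alone says little.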

 

Ahmed

JuanS_OCS
Amethyst | Level 16

Hello @Go,

 

I subscribe to the comment provided by @AhmedAl_Attar. SPDS is a mature product, but it is not legacy at all; it is actually highly recommended in many environments.

 

One of my customers is actually using SPDE tables to manage big loads of data in a GRID environment. However, this model is already very small, and a top-notch SAS employee recently advised them to move to SPDS. I think @AhmedAl_Attar explains the reasoning behind this quite well.

 

So, your question is about the difference between HA (High Availability) and performance. That is why they are two different stories that are hard to compare to each other.

LinusH
Tourmaline | Level 20
If you only look at table size (and not at other SPDS features such as security), 5M rows sounds small. You'll be fine using Base/SPDE libraries.
Data never sleeps
nayakig
Obsidian | Level 7

@LinusH @AhmedAl_Attar

 

Thank you very much, now I have clarity 🙂
