Architecting, installing and maintaining your SAS environment

SPDS basic question

Contributor Go
Contributor
Posts: 74


 Hi,

 

Is SPDS legacy? If we have a GRID environment, is it silly to opt for SPDS?

Super User
Posts: 5,852

Re: SPDS basic question

Technically, there are use cases for this setup.
And there are sites that manages hugs amount of data in SPDS successfully.
That said, this is almost a political question. From my PoV, SAS hasn't made any substantial updates for some time (even if experimenting with HDFS compatibility is interesting), and the product is never mentioned at the conferences.
I guess you want to talk to your SAS representatives and then decide for yourself how you value your requirements vs. product lifecycle.
Data never sleeps
Occasional Contributor
Posts: 19

Re: SPDS basic question

Can you please explain the line below? Is it storage that we can save data in? I thought this was a component.



"there are sites that manages hugs amount of data in SPDS successfully."


Super User
Posts: 9,890

Re: SPDS basic question


@nayakig wrote:
Can you please explain the line below? Is it storage that we can save data in? I thought this was a component.



"there are sites that manages hugs amount of data in SPDS successfully."




It is a SAS component that deals primarily with storage:

Scalable Performance Data Server

 

It is usually set up with its own pool of physically separate disks to enhance parallel I/O. In times of heavily virtualized SAN storage, it has lost much of its usefulness, at least IMO.

Super User
Posts: 5,852

Re: SPDS basic question

I assume that you got confused by my "hugs" typo, which should read "huge"...

Data never sleeps
Super Contributor
Posts: 276

Re: SPDS basic question

Hi,

 

Comparing a GRID environment with SPDS is like comparing apples and oranges! You'll be much better off with both working together :-)

 

- A GRID environment (be it SAS or Hadoop) typically provides:

  - A shared, centrally managed analytic computing environment

  - High availability

  - Workload management to optimally process multiple applications and workloads to maximize overall throughput.

  - Flexibility to incrementally grow the computing infrastructure as the number of users and the size of data increase over time

  - The ability to do rolling maintenance and upgrades without any disruption to the user community

 

- Scalable Performance Data Server (SPD Server)

  - A client/server, multi-user data server designed to optimize storage and to speed the processing of large SAS data sets.

  - Parallelizes many of the SAS I/O functions such as WHERE processing and INDEX creation over multiple data partitions

  - Extends parallel capabilities to include GROUP BY processing and SQL pass-through.

  - Requires an SMP machine and is designed to use all resources available on the machine to achieve maximum scalability.

  - The maximum benefit with SPD Server is gained when it is run on a machine with:
       multiple CPUs
       multiple I/O channels
       multiple disks
       a large amount of data to be partitioned

  - Provides a high-performance data store for very large SAS data sets. It is therefore particularly suited to a data warehousing solution where the SAS system is used to construct, manage and analyze enterprise-wide data marts.
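
As a rough illustration of the partition-parallel idea in the list above, here is a minimal Python sketch. It is not SPD Server's implementation or API; it only shows the general pattern of splitting a table into partitions, evaluating a WHERE-style filter on each partition in its own worker, and merging the results. The names `partition` and `parallel_where` and the sample rows are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split rows into at most n_parts contiguous partitions of similar size."""
    size = max(1, -(-len(rows) // n_parts))  # ceiling division, at least 1
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def parallel_where(rows, predicate, n_parts=4):
    """Filter each partition in its own worker, then merge in original order."""
    parts = partition(rows, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        # map() returns results in partition order, so row order is preserved
        filtered = pool.map(lambda part: [r for r in part if predicate(r)], parts)
        return [row for part in filtered for row in part]


# Hypothetical sample table: 20 rows with an id and an amount
rows = [{"id": i, "amount": i * 10} for i in range(1, 21)]
big = parallel_where(rows, lambda r: r["amount"] > 150)
print([r["id"] for r in big])
```

The sketch only mirrors the structure of the work. In a real data server the gain would come from partitions living on separate disks and I/O channels, which is why the hardware list above matters.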

 

So to conclude:

Grid --> More Compute, Memory and local storage resources

While

SPDS --> Better way to handle and process LARGE SAS Tables stored on file systems

 

Hope this clarifies your understanding of these two technologies and how they can complement rather than replace each other.

 

Ahmed

Occasional Contributor
Posts: 19

Re: SPDS basic question

Posted in reply to AhmedAl_Attar
Wow, thank you Ahmed, that's a very clear explanation. One last query: from an SPDS perspective, what size is considered big? For example, is SPDS useful for a data set with 5 million rows, or only for even more?


Super Contributor
Posts: 276

Re: SPDS basic question

Hi @nayakig,

I look at SAS data set sizes by file size (GB) rather than by record count (thousands, millions).

I would consider a big data set to be anything larger than 7 GB, in some cases larger than 5 GB. It all depends on your storage and network throughput.

 

If you have a very wide data set with long textual columns, it does not take millions of records to accumulate 5+ GB in storage, if you see what I mean.
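
To make the rows-versus-bytes point concrete, here is a small back-of-the-envelope sketch in Python. It assumes uncompressed, fixed-width rows and ignores page overhead, compression and indexes, so treat the numbers as rough estimates only. The function name `estimated_size_gb` and both table layouts are hypothetical.

```python
def estimated_size_gb(n_rows, row_bytes):
    """Approximate uncompressed size in GB (1 GB = 1024**3 bytes)."""
    return n_rows * row_bytes / 1024**3


# Hypothetical layouts: a narrow table vs. a wide table with long text columns
narrow = estimated_size_gb(5_000_000, 200)   # 5M rows x 200 bytes per row
wide = estimated_size_gb(500_000, 12_000)    # 0.5M rows x 12,000 bytes per row

for name, size in [("narrow", narrow), ("wide", wide)]:
    label = "big" if size > 5 else "small"   # 5 GB threshold taken from the post
    print(f"{name}: {size:.2f} GB -> {label}")
```

Under these assumptions the 5-million-row narrow table stays below 1 GB, while the wide table crosses the 5 GB mark with a tenth of the rows.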

 

Ahmed

Trusted Advisor
Posts: 1,746

Re: SPDS basic question

Hello @Go,

 

I subscribe to the comment provided by @AhmedAl_Attar. SPDS is a mature product, but it is not legacy at all; it is actually highly recommended in many environments.

 

One of my customers is actually using SPDE tables to manage big loads of data in a GRID environment. However, this model has already become too small for them, and a top-notch SAS employee recently advised them to move to SPDS. I think @AhmedAl_Attar explains the reasoning behind this quite well.

 

So your question is really about the difference between HA (High Availability) and performance. These are two different stories and hard to compare with each other.

Super User
Posts: 5,852

Re: SPDS basic question

If you only look at table size (and not at other SPDS features such as security), 5M rows sounds small. You'll be fine using Base/SPDE libraries.
Data never sleeps
Occasional Contributor
Posts: 19

Re: SPDS basic question

@LinusH @AhmedAl_Attar

 

Thank you very much, now I have clarity :-)
