chuckdee4
Calcite | Level 5
Hi there, I have an issue with certain large datasets that I am currently working on.
They contain a huge amount of data and are a nightmare to query.

I just wanted to find out the best way of querying such datasets, i.e. DATA steps, PROC SQL, or some other approach.
Also, any tips on working with such datasets would be much appreciated.
3 REPLIES 3
sbb
Lapis Lazuli | Level 10
Nearly always, the response will be something like: "It depends.", "Your mileage may vary." and "What do you consider to be large?"

SAS performance is influenced not only by the data but mostly by the operating environment (client, server, or both).

However, some SAS features to consider exploiting are listed below:

- SAS views
- SAS indexes
- The WHERE statement / clause, instead of a subsetting IF.
- Use or don't use the COMPRESS= option for efficiency (data dependent).
- Use PROC FORMAT for user-defined formatted display rather than un-normalized SAS data (where extra data variables are carried along unnecessarily).
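
To make the list above concrete, here is a minimal sketch of each feature. The table WORK.SALES, its variables (REGION_CODE, SALE_DATE, AMOUNT) and the format name $REGFMT are all made up for illustration; substitute your own data, and treat this as a starting point to test rather than a recommendation for every case.

```sas
/* Hypothetical sample data so the sketch runs on its own */
data work.sales;
    length region_code $1;
    format sale_date date9.;
    do i = 1 to 100000;
        region_code = ifc(mod(i, 2), 'E', 'W');
        sale_date   = '01JAN2010'd + mod(i, 365);
        amount      = mod(i, 500);
        output;
    end;
    drop i;
run;

/* Index: a simple index on the column used for filtering */
proc datasets library=work nolist;
    modify sales;
    index create region_code;
quit;

/* WHERE: rows are filtered before they are loaded into the DATA step, */
/* whereas a subsetting IF reads every row and then discards some      */
data work.east;
    set work.sales(keep=region_code sale_date amount);
    where region_code = 'E' and sale_date >= '01JUN2010'd;
run;

/* View: stores the query, not a second copy of the rows */
proc sql;
    create view work.v_east as
        select region_code, sale_date, amount
        from work.sales
        where region_code = 'E';
quit;

/* COMPRESS=: the log reports whether the file shrank or grew, */
/* so check it on your own data before adopting it             */
data work.sales_c(compress=yes);
    set work.sales;
run;

/* PROC FORMAT: keep the short code on the table and attach the */
/* descriptive text only when displaying                        */
proc format;
    value $regfmt 'E'   = 'East'
                  'W'   = 'West'
                  other = 'Unknown';
run;

proc freq data=work.sales;
    tables region_code;
    format region_code $regfmt.;
run;
```

Whether the index or the compression actually helps depends on the data and on how selective the WHERE condition is, which is why testing on your own tables matters.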

Do take advantage of the SAS.COM support website where there are topic-related papers, technical reference material, as well as SAS-hosted documentation, such as a "companion" guide for each supported Operating System (OS) environment where SAS runs.

I suggest you get your "query" code defined and tested, then come back to the forum with a specific performance or efficiency question, so that the forum subscribers can give focused feedback.

Scott Barry
SBBWorks, Inc.
Ksharp
Super User
As far as I know, PROC FORMAT and hash tables are the fastest ways to execute a query, especially for large tables.
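
As a rough sketch of both techniques used as a table lookup: the tables WORK.LOOKUP and WORK.ORDERS, the variables CUST_ID and CUST_NAME, and the format name CUSTFMT are invented for this example. It assumes CUST_ID is numeric and unique in the lookup table, and that the lookup table fits in memory.

```sas
/* Hypothetical small lookup table and a table to be matched against it */
data work.lookup;
    input cust_id cust_name $;
    datalines;
1 Acme
2 Globex
3 Initech
;
run;

data work.orders;
    input cust_id amount;
    datalines;
1 100
3 250
4 75
;
run;

/* Hash table: load the lookup table into memory once, then probe it */
/* for every row of the (normally much larger) table                 */
data work.matched;
    if 0 then set work.lookup;          /* defines CUST_NAME at compile time */
    if _n_ = 1 then do;
        declare hash h(dataset: 'work.lookup');
        h.defineKey('cust_id');
        h.defineData('cust_name');
        h.defineDone();
    end;
    set work.orders;
    if h.find() = 0;                    /* keep only rows with a match */
run;

/* PROC FORMAT: build a format from the same lookup table via CNTLIN= */
data work.custfmt;
    retain fmtname 'custfmt' type 'N';
    set work.lookup(rename=(cust_id=start cust_name=label));
run;

proc format cntlin=work.custfmt;
run;

data work.labelled;
    set work.orders;
    length cust_name $8;
    cust_name = put(cust_id, custfmt.); /* unmatched ids are not translated */
run;
```

Neither approach needs the large table to be sorted, which is a large part of why they are often quicker than a sort and merge.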



Ksharp
Peter_C
Rhodochrosite | Level 12
SAS Scalable Performance Data Server (SPDS) provides an engine to handle large data. As it is not always available, you may find its smaller brother SPDE (a SAS library engine) helpful. SPDE achieves performance in several ways; I think the main two are partitioning and index optimisation. It is just great the way multi-gigabyte tables perform when partitioned and indexed well. Use the system option MSGLEVEL=I to see which indexes are used.
The great thing about the (big brother) SPDS is that it is a separate server, which increases the capacity of the service to solve your query, and it provides further index handling optimisation. It is a bit like a database server optimised for SAS queries.
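
A minimal sketch of an SPDE library, assuming the directories below exist at your site (the paths, the libref SPDELIB, the table and its variables are all placeholders, and the partition size is only an example value to tune):

```sas
options msglevel=i;   /* write index-usage notes to the log */

/* Hypothetical paths: metadata, data partitions and indexes can */
/* each sit on different disks                                   */
libname spdelib spde '/sasdata/spde/meta'
    datapath=('/disk1/spde/data' '/disk2/spde/data')
    indexpath=('/disk3/spde/index')
    partsize=512m;

/* Hypothetical table, indexed on the column used for filtering */
data spdelib.sales(index=(region_code));
    length region_code $1;
    do i = 1 to 100000;
        region_code = ifc(mod(i, 2), 'E', 'W');
        amount      = mod(i, 500);
        output;
    end;
    drop i;
run;

/* Check the log after this step to see whether the index was used */
proc sql;
    select count(*) as n_east
    from spdelib.sales
    where region_code = 'E';
quit;
```

Spreading DATAPATH= over several disks is what allows the partitions to be read in parallel, so the benefit depends heavily on the hardware behind those paths.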
Good luck
peterC
