siddhu1
Obsidian | Level 7

Hi,

I am working with large datasets (>10 TB) and need to extract data from Oracle and DB2 and then perform data manipulation.
For this, should I use the implicit or the explicit pass-through facility to extract the data, and which one is more efficient performance-wise?

Please let me know if there are any efficiency techniques for handling large datasets.

 

Thanks & Regards,
Siddhu1

3 REPLIES
CarmineVerrell
SAS Employee

With data that large, I would suggest keeping the data in the database and using explicit pass-through.

 

A simple technique for improving performance: only keep the columns you will be using, and subset the data (rows/observations) if you are not using all of it either.
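A minimal sketch of that approach with explicit pass-through to Oracle; the connection options, schema, table, and column names below are placeholders:

proc sql;
   /* Open an explicit connection to Oracle; replace the
      connection options with your own credentials and path */
   connect to oracle as ora (user=myuser password=mypwd path=orapath);

   /* The inner SELECT runs entirely inside Oracle, so only the
      columns and rows you actually need are returned to SAS */
   create table work.orders_subset as
   select *
   from connection to ora (
      select order_id,
             customer_id,
             order_date,
             order_amount
      from   sales.orders
      where  order_date >= date '2023-01-01'
   );

   disconnect from ora;
quit;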

 

Hope this helps.

 

Kurt_Bremser
Super User

What @CarmineVerrell said: in the first step (where you will probably use explicit pass-through), reduce the dataset horizontally (variables) and vertically (observations).

From there, it depends on the structure of your dataset and on what you need to get to for your analysis.
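For comparison, a minimal sketch of the implicit route using a SAS/ACCESS LIBNAME engine (DB2 here; connection options, libref, and table/column names are placeholders). With KEEP= and WHERE= data set options, SAS will try to push the column selection and row filter down to the database, but the generated SQL is not under your direct control:

/* Implicit pass-through: SAS generates the database SQL for you */
libname db2lib db2 user=myuser password=mypwd database=proddb;

data work.claims_subset;
   set db2lib.claims (keep=claim_id member_id claim_date paid_amt
                      where=(claim_date >= '01JAN2023'd));
run;

You can check what SQL is actually sent to the database with the SASTRACE= system option.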

PeterClemmensen
Tourmaline | Level 20

I highly recommend reading Tactics for Pushing SQL to the Relational Databases. A little long, but packed with tips.


Discussion stats
  • 3 replies
  • 457 views
  • 2 likes
  • 4 in conversation