Hi,
I am working with large datasets (>10 TB) and need to extract data from Oracle and DB2 and then do some data manipulation.
For this, should I use the implicit or explicit pass-through facility to extract the data, and which one is more efficient performance-wise?
Please also let me know if there are any efficiency techniques for handling large datasets.
Thanks & Regards,
Siddhu1
With data that large, I would suggest keeping the data in the database and using explicit pass-through.
A simple technique for improving performance: keep only the columns you will be using, and subset the data (rows/observations) if you are not using all of it either.
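That advice might look like this with explicit pass-through (a minimal sketch; the connection options, schema, table, and column names are placeholders, not your actual environment):

```sas
/* Explicit pass-through: the inner query runs entirely inside Oracle, */
/* so only the reduced result set crosses the network into SAS.        */
proc sql;
   connect to oracle (user=myuser password=mypass path=orapath);

   create table work.orders_2024 as
   select * from connection to oracle
      (  /* Native Oracle SQL: keep only the needed columns and rows */
         select order_id, cust_id, order_dt, amount
         from sales.orders
         where order_dt >= date '2024-01-01'
      );

   disconnect from oracle;
quit;
```

Because the text inside the parentheses is passed to Oracle untouched, you can use any Oracle-specific SQL there, and the database does the filtering before anything reaches SAS.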
Hope this helps.
What @CarmineVerrell said: in the first step (where you will probably use explicit pass-through), reduce the dataset horizontally (variables) and vertically (observations).
From there, it depends on what you have in terms of dataset structure, and where you need to get for your analysis.
I highly recommend reading Tactics for Pushing SQL to the Relational Databases. A little long, but packed with tips.
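For comparison, implicit pass-through goes through a LIBNAME engine, and SAS translates what it can (such as KEEP= and WHERE= data set options) into database SQL for you. A sketch, assuming placeholder connection details and table names:

```sas
/* Implicit pass-through: SAS generates the DB2 SQL behind the scenes. */
/* Simple subsetting like this is usually pushed down to the database, */
/* but more complex logic may be processed in SAS instead.             */
libname db2lib db2 datasrc=mydb2 user=myuser password=mypass;

data work.subset;
   set db2lib.transactions (keep=txn_id acct_id txn_dt amount
                            where=(txn_dt >= '01JAN2024'd));
run;
```

The trade-off: implicit pass-through is more convenient, but you rely on SAS deciding what gets pushed down; with explicit pass-through you control exactly what SQL the database executes, which is why it is generally preferred at this data volume.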