Hi,
I am working with large datasets (>10 TB). I want to extract the data from Oracle and DB2 and then do some data manipulation.
For the extraction, should I use the implicit or the explicit pass-through facility, and which one is more efficient performance-wise?
Please also let me know if there are any efficiency techniques for handling large datasets.
Thanks & Regards,
Siddhu1
With data that large, I would suggest keeping the data in the DB and using explicit pass-through.
A simple technique for improving performance: only keep the columns you will be using, and subset the data (rows/observations) if you are not using all of it either.
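For example, here is a minimal sketch of explicit pass-through against Oracle. The connection options (USER=, PASSWORD=, PATH=), the table SALES_HISTORY, and the column names are placeholders, not anything from your system; DB2 is analogous with CONNECT TO DB2.

proc sql;
   connect to oracle (user=myuser password=mypass path=orapath);

   /* the inner query runs entirely inside Oracle */
   create table work.sales_subset as
   select * from connection to oracle
      ( select customer_id, sale_date, sale_amount   /* keep only the columns you need */
        from   sales_history
        where  sale_date >= date '2024-01-01'        /* subset the rows in the database */
      );

   disconnect from oracle;
quit;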
Hope this helps.
What @CarmineVerrell said: in the first step (where you will probably use explicit pass-through), reduce the dataset horizontally (variables) and vertically (observations).
From there, it depends on what you have in terms of dataset structure, and where you need to get to for your analysis.
I highly recommend reading Tactics for Pushing SQL to the Relational Databases. A little long, but packed with tips.
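For comparison, the implicit (LIBNAME) route mentioned in the question can also push the column and row subsetting down to the database. A minimal sketch, again with placeholder libref, connection options, and table/variable names; the SAS/ACCESS engine will normally translate the KEEP= and WHERE= dataset options into the SQL it sends, so only the needed data crosses the network:

libname orat oracle user=myuser password=mypass path=orapath schema=sales;

data work.sales_subset;
   /* KEEP= limits the columns and WHERE= limits the rows; both are     */
   /* usually handed to the database engine rather than applied in SAS. */
   set orat.sales_history (keep=customer_id sale_date sale_amount
                           where=(sale_date >= '01JAN2024'd));
run;

libname orat clear;

Whether a given WHERE expression can be passed through depends on the functions it uses, so it is worth checking the generated SQL with OPTIONS SASTRACE=',,,d' SASTRACELOC=SASLOG; when performance matters.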