Appreciate if someone help me with the following situation. I've a Table A in SAS which has some 2 million rows with 25 variables and
I've a other table B in Netezza which 10 millions of records with 42 variables. Now I want to join these two tables and I'm not sure whether I should use Implicit or explicit pass through for better performance. In either method, I would like to know the basic structure of SAS code to achieve this. I've done joins using only SAS datasets or only non SAS datasets but not with both.
You cannot use explicit pass through because that works on the server and you don't have the SAS data set on the server.
If you can load your table to the server, that's a different story.
Since you're using IMPLICIT queries, you can treat them both as 'SAS tables' and your code isn't any different.
Well, sending the 2mil rows to the database should be quicker than taking 10mil rows from the database, but by number alone. However its not that simple, there is the hardware available to each system for instance. That is quite a few records, you may want to test with a subset and get code working before running it in total. You may want to consult with your Netezza person.
As for code, its pretty similar either way,
proc sql; connect to net (...); select * from connection to net ( select ....); /* or to send to db*/ execute by net(create table yourworkspace as select * from saslib.datas); disconnect from net; quit; /* Or you could setup libnames and make the SQL a bit smaller.*/
Many examples out there on how to go one way or the other, but which would be better is hard to say and also depends a lot on what you want to do further with it, for instance if you were creating a SAS report, might make more sense to have all the data in SAS.
Also consider exporting to a flat file, copy that to the database server, read it into a table there, and do the join in the database. Just as one method to compare, performancewise.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.