BookmarkSubscribeRSS Feed
JonasWetterberg
SAS Employee

A quite common task for most developer is joining to tables. If you need to join a SAS dataset with a database table, you can easily write a proc sql for that task. Behind the scenes proc sql will load your database table to your SAS session and perform the join in SAS. If the table in your database is large this will be a time-consuming task.

 

This tip will describe how you can avoid this performance problem by sending the small SAS to a DB2 session table, and make the join happen in DB2, and then only transfer the selected data from DB2 to your SAS session. If you are joining a small SAS table with a large DB2 table, this technique will increase the performance of your program.

 

Session tables in DB2 are similar to SAS work tables. They are declared within a DB2 session, and once that DB2 session ends, the tables are deleted. They are also unique to a session DB session. To use this, you must therefore declare all of your connections to be global.

 

Here is a code sample on how to use the technique!

***************************************************************************************************;
* Sample program that utilizes DB2 session tables to perform a join in DB2.                       *;
***************************************************************************************************; 
%let data_source = <name_of_your_db2_data_source>;
%let db2_schema  = <schema_in_db2_where_your_tables_are_stored>;

***************************************************************************************************;
* DB2 session tables are always stored in a schema called session. Make sure you create a         *;
* libname for this specific schema, and make sure you type in connection=global                   *;
***************************************************************************************************;
libname DBWork db2 datasrc=&data_source authdomain="DBAuth" connection=global schema="session" ;

***************************************************************************************************;
* Create the session table in DB2. Use the "with replace" option to make sure it is recreated     *;
* By using that statement, you will avoid execution error if the code executes more than once     *;
***************************************************************************************************;
proc sql noprint;
	connect to db2 (datasrc=&data_source authdomain="DBAuth" connection=global);
	execute (declare global temporary table
			 MyCustIDs (CustomerID varchar(50))
			 on commit preserve rows not logged with replace) by db2; 
quit;

***************************************************************************************************;
* Create a small SAS data set with just one customer. This data set is the one we want to join    *;
* with the big CustomerInfoTable in DB2.                                                          *; 
***************************************************************************************************;
data CustomerIDInSASDS;
	CustomerID = '4110200500';
run;

***************************************************************************************************;
* Send you SAS table to your DB2 session table. A little bit lazy code, but since both tables     *;
* share the same columns, you can use the select * functionality.                                 *;
***************************************************************************************************;
proc sql noprint;
	insert into DBWork.MyCustIDs select * from CustomerIDInSASDS;
quit;

***************************************************************************************************;
* Note that it is not likely that your other DB2 tables are stored in schema session in DB2       *;
* as that is more or less the same thing as SAS work but in DB2. So in order to use the           *;
* table you just created, we need to do it using a pass through SQL to make sure that the join    *;
* is executed in DB2, and just the result is passed to your SAS session.                          *;
***************************************************************************************************;
proc sql noprint;
	connect to db2 (datasrc=&data_source authdomain="DBAuth" connection=global);
	create table ATableInSAS as
	select * from connection to db2
		(select i.*
		 from &db2_schema..CustomerInfoTable cit inner join session.MyCustIDs c on cit.CustNbr = c.CustomerID);
quit;

****************************************************************************************************;
* The small table is now joined with the large one in DB2 and only the result is sent back to SAS  *;
* and you will find the result in the SAS table "ATableInSAS"                                      *;
****************************************************************************************************;

 

Good luck!

//Jonas

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Discussion stats
  • 0 replies
  • 1577 views
  • 5 likes
  • 1 in conversation