DATA Step, Macro, Functions and more

Update DBMS tables faster from SAS

Reply
Occasional Contributor
Posts: 6

Update DBMS tables faster from SAS

[ Edited ]

Hi All,

 

as update is too slow in SAS Proc SQL 

 

we found a faster way to update big tables in a very fast way using hash and database views 

it's known that:

1- if you update in database view you will update the physical table 

2- hash update uses the memory (RAM) which might be slower if tables are very big. 

 

our solution is based on creating a view for all condition you want in the update statement in the view like 

simple solution:

PROC SQL;

UPDATE TABLE_A SET COLUMN_1 = {VALUE} WHERE condition_1 , condition_2 

;QUIT;

our solution

create view as select column_1 from table_a where condition_1, condition_2 

then use the simple hash update which uses less memory ,faster update 

happy faster hash update 

 

Abdelrahman Mahareek

AML Technical Consultant 

 

PROC Star
Posts: 1,555

Re: update SAS tables Faster!

Very interestiung. Can you tell us more? For example:

Which database and version?

Where are the view and hash table created?

How many records does the view point to?

Occasional Contributor
Posts: 6

Re: update SAS tables Faster!

Which database and version?

I tried this solution on Oracle and SQL server Databases

 

Where are the view and hash table created?

view is created on the database engine like 

create view xx as select * from table_1 ; in Oracle 

then i read this view in sas using libname oracle ...

you can update in this view and your update will reflect on the physical tables 

 

How many records does the view point to?

i tried it on large table about 3 Milion 

Super User
Posts: 6,927

Re: update SAS tables Faster!

You should show your code so we can try it.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 6

Re: update SAS tables Faster!

It has different techniques based on what you need to do 

suppose you want to update join from table_1 column party_name from table_2 column customer_name 

where party_number = customer_number 

and party_date_of_birth is missing 

 

First create view on table_1 say table_1_view on db say SQL server or Oracle

create view table_1_view  as

SELECT party_number, party_name 

From       Table_1 

where      party_date_of_birth is null; 

 

in SAS 

Proc SQL;

create table work.table_1_view_update as 

SELECT party_number , customer_name as party_name /* this alias for the hash */

FROM <DB_libname>.table_1_view a inner join <DB_libname>.table_2 b

on b.customer_number = a.party_number

;Quit;

 

now work.table_1_view has the update we need 

let's use the hash to update 

 

data <DB_libname>.table_1_view;
if _n_=1 then do;
declare hash ud(dataset:'work.table_1_view_update');
ud.defineKey('party_number');
ud.defineData('party_name');
ud.defineDone();
end;
modify <DB_libname>.table_1_view;
rcUpdate = ud.find();
if rcUpdate=0 then replace;
run;

 

 

i hope this helps

 

 

 

Super User
Super User
Posts: 6,495

Re: update SAS tables Faster!

Personally I never find it that useful to update a table in place.  SAS is not really designed to manage transactional databases.

 

How is this different than just creating an INDEX on the transaction table and using that instead of the HASH?

Also what about creating an INDEX on the master table instead of the transactions?

 

Does the HASH method run faster?

What are the limitations? It would seem to me that the HASH would limit updates to number that could fit in memory.  That might not be much of a real issue. If you are applying more updates than you can loading into a HASH object then you are probably using the wrong method to maintain the data.

 

PROC Star
Posts: 1,555

Re: update SAS tables Faster!

Ok I created a quick benchmark than anyone can run.

 

Your method is indeed faster than a non-indexed update. That's quite a process to have combined the modify statment and a hash table on RDBMS data. Kudos! 

 

Below I test: 

1-Table update in Oracle

2- View update in Oracle

3- View update with hash table in SAS

4- Table update in Oracle w/index 

 

The index wins hands down, but when that's not an option, the hash table saves a lot of time.

*0- Sample data;
data ORALIB.TESTBASE
     ORALIB.TESTUPDATE;
  do I=1 to 1e5;
    J=I*(ranuni(0) > 0.1); output ORALIB.TESTBASE;
    J=I;                   output ORALIB.TESTUPDATE;
  end;
run;
  

*Test 1 - Table update in Oracle - 60s; 
proc sql; 
  connect using ORALIB;
  execute by ORALIB(
    update TESTBASE
    set J = (select J from TESTUPDATE where TESTBASE.I=TESTUPDATE.I)
    where TESTBASE.J=0
  ); 
quit;


*Test 2 - View update in Oracle - 60s; 
proc sql; 
  connect using ORALIB;
  execute by ORALIB(
    create view TESTVIEW as 
    select I,J from TESTBASE 
    where TESTBASE.J=0
  ); 
  execute by ORALIB(
    update TESTVIEW
    set J = (select J from TESTUPDATE where TESTVIEW.I=TESTUPDATE.I)
  ); 
quit;


*Test 3 - View update with hash table in SAS - 18s; 
proc sql;
  create table UPD as 
  select u.* 
  from ORALIB.TESTVIEW   v
         inner join
       ORALIB.TESTUPDATE u
         on v.I=u.I;
quit;

data ORALIB.TESTVIEW;
  if _n_=1 then do;
    declare hash UPD(dataset:'UPD');
    UPD.defineKey('I');
    UPD.defineData('J');
    UPD.defineDone();
  end;
  modify ORALIB.TESTVIEW;
  RC = UPD.find();
  if RC=0 then replace;
run;


*Test 4 - Table update in Oracle w/index - 1s; 
proc sql; 
  connect using ORALIB;
  execute by ORALIB(
    create index I on TESTUPDATE(I)
  );
  execute by ORALIB(
    update TESTBASE
    set J = (select J from TESTUPDATE where TESTBASE.I=TESTUPDATE.I)
    where TESTBASE.J=0
  ); 
quit;

 

 

PROC Star
Posts: 1,555

Re: update SAS tables Faster!

[ Edited ]

<Erroneous reply removed> Thank you. This is an interesting idea. Will look at it more closely tomorrow.

Super User
Posts: 6,927

Re: update SAS tables Faster!

You should expand your subject line; it should read:

Update DBMS tables faster from SAS

as you are not updating native SAS tables.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 6

Re: update SAS tables Faster!

thanks Kurt 

Ask a Question
Discussion stats
  • 9 replies
  • 251 views
  • 6 likes
  • 4 in conversation