BookmarkSubscribeRSS Feed
Amahareek
Fluorite | Level 6

Hi All,

 

as update is too slow in SAS Proc SQL 

 

we found a faster way to update big tables in a very fast way using hash and database views 

it's known that:

1- if you update in database view you will update the physical table 

2- hash update uses the memory (RAM) which might be slower if tables are very big. 

 

our solution is based on creating a view for all condition you want in the update statement in the view like 

simple solution:

PROC SQL;

UPDATE TABLE_A SET COLUMN_1 = {VALUE} WHERE condition_1 , condition_2 

;QUIT;

our solution

create view as select column_1 from table_a where condition_1, condition_2 

then use the simple hash update which uses less memory ,faster update 

happy faster hash update 

 

Abdelrahman Mahareek

AML Technical Consultant 

 

9 REPLIES 9
ChrisNZ
Tourmaline | Level 20

Very interestiung. Can you tell us more? For example:

Which database and version?

Where are the view and hash table created?

How many records does the view point to?

Amahareek
Fluorite | Level 6

Which database and version?

I tried this solution on Oracle and SQL server Databases

 

Where are the view and hash table created?

view is created on the database engine like 

create view xx as select * from table_1 ; in Oracle 

then i read this view in sas using libname oracle ...

you can update in this view and your update will reflect on the physical tables 

 

How many records does the view point to?

i tried it on large table about 3 Milion 

Amahareek
Fluorite | Level 6

It has different techniques based on what you need to do 

suppose you want to update join from table_1 column party_name from table_2 column customer_name 

where party_number = customer_number 

and party_date_of_birth is missing 

 

First create view on table_1 say table_1_view on db say SQL server or Oracle

create view table_1_view  as

SELECT party_number, party_name 

From       Table_1 

where      party_date_of_birth is null; 

 

in SAS 

Proc SQL;

create table work.table_1_view_update as 

SELECT party_number , customer_name as party_name /* this alias for the hash */

FROM <DB_libname>.table_1_view a inner join <DB_libname>.table_2 b

on b.customer_number = a.party_number

;Quit;

 

now work.table_1_view has the update we need 

let's use the hash to update 

 

data <DB_libname>.table_1_view;
if _n_=1 then do;
declare hash ud(dataset:'work.table_1_view_update');
ud.defineKey('party_number');
ud.defineData('party_name');
ud.defineDone();
end;
modify <DB_libname>.table_1_view;
rcUpdate = ud.find();
if rcUpdate=0 then replace;
run;

 

 

i hope this helps

 

 

 

Tom
Super User Tom
Super User

Personally I never find it that useful to update a table in place.  SAS is not really designed to manage transactional databases.

 

How is this different than just creating an INDEX on the transaction table and using that instead of the HASH?

Also what about creating an INDEX on the master table instead of the transactions?

 

Does the HASH method run faster?

What are the limitations? It would seem to me that the HASH would limit updates to number that could fit in memory.  That might not be much of a real issue. If you are applying more updates than you can loading into a HASH object then you are probably using the wrong method to maintain the data.

 

ChrisNZ
Tourmaline | Level 20

Ok I created a quick benchmark than anyone can run.

 

Your method is indeed faster than a non-indexed update. That's quite a process to have combined the modify statment and a hash table on RDBMS data. Kudos! 

 

Below I test: 

1-Table update in Oracle

2- View update in Oracle

3- View update with hash table in SAS

4- Table update in Oracle w/index 

 

The index wins hands down, but when that's not an option, the hash table saves a lot of time.

*0- Sample data;
data ORALIB.TESTBASE
     ORALIB.TESTUPDATE;
  do I=1 to 1e5;
    J=I*(ranuni(0) > 0.1); output ORALIB.TESTBASE;
    J=I;                   output ORALIB.TESTUPDATE;
  end;
run;
  

*Test 1 - Table update in Oracle - 60s; 
proc sql; 
  connect using ORALIB;
  execute by ORALIB(
    update TESTBASE
    set J = (select J from TESTUPDATE where TESTBASE.I=TESTUPDATE.I)
    where TESTBASE.J=0
  ); 
quit;


*Test 2 - View update in Oracle - 60s; 
proc sql; 
  connect using ORALIB;
  execute by ORALIB(
    create view TESTVIEW as 
    select I,J from TESTBASE 
    where TESTBASE.J=0
  ); 
  execute by ORALIB(
    update TESTVIEW
    set J = (select J from TESTUPDATE where TESTVIEW.I=TESTUPDATE.I)
  ); 
quit;


*Test 3 - View update with hash table in SAS - 18s; 
proc sql;
  create table UPD as 
  select u.* 
  from ORALIB.TESTVIEW   v
         inner join
       ORALIB.TESTUPDATE u
         on v.I=u.I;
quit;

data ORALIB.TESTVIEW;
  if _n_=1 then do;
    declare hash UPD(dataset:'UPD');
    UPD.defineKey('I');
    UPD.defineData('J');
    UPD.defineDone();
  end;
  modify ORALIB.TESTVIEW;
  RC = UPD.find();
  if RC=0 then replace;
run;


*Test 4 - Table update in Oracle w/index - 1s; 
proc sql; 
  connect using ORALIB;
  execute by ORALIB(
    create index I on TESTUPDATE(I)
  );
  execute by ORALIB(
    update TESTBASE
    set J = (select J from TESTUPDATE where TESTBASE.I=TESTUPDATE.I)
    where TESTBASE.J=0
  ); 
quit;

 

 

ChrisNZ
Tourmaline | Level 20

<Erroneous reply removed> Thank you. This is an interesting idea. Will look at it more closely tomorrow.

Amahareek
Fluorite | Level 6

thanks Kurt 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 3047 views
  • 7 likes
  • 4 in conversation