Help using Base SAS procedures

Proc Sql Update Optimization

Reply
Super Contributor
Posts: 398

Proc Sql Update Optimization

Hello to everyone
I have an optimization question about the following query

proc sql ;
update Addr a
set A_ID =
(
select ID
from U_Addr
where US=upcase(a.OS)
and UC=upcase(a.OC)
and US=upcase(a.OS)
) ;
quit ;

This update takes a very long time to update the Addr table. The Addr table has 230,000 rows and the U_Addr has 67,000 rows. The joining field are character fields which I think is one of the reason it takes so long.
If anyone has an tips or comments on how to optimize this query and make it faster I would appreciate it greatly.

Thank You
Super Contributor
Super Contributor
Posts: 3,174

Re: Proc Sql Update Optimization

It would help to explain more about the SAS operating environment, particularly about the SAS code execution that leads up to the PROC SQL you have posted. Is SAS executing against local data or through a remote connection; is a DBMS (external table) involved? Is there variability with processing, depending on the row/observation count (try using OPTIONS OBS=nnn; for testing various scenarios).

Scott Barry
SBBWorks, Inc.
Super Contributor
Posts: 398

Re: Proc Sql Update Optimization

Thanks guys for the replies.
Everything with this query runs locally. What i'm doing is taking a distinct list of address. I'm adding an ID column to this distinct list then I want to go back and update each row with it's corresponding ID.

With a regular left join how would I do the update part of it?

Thanks again for your help
Jerry
Super User
Posts: 5,254

Re: Proc Sql Update Optimization

Since your master table seems quite small, you just can overwrite it with the query result:

proc sql;
create table Addr as
select a.*, u.A_ID
from Addr as a left join U_Addr as u
on u.US=upcase(a.OS)
and u.UC=upcase(a.OC)
and u.US=upcase(a.OS)
;
quit;

Remember if you try to use indexed strategies, avoid having function calls in WHERE/ON-clauses.

/Linus
Data never sleeps
Super Contributor
Posts: 398

Re: Proc Sql Update Optimization

Linus H,
Thank you for your help

That last sentence about the functions in the where clause opened my eyes. I did the upcase for these tables when I created them so I didn't have to do it in the join. The time when from 24 minutes to under 3.

Thank you so much

Jerry
Super User
Posts: 5,254

Re: Proc Sql Update Optimization

What goes on is that for each row in Addr, you do separate look-up in U_addr.
Indexing on some of the "join" columns might help.
In this particular case, you will probably find it much more efficient to do a regular left join to do this update, or data step techniques.

/Linus
Data never sleeps
Ask a Question
Discussion stats
  • 5 replies
  • 134 views
  • 0 likes
  • 3 in conversation