BookmarkSubscribeRSS Feed
sandeep_reddy
Quartz | Level 8

hi all,

I am working with sas data management studio.In that i was using update node in one of my job.My target node is having 65 lakh records and my transaction is having 74000 records. when i execute update node with following query

update master a

set a.action_status=( select b.action_status from trans b where a.cust_id=b.cust_id)

when i ran the update node it is running for 1 1/2 day...and it is still running for the second day also..

can anyone help me on this and also share your views to increase the performance..

MANY THANKS IN ADVANCE.

5 REPLIES 5
Ksharp
Super User

Hash Table.

skillman
SAS Employee

Index the fields you are joining and the action_status field.

ajuvaba
Calcite | Level 5

Hi,

I don't have SAS Data Management Studio but in from your code I have one idea what could be the problem:

your update is making select from table b for every row in table a which means 6,5 million times a select from (74000 rows inner join with 6,5 million rows) with a answer of 1 action status.

this can't be fast!

However you could make a filter to limit the rows in a big table like this

update master a

set a.action_status=(select b.action_status from trans b where a.cust_id=b.cus_id)

where a.cust_id IN (select DISTINCT cust_id from trans)

or you rewrite it to be a data step:

data master;

set master trans;

BY cust_id;

run;

However, the second option had a need that data are sorted by cust_id. If you build an index by cust_id this could help also.

Just few thoughts

jakarman
Barite | Level 11

Many attention points to check and evaluate as there is too little background information. 

I = ksharp, although this will need a special user-transformation for DI

      The hash approach could be extended to a datastep vs SQL approach 

II = skillman optimizing datasets within SQL behavior

III = Change some SAS system options for better performance

More?

---->-- ja karman --<-----
skillman
SAS Employee

Are you using the Data Update node or the SQL Execute node? If you are not passing data to the data update node, and using database tables to determine the records to update (which it looks like you are doing based on the information provided) then I would use the SQL Execute node to push this update to the database and take advantage of it's memory/cpu.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1563 views
  • 0 likes
  • 5 in conversation