BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

Occasionally questions about Data Management products will come up through channels outside of the Data management Community. Rather than simply answer the question in a vacuum and not publicize it, we thought we would give you, the community members a chance to post your thoughts on the answer. After a week, we will post the answer that was provided by our internal product experts so that you can see how the various responses compared.

This week we present a question about commit intervals. As a community member, what would be your approach?

“Does anyone having guidelines around what the commit intervals should be when using the Data Output Insert or Update nodes? Generally I’m a believer in the fewer commits the better (i.e. more rows per commit) but I could listen to other arguments. I also understand that it might depend on the application – a real-time query (so 1 row per commit) vs. a batch job (lots of rows per commit).”

1 ACCEPTED SOLUTION

Accepted Solutions
anna_holland
SAS Employee

Hi Patrick, I am sorry for the delay. Our internal product experts took a different approach to answering this use case scenario question.

"Since every commit requires opening and closing the connection to the data, you are correct that you should do them no more often than necessary. But you also don’t want them building up too much – it can cause timing issues or memory usage problems. For large (multi-million record) table writes, a commit every 1000 records or so doesn’t have a significant impact on write time."

As a community member, would you take this approach? Or would you approach this scenario with a different solution? I look forward to seeing your responses!

Anna-Marie

View solution in original post

4 REPLIES 4
LinusH
Tourmaline | Level 20

Perhaps the answer is in your question, it depends...

If one could wish for something, it would be that the product  would help in choosing the right commit interval/size - it is quite a technical parameter. Maybe som kind selection from different scenarios, even an optional wizard for complicated situations...

A part from the thing you mention are functionality in the target database, ease of reload/rerun, risk for failure, and of course, performance requirements.

Data never sleeps
Patrick
Opal | Level 21

Hi

"After a week, we will post the answer that was provided by our internal product experts so that you can see how the various responses compared."

Where can I find this answer?

Thanks

Patrick

TomKari
Onyx | Level 15

Late to the party...

I'm very skeptical that a "technical" decision could be made. There are too many subjective factors. The key things that I consider:

Online versus batch - probably THE most important;

Cost of redoing the work - relates to the previous, but could have other issues;

Importance of not losing the work - same idea;

Data volumes, size of records, number of associated tables;

Technical factors relating to the data management tool - what works best with Oracle doesn't necessarily work best with Teradata;

Nature of the processor complex that I'm running on - power of processors, speed of disk, etc.;

Length of time of the update window, in cases of batch.

I don't see how a software product could know enough about most of these factors to correctly implement a decision.

Looking forward to seeing the "correct" answer!

  Tom

anna_holland
SAS Employee

Hi Patrick, I am sorry for the delay. Our internal product experts took a different approach to answering this use case scenario question.

"Since every commit requires opening and closing the connection to the data, you are correct that you should do them no more often than necessary. But you also don’t want them building up too much – it can cause timing issues or memory usage problems. For large (multi-million record) table writes, a commit every 1000 records or so doesn’t have a significant impact on write time."

As a community member, would you take this approach? Or would you approach this scenario with a different solution? I look forward to seeing your responses!

Anna-Marie

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1484 views
  • 0 likes
  • 4 in conversation