Re: DataFlux Cluster Generation Issue

sht · Posted 01-03-2018 12:52 AM

Hi Team,

I am facing an issue with DataFlux jobs that were developed for cluster ID generation and are scheduled through Windows Scheduler.

If I execute the jobs manually, they run fine (around hrs), but under Windows Scheduler they take a long time (around 8-9 hrs).

Please suggest what the issue could be.

OS: Windows 8

DF Studio: 8.1

Data size: More than 40 lakh (4 million) records

Thank You

RonAgresta · Posted 01-04-2018 11:23 AM

Hi - are you calling the dmpexec command from your scheduler? There are various options you can set when invoking the command that may impact performance, like choosing to write a log or selecting options that override configuration file settings. There are also options for logging that might help you pinpoint the issue.
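To compare a manual run with a scheduled run on equal terms, it can help to launch the job through a small wrapper that records wall-clock time. The following is only a sketch: it assumes `dmpexec` is on the PATH and that `-j` (job file) and `-l` (log file) are the relevant options, which should be verified against the documentation for your version. The function names `build_command` and `run_timed` are made up for this illustration.

```python
import subprocess
import sys
import time


def build_command(job_file, log_file, exe="dmpexec"):
    """Build the argument list for one job run.

    Assumption: -j names the job file and -l names the log file.
    Check these against your installation before relying on them.
    """
    return [exe, "-j", job_file, "-l", log_file]


def run_timed(command):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(command)
    return result.returncode, time.perf_counter() - start


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python run_job.py <job file> <log file>
    code, seconds = run_timed(build_command(sys.argv[1], sys.argv[2]))
    print(f"exit code {code}, elapsed {seconds / 3600:.2f} h")
```

Running the same wrapper once from an interactive session and once from the scheduler gives two directly comparable timings, plus a log file per run.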

Another thought - I have seen cases in the past where dmpexec was executed by a user (different from the one used with Data Management Studio) that had different environment settings associated with it (like where the "temp" directory was located), and this impacted performance.
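One way to check for such differences is to capture the environment under each account and compare the two snapshots. The sketch below is generic Python, not a DataFlux feature; the list of variables in `KEYS` and the helper names `snapshot` and `diff` are illustrative choices.

```python
import getpass
import json
import os
import tempfile

# Variables worth comparing between accounts (illustrative; extend as needed).
KEYS = ("TEMP", "TMP", "PATH", "USERPROFILE")


def snapshot(environ=None):
    """Collect settings that commonly differ between user accounts."""
    env = os.environ if environ is None else environ
    snap = {key: env.get(key, "") for key in KEYS}
    try:
        snap["user"] = getpass.getuser()
    except Exception:
        snap["user"] = "unknown"
    snap["tempdir"] = tempfile.gettempdir()
    snap["cwd"] = os.getcwd()
    return snap


def diff(a, b):
    """Return {key: (value_in_a, value_in_b)} for every key that differs."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Run once interactively and once as a scheduled task, saving the output
    # each time, then load both JSON files and pass them to diff().
    print(json.dumps(snapshot(), indent=2))
```

If the scheduled task runs under a service or system account, the temp directory and working directory are the first values to look at.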

Ron

sht · Posted 01-16-2018 06:55 AM

Hi Ron,

Thanks for your reply.

I don't execute any dmpexec command.

These scheduled jobs used to run smoothly, but now they are taking longer than usual to execute.

Previously they finished in 3 hrs; now they take 7-9 hrs.

The number of rows has increased by barely 2-3 lakh.

Thank you.

RonAgresta · Posted 01-16-2018 07:56 AM

Review the product documentation. There are topics that specifically deal with how you should be calling your DMP jobs if you are not running them interactively.

Regardless, check a few things:

Make sure that each call by your scheduler to run the job doesn't spawn a new process that is taking up system resources.
If your job is accessing data, make sure that your queries are running as you would expect. You can turn on more verbose levels of logging to see database interaction or you can monitor that in your database log.
There are logging options you can set that will generate what are called node profile metrics. These logs will tell you how long each node is processing rather than the sum for the entire job. This is also in the documentation.
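For the first check above (extra processes left behind by the scheduler), a quick way to see how many copies of an executable are running is to count entries in the Windows `tasklist` output. This is a rough sketch: the image name `dmpexec.exe` is only a placeholder, so substitute whatever executable your scheduled task actually launches, and the helper names are made up for this example.

```python
import csv
import io
import subprocess
import sys
from collections import Counter


def count_images(tasklist_csv):
    """Count processes per image name from `tasklist /FO CSV /NH` output."""
    rows = csv.reader(io.StringIO(tasklist_csv))
    # The first CSV column is the image name, e.g. "explorer.exe".
    return Counter(row[0].lower() for row in rows if row)


def running_processes():
    """Windows only: capture the current process list via tasklist."""
    out = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return count_images(out)


if __name__ == "__main__" and sys.platform == "win32":
    counts = running_processes()
    # Placeholder image name: replace with the process your task starts.
    print("dmpexec.exe instances:", counts.get("dmpexec.exe", 0))
```

More than one instance while only a single scheduled run should be active suggests that earlier runs never exited and are competing for memory and CPU.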

Ron
