
SAS DI job going into long run

Occasional Contributor
Posts: 8

SAS DI job going into long run

hi, 

 

I have an ETL job running on an SPD Server. The job is scheduled via LSF and has recently started going into a long run: the ETL does not complete within a day, and when the server restarts automatically the job gets killed.

There are 3.2 million (32 lakh) input records in the source table that need to be loaded into the target table.

The job goes into a hung state at the SCD Type 2 Loader, and no new records get inserted into the SPDS target table.

Kindly help me with the solution here.

 

Thanks

Neha

Super User
Posts: 5,431

which

Posted in reply to neha11_ch

If it doesn't abend, it's probably a lack of resources, or some kind of really inefficient "programming"/ETL logic.

Try working on a small subset of the sources, then carefully follow the log for the timing of each step.

Temporarily change from views to tables as transformation outputs.

If you are executing in DI Studio, you could get help via the "Collect runtime statistics" option, which presents the results as a graph or table.
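
For example, a quick way to build a test subset is the OBS= data set option; the libref and table name below are placeholders, not your actual job's names:

    /* Pull a small, fixed slice of the source into WORK so each      */
    /* downstream step can be timed against the same small input.     */
    /* SPDSLIB.SOURCE_TABLE is a hypothetical libref/table name.      */
    data work.src_subset;
        set spdslib.source_table (obs=1000);
    run;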

Data never sleeps
Moderator
Posts: 252

Re: which

Some other questions/suggestions:

  • As LinusH already said, turn on runtime statistics.
  • Review the Workspace Server logs.
  • Monitor your resources: one or more of the data locations could be full (or SASWORK, or SASUTIL).
  • Are you running interactively or in batch?
  • Does this occur for just a particular source/target table? Are others running okay?
  • Include some precode to turn on LOTS of options and (if interactive) redirect the log to a text file so you don't lose it if you have to kill the job; a sketch follows below.
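
A minimal precode sketch (the log path is just a placeholder):

    /* Verbose logging: FULLSTIMER adds CPU/elapsed time per step,    */
    /* MSGLEVEL=I adds index and merge notes, MPRINT shows generated  */
    /* macro code. PROC PRINTTO redirects the log to a file so it     */
    /* survives if the session has to be killed.                      */
    options fullstimer msglevel=i mprint source source2 notes;
    proc printto log="/path/to/scd2_job_debug.log" new;
    run;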
Moderator
Posts: 252

Re: which

Posted in reply to AndrewHowell

A couple more:

  • Try a smaller source table
  • Try a (smaller?) source table of new records only
  • Try a (smaller?) source table of updated records only
  • Try altering your SCD2 options (I presume you're using Update/Insert, and if so, are you using the SAS or SQL sub-options?); a generic sketch of that pattern follows below.
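
For reference, a generic sketch of what an SQL-style close-and-insert (update/insert) pass can look like; the table and column names are made up for illustration and are not the code the transformation actually generates:

    proc sql;
        /* Close out the current version of any business key that     */
        /* arrives with changes: set the end date, clear the flag.    */
        update target.dim_customer
            set valid_to = today(), current_flag = 0
            where current_flag = 1
              and cust_id in (select cust_id from work.changed_rows);

        /* Insert the changed rows as the new current version.        */
        insert into target.dim_customer
               (cust_id, cust_name, valid_from, valid_to, current_flag)
        select cust_id, cust_name, today(), '31DEC9999'd, 1
            from work.changed_rows;
    quit;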
Occasional Contributor
Posts: 8

Re: SAS DI job going into long run

Posted in reply to neha11_ch
This problem occurs for a particular source and target table only. All other similar ETL jobs run perfectly fine. I tried running with a small subset of data (10 records from the source); it takes 3.5 hours to load into the target table.
Super User
Posts: 5,431

Re: SAS DI job going into long run

Posted in reply to neha11_ch
Are both source and target in SPD Server?
How large is the target table?
Any indexes?
What version of DI Studio?
The Type 2 Loader has many internal steps; look at the log from the 3.5-hour run and see which step consumes the most time.
In older versions there was a very inefficient SQL update in the Type 2 Loader.
Data never sleeps
Occasional Contributor
Posts: 8

Re: SAS DI job going into long run

Posted in reply to neha11_ch
Yes, both source and target are in SPDS.
The target table size is 70 GB.
The table has only one composite index.
The DI Studio version is 4.901.
It is the insert query that loads the data into the target table that takes the longest to execute in the SCD Type 2 Loader transformation.
Super User
Posts: 5,431

Re: SAS DI job going into long run

Posted in reply to neha11_ch
70 GB is perhaps too large a table to be handled by the standardised code from the transformation.
Attach the log from the loader and PROC CONTENTS output of the target table if you need further guidance.
Be sure to set the MSGLEVEL=I system option, and assign SPDSWDEB=YES to get additional information about the processing.
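
For example, in the job's precode (a minimal sketch, assuming the SPDS setting is made the usual way with %LET):

    /* MSGLEVEL=I prints informational notes (index usage, merges).   */
    /* SPDSWDEB=YES makes SPD Server echo WHERE-clause planning       */
    /* details to the log.                                            */
    options msglevel=i;
    %let spdswdeb = YES;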
Data never sleeps