neha11_ch
Fluorite | Level 6

hi, 

 

I have an ETL job running on an SPD Server. The job, scheduled via LSF, has recently started running long: it does not complete within a day's time, and when the server restarts automatically, the job gets killed.

There are 32 lakhs (3.2 million) input records in the source table that need to be loaded into the target table.

The job hangs at the SCD Type 2 Loader; no new records get inserted into the SPDS target table.

Kindly help me with the solution here.

 

Thanks

Neha

7 REPLIES
LinusH
Tourmaline | Level 20

If it doesn't abend, it's probably a lack of resources, or some kind of really inefficient "programming"/ETL logic.

Try to work on a small subset of the sources, then carefully follow the log for the timing of each step.

Temporarily change from views to tables as transformation outputs.

If you are executing in DI Studio, you can get help via the "Collect runtime statistics" option, which presents the results as a graph or table.
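The subset-and-time approach can be sketched in plain SAS; the library and table names here are placeholders, not from the original job:

```sas
/* Illustrative sketch: run the flow against a small sample of the
   source and get detailed timing in the log. SPDSLIB.SOURCE_TBL is
   a hypothetical libref/table name. */
options fullstimer msglevel=i;      /* per-step CPU, memory, and I/O stats */

data work.sample;
    set spdslib.source_tbl (obs=1000);  /* 1,000 rows instead of 3.2M */
run;
```

Pointing the transformation at `work.sample` lets the log show which step dominates the runtime without waiting hours per test.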

Data never sleeps
AndrewHowell
Moderator

Some other questions/suggestions:

  • As LinusH already said, turn on runtime statistics.
  • Review the Workspace Server logs.
  • Monitor your resources - one or more of the data locations could be full (or SAS WORK, or SAS UTIL).
  • Are you running interactively or in batch?
  • Does this occur for just a particular source/target table? Are others running okay?
  • Include some precode to turn on LOTS of options and (if interactive) redirect the log to a text file so you don't lose it if you have to kill the job.
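A minimal example of such precode (the log path and option list are illustrative, not prescribed by DI Studio):

```sas
/* Hypothetical precode: verbose diagnostics plus a persistent log file
   so nothing is lost if the session has to be killed. */
options fullstimer msglevel=i source source2 mprint mlogic symbolgen;

proc printto log='/tmp/etl_debug.log' new;  /* placeholder path */
run;
```

Remember to run `proc printto; run;` at the end of the job to restore logging to the default destination.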
AndrewHowell
Moderator

A couple more:

  • Try a smaller source table
  • Try a (smaller?) source table of new records only
  • Try a (smaller?) source table of updated records only
  • Try altering your SCD2 options (I presume you're using Update/Insert, and if so are you using SAS or SQL sub-options?)
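For orientation, the Update/Insert pattern an SCD2 load typically performs can be sketched as two passes; this is a generic illustration, not the loader's actual generated code, and every table and column name here is hypothetical:

```sas
/* Sketch of a generic SCD Type 2 close-and-insert, assuming
   WORK.CHANGED holds changed/new source rows keyed on BUSINESS_KEY. */
proc sql;
    /* 1. Close out current rows whose source record has changed. */
    update spdslib.target_tbl t
        set valid_to = today(), current_flag = 0
        where t.current_flag = 1
          and exists (select 1 from work.changed c
                      where c.business_key = t.business_key);
quit;

/* 2. Append the new versions as current rows. */
proc append base=spdslib.target_tbl data=work.changed;
run;
```

On a 70 GB target, the correlated UPDATE in step 1 is the kind of statement that can dominate the runtime if it cannot use an index.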
neha11_ch
Fluorite | Level 6
This problem occurs for one particular source and target table only. All other similar ETL jobs run perfectly fine. I tried running with a small subset of data, just 10 records from the source, and it takes 3.5 hours to load into the target table.
LinusH
Tourmaline | Level 20
Are both source and target in SPD Server?
How large is the target table?
Any indexes?
What version of DI Studio?
The Type 2 Loader has many internal steps; look at the log from the 3.5 hr run and see which step consumes the most time.
In older versions there was a very inefficient SQL update in the Type 2 Loader.
Data never sleeps
neha11_ch
Fluorite | Level 6
Yes, both source and target are in SPDS.
The target table size is 70 GB.
The table has only one composite index.
The version of DI Studio is 4.901.
It is the insert query that loads the data into the target table that takes the longest to execute in the SCD Type 2 Loader transformation.
LinusH
Tourmaline | Level 20
70 GB is perhaps too large a table to be handled by the standardized code from the transformation.
Attach the log from the loader, and a PROC CONTENTS of the target table, if you need further guidance.
Be sure to set the MSGLEVEL=I system option, and assign SPDSWDEB=YES to get additional information about the processing.
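Those two diagnostic settings look like this in precode (SPDSWDEB is an SPD Server macro variable that echoes server-side WHERE-clause processing details to the log):

```sas
/* Diagnostics for the next run of the loader. */
options msglevel=i;     /* report index usage and engine notes */
%let SPDSWDEB=YES;      /* SPDS: show server-side processing detail */
```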
Data never sleeps

