Thanks Kurt & Gergely. In my second experiment, the data is distributed across 3 disks and the same proc sql (star join) is executed, but performance still has not improved. In my case, IO doesn't seem to be the bottleneck: whenever I change the SGIO settings, IO throughput goes up or down while the execution time stays the same, and the perfmon capture results confirm this. On my server, CPU utilization seems to be the problem. It has 12 cores, but only one core is used steadily during the 10 minutes of execution; the other cores are used only once, midway through, when the sort process runs. My understanding was that SPDE allows multithreading across cores, but that doesn't seem to be true. After reading your reply carefully, it sounds like a single IO request fetches data from one disk or from multiple disks, so SPDE is meant to use that single IO to read from all the disks at once. That's why SPDE was built to split the data and combine it again after reading. But nowadays superior hardware is cheap, so maybe IO is less of a problem than it used to be.
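
A minimal sketch of the kind of setup described above, for anyone trying to reproduce it. The library name and all paths are placeholders (assumptions, not the actual configuration), and the options shown only enable threading and log diagnostics; they do not guarantee that a given proc sql step runs multithreaded.

```sas
/* Allow threaded processing and let SAS use the actual core count.   */
/* MSGLEVEL=I and FULLSTIMER add extra notes and timings to the log.  */
options threads cpucount=actual msglevel=i fullstimer;

/* SPDE library with data partitions spread over three disks.         */
/* 'spdelib' and every path below are hypothetical placeholders.      */
libname spdelib spde 'D:\spde\meta'
    datapath=('D:\spde\data' 'E:\spde\data' 'F:\spde\data');
```

With FULLSTIMER on, comparing real time with CPU time for each step in the log gives a rough check: if the two are about equal, the step most likely ran on a single thread, which would match the one-busy-core pattern seen in perfmon.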