Hello,
I am using ( UNIQUESAVE=REP ) in my environment while updating SPDS table.But it has been observed that it is taking long long time when we have more records to update.
Can someone please help how I can tune the enviroment,so that update runs faster?
Thanks,
Soumyajit Das
You give us merely nothing to go on.
How large is the target table?
Got a proper index, unique?
No of transaction rows?
Is the overall performance well in the environment?
HW sizing, memory/core parameter settings in SPDS...
Hello,
We have several large tables.But when we have large number of rows to update to final table the proc append using UNIQUESAVE=REP takes abnormal time.
Below are the details of your questions. Can you tell me what is wrong in environment?Do we need to tune some SPDS options ?
Here are the below answer
How large is the target table?
Ans : table size is 15 GB. for example for one table.
Got a proper index, unique?
Ans : Yes we have unique index in the table
No of transaction rows?
Ans : todays transaction rows 1422835,total rows in the table :233799286
Is the overall performance well in the environment?
Ans: we are facing abnormal real time when we have large number of rows to update.
HW sizing, memory/core parameter settings in SPDS...
Ans :
HW information : We have 2 application server named bumblee and domino having below config:
Below is the system configuration for the servers Bumblebee and Domino.
For Bumblebee :
System Model: IBM,8205-E6B
Machine Serial Number: 1079CBP
Processor Type: PowerPC_POWER7
Processor Implementation Mode: POWER 7
Processor Version: PV_7_Compat
Number Of Processors: 16
Processor Clock Speed: 3550 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
LPAR Info: 3 Bumblebee PROD
Memory Size: 98304 MB
Good Memory Size: 98304 MB
Platform Firmware level: AL720_090
Firmware Version: IBM,AL720_090
Console Login: enable
Auto Restart: true
Full Core: false
Paging Space Information
Total Paging Space: 98304MB
Percent Used: 1%
Volume Groups Information
==============================================================================
Active VGs
==============================================================================
utillocvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk5 active 2047 0 00..00..00..00..00
hdisk6 active 399 0 00..00..00..00..00
hdisk7 active 4095 0 00..00..00..00..00
==============================================================================
sasworkvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
For Domino :
System Model: IBM,8205-E6B
Machine Serial Number: 1079C9P
Processor Type: PowerPC_POWER7
Processor Implementation Mode: POWER 7
Processor Version: PV_7_Compat
Number Of Processors: 16
Processor Clock Speed: 3550 MHz
CPU Type: 64-bit
Kernel Type: 64-bit
LPAR Info: 4 Domino PROD
Memory Size: 98304 MB
Good Memory Size: 98304 MB
Platform Firmware level: AL720_090
Firmware Version: IBM,AL720_090
Console Login: enable
Auto Restart: true
Full Core: false
Paging Space Information
Total Paging Space: 98304MB
Percent Used: 1%
Volume Groups Information
==============================================================================
Inactive VGs
==============================================================================
altinst_rootvg
==============================================================================
Active VGs
==============================================================================
sasworkvg:
PV_NAME PV STATE TOTAL PPs FREE PPs FREE DISTRIBUTION
hdisk1 active 4095 0 00..00..00..00..00
hdisk2 active 399 0 00..00..00..00..00
hdisk3 active 4095 0 00..00..00..00..00
SPDS core paramenter setting :
NOREVNUMUPDATE;
SORTSIZE=4096M;
STARSIZE=4096M;
INDEX_SORTSIZE=2048M;
GRPBYROWCACHE=4096M;
BINBUFSIZE=128K;
WORKPATH="/local/temp/spdswork/eip/work";
NOCOREFILE;
SEQIOBUFMIN=512K;
RANIOBUFMIN=4K;
MAXWHTHREADS=64;
MAXSEGRATIO=75;
WHERECOSTING;
RANDOMPLACEDPF;
MINPARTSIZE=512M;
TMPDOMAIN=SPDTEMP;
NLSTRANSCODE;
MINPORTNO=21100;
MAXPORTNO=21999;
WHEREAUDIT;
WHAUDLEN=4096;
SQLAUDLEN=4096;
SQLOPTS="reset no_qrwenable printlog plljoin nostarjoin=0 magic=103 buffersize=67108864 sqlhimem _method details='stj$dfrpdw$'" ;
IDLE_TIMEOUT=36000;
FMTDOMAIN=FORMATS;
FMTNAMENODE=localhost;
FMTNAMEPORT=5180;
Now we are talking
Been a while since I used UNIQUESAVE, but as I recall one could expect good performance.
It seems you've got a lot of memory, so raising INDEX_SORTSIZE= and SORTSIZE= wouldn't hurt. Perhaps, have you monitored the behaviour of your operation (CPU/MEMORY consumption)?
Also, be sure that you have SYNCADD=NO set.
If your target table is being fragmented according to PK sort order, it could be wise from time to time resort the whole table.
Hello Linus,
Thanks for your reply.
In our environment we are using default option which is SYNCADD=NO.Do you have any other idea instead of UNIQUESAVE=Rep for update which works faster?
Do you think if we sort the target table time to time then performance of update will increase?
If we increase memsize (currently we have memsize=512) then performance will increase?
Hi,
UNIQUESAVE=Rep is similar to data step modyfy.
With approx 1% transaction/master ratio, one could suspect that would be the optimal strategy.
The table is not terrible big, so resort, apply indexs and test to see if there is any difference.
MEMSIZE is a SAS option, so i suspect that has limited effect on this process.
Again, monitor behaviour and adjust SPDS parameters/options if you are low on memory.
Hello Linus,
Thanks for your reply.
I will try all the approach what you said.
Could you please help me what SPDS parameters/options should I use ? This is the current below setting:
SPDS core paramenter setting :
NOREVNUMUPDATE;
SORTSIZE=4096M;
STARSIZE=4096M;
INDEX_SORTSIZE=2048M;
GRPBYROWCACHE=4096M;
BINBUFSIZE=128K;
WORKPATH="/local/temp/spdswork/eip/work";
NOCOREFILE;
SEQIOBUFMIN=512K;
RANIOBUFMIN=4K;
MAXWHTHREADS=64;
MAXSEGRATIO=75;
WHERECOSTING;
RANDOMPLACEDPF;
MINPARTSIZE=512M;
TMPDOMAIN=SPDTEMP;
NLSTRANSCODE;
MINPORTNO=21100;
MAXPORTNO=21999;
WHEREAUDIT;
WHAUDLEN=4096;
SQLAUDLEN=4096;
SQLOPTS="reset no_qrwenable printlog plljoin nostarjoin=0 magic=103 buffersize=67108864 sqlhimem _method details='stj$dfrpdw$'" ;
IDLE_TIMEOUT=36000;
FMTDOMAIN=FORMATS;
FMTNAMENODE=localhost;
FMTNAMEPORT=5180;
Please suggest.
First monitor, then adjust.
As mentioned earlier, INDEX_SORTSIZE and SORTSIZE are candidates for increasing.
I have monitored the behavior already and observed that when we have more number of records to update then the job takes abnormal time.We do not face much problem while sorting.Do you still think that increasing of INDEX_SORTSIZE and SORTSIZE will increase the performance of Update.Is there any impact of increasing of INDEX_SORTSIZE and SORTSIZE to update?Please suggest.
Please see the below log comparing with yesterdays's and today's log.
NOTE: Updating table ...
MPRINT(ETLS_LOADER): proc append base = VL_SPDS.VL0201_217F_Purch_Doc_Dist ( UNIQUESAVE=REP ) data = WORK.W5MRNL8Y () force;
MPRINT(ETLS_LOADER): run;
NOTE: Appending WORK.W5MRNL8Y to VL_SPDS.VL0201_217F_PURCH_DOC_DIST.
NOTE: There were 857669 observations read from the data set WORK.W5MRNL8Y.
NOTE: 1660 observations added.
NOTE: The data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST has 857669 observations and 72 variables.
NOTE: Compressing data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST decreased size by 52.94 percent.
NOTE: PROCEDURE APPEND used (Total process time):
real time 1:41:53.81
user cpu time 4.69 seconds
system cpu time 0.47 seconds
memory 1761.40k
OS Memory 11560.00k
Timestamp 05.06.2015 kl 06.37
Page Faults 28
Page Reclaims 367
Page Swaps 0
Voluntary Context Switches 20335
Involuntary Context Switches 1990
Block Input Operations 0
Block Output Operations 0
yesterday this particular part took around 38 mins , even there were 13695 observatons added only.
NOTE: Updating table ...
MPRINT(ETLS_LOADER): proc append base = VL_SPDS.VL0201_217F_Purch_Doc_Dist ( UNIQUESAVE=REP ) data = WORK.W5MRNL8Y () force;
MPRINT(ETLS_LOADER): run;
NOTE: Appending WORK.W5MRNL8Y to VL_SPDS.VL0201_217F_PURCH_DOC_DIST.
NOTE: There were 857544 observations read from the data set WORK.W5MRNL8Y.
NOTE: 13695 observations added.
NOTE: The data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST has 857544 observations and 72 variables.
NOTE: Compressing data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST decreased size by 52.94 percent.
NOTE: PROCEDURE APPEND used (Total process time):
real time 38:06.05
user cpu time 4.13 seconds
system cpu time 0.34 seconds
memory 1761.40k
OS Memory 11560.00k
Timestamp 04.06.2015 kl 04.41
Page Faults 225
Page Reclaims 236
Page Swaps 0
Voluntary Context Switches 21507
Involuntary Context Switches 2027
Block Input Operations 0
Block Output Operations 0
The SAS log is not enough for monitoring, the timings here shows only the SAS part, which is minimal in this operation. You may need to use OS tools for estimating CPU/MEMORY consumption.
Since this operation involves updating indexes, INDEX_SORTSIZE could be of interest.
Have you resorted the table according to the PK?
What is you settings of UNDO_POLICY= and SPDSAUNQ=?
If your table for some reason is to big for this operation, you may need to consider splitting it, and use a Dynamic Cluster instead. Requires a bit more coding, but could enhance query performance.
I assume that you have checked that the system is not busy doing a lot of other stuff?
I have monitored the behavior already and observed that when we have more number of records to update then the job takes abnormal time.We do not face much problem while sorting.Do you still think that increasing of INDEX_SORTSIZE and SORTSIZE will increase the performance of Update.Is there any impact of increasing of INDEX_SORTSIZE and SORTSIZE to update?Please suggest.
Please see the below log comparing with yesterdays's and today's log.
NOTE: Updating table ...
MPRINT(ETLS_LOADER): proc append base = VL_SPDS.VL0201_217F_Purch_Doc_Dist ( UNIQUESAVE=REP ) data = WORK.W5MRNL8Y () force;
MPRINT(ETLS_LOADER): run;
NOTE: Appending WORK.W5MRNL8Y to VL_SPDS.VL0201_217F_PURCH_DOC_DIST.
NOTE: There were 857669 observations read from the data set WORK.W5MRNL8Y.
NOTE: 1660 observations added.
NOTE: The data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST has 857669 observations and 72 variables.
NOTE: Compressing data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST decreased size by 52.94 percent.
NOTE: PROCEDURE APPEND used (Total process time):
real time 1:41:53.81
user cpu time 4.69 seconds
system cpu time 0.47 seconds
memory 1761.40k
OS Memory 11560.00k
Timestamp 05.06.2015 kl 06.37
Page Faults 28
Page Reclaims 367
Page Swaps 0
Voluntary Context Switches 20335
Involuntary Context Switches 1990
Block Input Operations 0
Block Output Operations 0
yesterday this particular part took around 38 mins , even there were 13695 observatons added only.
NOTE: Updating table ...
MPRINT(ETLS_LOADER): proc append base = VL_SPDS.VL0201_217F_Purch_Doc_Dist ( UNIQUESAVE=REP ) data = WORK.W5MRNL8Y () force;
MPRINT(ETLS_LOADER): run;
NOTE: Appending WORK.W5MRNL8Y to VL_SPDS.VL0201_217F_PURCH_DOC_DIST.
NOTE: There were 857544 observations read from the data set WORK.W5MRNL8Y.
NOTE: 13695 observations added.
NOTE: The data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST has 857544 observations and 72 variables.
NOTE: Compressing data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST decreased size by 52.94 percent.
NOTE: PROCEDURE APPEND used (Total process time):
real time 38:06.05
user cpu time 4.13 seconds
system cpu time 0.34 seconds
memory 1761.40k
OS Memory 11560.00k
Timestamp 04.06.2015 kl 04.41
Page Faults 225
Page Reclaims 236
Page Swaps 0
Voluntary Context Switches 21507
Involuntary Context Switches 2027
Block Input Operations 0
Block Output Operations 0
for the record:
character compression increased my append time by 8 times.
binary compression increased my append time by 20 times.
so i left compression off.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.