BookmarkSubscribeRSS Feed
Soumya
Calcite | Level 5

Hello,

I am using ( UNIQUESAVE=REP ) in my environment while updating SPDS table.But it has been observed that it is taking long long time when we have more records to update.

Can someone please help how I can tune the enviroment,so that update runs faster?

Thanks,

Soumyajit Das

11 REPLIES 11
LinusH
Tourmaline | Level 20

You give us merely nothing to go on.

How large is the target table?

Got a proper index, unique?

No of transaction rows?

Is the overall performance well in the environment?

HW sizing, memory/core parameter settings in SPDS...

Data never sleeps
Soumya
Calcite | Level 5

Hello,

We have several large tables.But when we have large number of rows to update to final table the proc append using UNIQUESAVE=REP takes abnormal time.

Below are the details of your questions. Can you tell me what is wrong in environment?Do we need to tune some SPDS options ?

Here are the below answer

How large is the target table?

Ans :  table size is 15 GB. for example for one table.

Got a proper index, unique?

Ans : Yes we have unique index in the table

No of transaction rows?

Ans : todays transaction rows 1422835,total rows in the table :233799286

Is the overall performance well in the environment?

Ans: we are facing abnormal real time when we have large number of rows to update.

HW sizing, memory/core parameter settings in SPDS...

Ans :

HW information :  We have 2 application server named bumblee and domino having below config:

Below is the system configuration for the servers Bumblebee and Domino.

For Bumblebee :

System Model: IBM,8205-E6B

Machine Serial Number: 1079CBP

Processor Type: PowerPC_POWER7

Processor Implementation Mode: POWER 7

Processor Version: PV_7_Compat

Number Of Processors: 16

Processor Clock Speed: 3550 MHz

CPU Type: 64-bit

Kernel Type: 64-bit

LPAR Info: 3 Bumblebee PROD

Memory Size: 98304 MB

Good Memory Size: 98304 MB

Platform Firmware level: AL720_090

Firmware Version: IBM,AL720_090

Console Login: enable

Auto Restart: true

Full Core: false

Paging Space Information

Total Paging Space: 98304MB

Percent Used: 1%

Volume Groups Information

==============================================================================

Active VGs

==============================================================================

utillocvg:

PV_NAME PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION

hdisk5 active 2047 0 00..00..00..00..00

hdisk6 active 399 0 00..00..00..00..00

hdisk7 active 4095 0 00..00..00..00..00

==============================================================================

sasworkvg:

PV_NAME PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION

For Domino :

System Model: IBM,8205-E6B

Machine Serial Number: 1079C9P

Processor Type: PowerPC_POWER7

Processor Implementation Mode: POWER 7

Processor Version: PV_7_Compat

Number Of Processors: 16

Processor Clock Speed: 3550 MHz

CPU Type: 64-bit

Kernel Type: 64-bit

LPAR Info: 4 Domino PROD

Memory Size: 98304 MB

Good Memory Size: 98304 MB

Platform Firmware level: AL720_090

Firmware Version: IBM,AL720_090

Console Login: enable

Auto Restart: true

Full Core: false

Paging Space Information

Total Paging Space: 98304MB

Percent Used: 1%

Volume Groups Information

==============================================================================

Inactive VGs

==============================================================================

altinst_rootvg

==============================================================================

Active VGs

==============================================================================

sasworkvg:

PV_NAME PV STATE          TOTAL PPs   FREE PPs    FREE DISTRIBUTION

hdisk1 active 4095 0 00..00..00..00..00

hdisk2 active 399 0 00..00..00..00..00

hdisk3 active 4095 0 00..00..00..00..00


SPDS core paramenter setting :


NOREVNUMUPDATE;

SORTSIZE=4096M;

STARSIZE=4096M;

INDEX_SORTSIZE=2048M;

GRPBYROWCACHE=4096M;

BINBUFSIZE=128K;

WORKPATH="/local/temp/spdswork/eip/work";

NOCOREFILE;

SEQIOBUFMIN=512K;

RANIOBUFMIN=4K;

MAXWHTHREADS=64;

MAXSEGRATIO=75;

WHERECOSTING;

RANDOMPLACEDPF;

MINPARTSIZE=512M;

TMPDOMAIN=SPDTEMP;

NLSTRANSCODE;

MINPORTNO=21100;

MAXPORTNO=21999;

WHEREAUDIT;

WHAUDLEN=4096;

SQLAUDLEN=4096;

SQLOPTS="reset no_qrwenable printlog plljoin nostarjoin=0 magic=103 buffersize=67108864 sqlhimem _method details='stj$dfrpdw$'" ;

IDLE_TIMEOUT=36000;

FMTDOMAIN=FORMATS;

FMTNAMENODE=localhost;

FMTNAMEPORT=5180;

LinusH
Tourmaline | Level 20

Now we are talking Smiley Happy

Been a while since I used UNIQUESAVE, but as I recall one could expect good performance.

It seems you've got a lot of memory, so raising INDEX_SORTSIZE= and SORTSIZE= wouldn't hurt. Perhaps, have you monitored the behaviour of your operation (CPU/MEMORY consumption)?

Also, be sure that you have SYNCADD=NO set.

If your target table is being fragmented according to PK sort order, it could be wise from time to time resort the whole table.

Data never sleeps
Soumya
Calcite | Level 5

Hello Linus,

Thanks for your reply. Smiley Happy

In our environment we are using default option which is SYNCADD=NO.Do you have any other idea instead of UNIQUESAVE=Rep for update which works faster?

Do you think if we sort the target table time to time then performance of update will increase?

If we increase memsize (currently we have memsize=512) then performance will increase?

LinusH
Tourmaline | Level 20

Hi,

UNIQUESAVE=Rep is similar to data step modyfy.

With approx 1% transaction/master ratio, one could suspect that would be the optimal strategy.

The table is not terrible big, so resort, apply indexs and test to see if there is any difference.

MEMSIZE is a SAS option, so i suspect that has limited effect on this process.

Again, monitor behaviour and adjust SPDS parameters/options if you are low on memory.

Data never sleeps
Soumya
Calcite | Level 5

Hello Linus,

Thanks for your reply.

I will try all the approach what you said.

Could you please help me what SPDS parameters/options should I use ? This is the current below setting:

SPDS core paramenter setting :


NOREVNUMUPDATE;

SORTSIZE=4096M;

STARSIZE=4096M;

INDEX_SORTSIZE=2048M;

GRPBYROWCACHE=4096M;

BINBUFSIZE=128K;

WORKPATH="/local/temp/spdswork/eip/work";

NOCOREFILE;

SEQIOBUFMIN=512K;

RANIOBUFMIN=4K;

MAXWHTHREADS=64;

MAXSEGRATIO=75;

WHERECOSTING;

RANDOMPLACEDPF;

MINPARTSIZE=512M;

TMPDOMAIN=SPDTEMP;

NLSTRANSCODE;

MINPORTNO=21100;

MAXPORTNO=21999;

WHEREAUDIT;

WHAUDLEN=4096;

SQLAUDLEN=4096;

SQLOPTS="reset no_qrwenable printlog plljoin nostarjoin=0 magic=103 buffersize=67108864 sqlhimem _method details='stj$dfrpdw$'" ;

IDLE_TIMEOUT=36000;

FMTDOMAIN=FORMATS;

FMTNAMENODE=localhost;

FMTNAMEPORT=5180;

Please suggest.

LinusH
Tourmaline | Level 20

First monitor, then adjust.

As mentioned earlier, INDEX_SORTSIZE and SORTSIZE are candidates for increasing.

Data never sleeps
Soumya
Calcite | Level 5

I have monitored the behavior already and observed that when we have more number of records to update then the job takes abnormal time.We do not face much problem while sorting.Do you still think that increasing of  INDEX_SORTSIZE and SORTSIZE will increase the performance of Update.Is there any impact of increasing of  INDEX_SORTSIZE and SORTSIZE to update?Please suggest.

Please see the below log comparing with yesterdays's and today's log.

NOTE: Updating table ...

MPRINT(ETLS_LOADER): proc append base = VL_SPDS.VL0201_217F_Purch_Doc_Dist ( UNIQUESAVE=REP ) data = WORK.W5MRNL8Y () force;

MPRINT(ETLS_LOADER): run;

NOTE: Appending WORK.W5MRNL8Y to VL_SPDS.VL0201_217F_PURCH_DOC_DIST.

NOTE: There were 857669 observations read from the data set WORK.W5MRNL8Y.

NOTE: 1660 observations added.

NOTE: The data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST has 857669 observations and 72 variables.

NOTE: Compressing data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST decreased size by 52.94 percent.

NOTE: PROCEDURE APPEND used (Total process time):

real time           1:41:53.81

user cpu time       4.69 seconds

system cpu time     0.47 seconds

memory 1761.40k

OS Memory           11560.00k

Timestamp 05.06.2015 kl 06.37

Page Faults 28

Page Reclaims 367

Page Swaps 0

Voluntary Context Switches        20335

Involuntary Context Switches      1990

Block Input Operations            0

Block Output Operations           0

     

yesterday this particular part took around 38 mins , even there were 13695 observatons added only.

NOTE: Updating table ...

MPRINT(ETLS_LOADER): proc append base = VL_SPDS.VL0201_217F_Purch_Doc_Dist ( UNIQUESAVE=REP ) data = WORK.W5MRNL8Y () force;

MPRINT(ETLS_LOADER): run;

NOTE: Appending WORK.W5MRNL8Y to VL_SPDS.VL0201_217F_PURCH_DOC_DIST.

NOTE: There were 857544 observations read from the data set WORK.W5MRNL8Y.

NOTE: 13695 observations added.

NOTE: The data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST has 857544 observations and 72 variables.

NOTE: Compressing data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST decreased size by 52.94 percent.

NOTE: PROCEDURE APPEND used (Total process time):

real time           38:06.05

user cpu time       4.13 seconds

system cpu time     0.34 seconds

memory 1761.40k

OS Memory           11560.00k

Timestamp 04.06.2015 kl 04.41

Page Faults 225

Page Reclaims 236

Page Swaps 0

Voluntary Context Switches        21507

Involuntary Context Switches      2027

Block Input Operations            0

Block Output Operations           0

LinusH
Tourmaline | Level 20

The SAS log is not enough for monitoring, the timings here shows only the SAS part, which is minimal in this operation. You may need to use OS tools for estimating CPU/MEMORY consumption.

Since this operation involves updating indexes, INDEX_SORTSIZE could be of interest.

Have you resorted the table according to the PK?

What is you settings of UNDO_POLICY= and SPDSAUNQ=?

If your table for some reason is to big for this operation, you may need to consider splitting it, and use a Dynamic Cluster instead. Requires a bit more coding, but could enhance query performance.

I assume that you have checked that the system is not busy doing a lot of other stuff?

Data never sleeps
Soumya
Calcite | Level 5

I have monitored the behavior already and observed that when we have more number of records to update then the job takes abnormal time.We do not face much problem while sorting.Do you still think that increasing of  INDEX_SORTSIZE and SORTSIZE will increase the performance of Update.Is there any impact of increasing of  INDEX_SORTSIZE and SORTSIZE to update?Please suggest.

Please see the below log comparing with yesterdays's and today's log.

NOTE: Updating table ...

MPRINT(ETLS_LOADER): proc append base = VL_SPDS.VL0201_217F_Purch_Doc_Dist ( UNIQUESAVE=REP ) data = WORK.W5MRNL8Y () force;

MPRINT(ETLS_LOADER): run;

NOTE: Appending WORK.W5MRNL8Y to VL_SPDS.VL0201_217F_PURCH_DOC_DIST.

NOTE: There were 857669 observations read from the data set WORK.W5MRNL8Y.

NOTE: 1660 observations added.

NOTE: The data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST has 857669 observations and 72 variables.

NOTE: Compressing data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST decreased size by 52.94 percent.

NOTE: PROCEDURE APPEND used (Total process time):

real time           1:41:53.81

user cpu time       4.69 seconds

system cpu time     0.47 seconds

memory 1761.40k

OS Memory           11560.00k

Timestamp 05.06.2015 kl 06.37

Page Faults 28

Page Reclaims 367

Page Swaps 0

Voluntary Context Switches        20335

Involuntary Context Switches      1990

Block Input Operations            0

Block Output Operations           0

     

yesterday this particular part took around 38 mins , even there were 13695 observatons added only.

NOTE: Updating table ...

MPRINT(ETLS_LOADER): proc append base = VL_SPDS.VL0201_217F_Purch_Doc_Dist ( UNIQUESAVE=REP ) data = WORK.W5MRNL8Y () force;

MPRINT(ETLS_LOADER): run;

NOTE: Appending WORK.W5MRNL8Y to VL_SPDS.VL0201_217F_PURCH_DOC_DIST.

NOTE: There were 857544 observations read from the data set WORK.W5MRNL8Y.

NOTE: 13695 observations added.

NOTE: The data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST has 857544 observations and 72 variables.

NOTE: Compressing data set VL_SPDS.VL0201_217F_PURCH_DOC_DIST decreased size by 52.94 percent.

NOTE: PROCEDURE APPEND used (Total process time):

real time           38:06.05

user cpu time       4.13 seconds

system cpu time     0.34 seconds

memory 1761.40k

OS Memory           11560.00k

Timestamp 04.06.2015 kl 04.41

Page Faults 225

Page Reclaims 236

Page Swaps 0

Voluntary Context Switches        21507

Involuntary Context Switches      2027

Block Input Operations            0

Block Output Operations           0

bheinsius
Lapis Lazuli | Level 10

for the record:

 

character compression increased my append time by 8 times.

binary compression increased my append time by 20 times.

 

so i left compression off.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 11 replies
  • 2200 views
  • 0 likes
  • 3 in conversation