saivenkat
Obsidian | Level 7

Hello Everyone!

 

When I copied a SAS file in Linux using the command dd if=have.sas7bdat of=test.sas7bdat to gauge the I/O throughput rate of the environment, I noticed the speed varies from 250 MB/s to 400 MB/s.

209194057728 bytes (209 GB) copied, 834.332 s, 251 MB/s

 

Whereas when I sort the data file using proc sort data=have; by key; run; it sorts the data in 621 seconds. The table is about 195 GB, so should I consider the SAS I/O rate to be 321 MB/s?
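(For reference: 195 GB is about 199,680 MB, and 199,680 MB / 621 s is roughly 321 MB/s. Note that the sort also writes the output and temporary utility files, so the actual I/O volume moved is higher than a single read pass.)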

 

Am I measuring the performance the right way?

 

Can you shed some light on measuring SAS performance the right way, in a standard manner like the dd command above in a Linux environment?

 

Thanks in advance!


8 REPLIES
ballardw
Super User

I am not sure about what the "right way" might be.

 

PROC SORT may not be the best procedure to do this with, using your approach, as there are temporary data sets involved as well.

If you are trying to improve the performance of sorts, you can look at options such as TAGSORT, which can significantly reduce temporary disk use.
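For example, a minimal sketch (library, dataset, and variable names are placeholders):

proc sort data=mylib.have out=work.sorted tagsort;
   by key;
run;

TAGSORT keeps only the BY variables and observation numbers in the temporary utility files, trading some extra processing time for much less temporary disk space.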

JBailey
Barite | Level 11

Hi @saivenkat

 

In the dd example you are reading data as is and writing it. Using PROC SORT you are adding a sort step to the process. I don't think this is comparable. 

 

Using a DATA step to copy the data from one location to another would be more applicable if you are interested in the pure I/O rate. I am reasonably sure that there are SGF papers that discuss SAS I/O.
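For example, a minimal sketch of a pure-copy test (assuming the source table sits in a permanent library, here called MYLIB):

options fullstimer;        /* report detailed resource usage in the log */

data work.copy_test;       /* straight read and write, no BY processing */
   set mylib.have;
run;

Dividing the dataset size by the real time reported in the log gives a rough SAS I/O rate to set against the dd figure.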

 

Best wishes,

Jeff

MarkWik
Quartz | Level 8

Plus, I do not understand the relevance of the comparison in the question: the SORT procedure versus the dd command. They serve different purposes, so treating them as achieving the same goal doesn't quite make sense. And where does performance come into it?

Am I missing something?

saivenkat
Obsidian | Level 7

I wanted to compare the I/O rate between the dd or cp commands in Linux and SAS procedures or a DATA step.

When I executed the step below, it took 18 minutes to process the datasets have_a and have_b, where each dataset is about 195 GB, so can I consider the SAS I/O rate here to be 368 MB/s? It looks good compared to the dd throughput on the Linux host where SAS compute is installed.

 

data work.censusmerge;
set have_a have_b;
by key;
run;

NOTE: There were 2176165646 observations read from the data set have_a.
NOTE: There were 2176165646 observations read from the data set have_b.
NOTE: The data set WORK.CENSUSMERGE has 4352331292 observations and 18 variables.
NOTE: DATA statement used (Total process time):
real time 18:05.02
user cpu time 9:59.07
system cpu time 8:05.97
memory 24984.06k
OS Memory 45516.00k
Timestamp 09/28/2018 09:27:49 PM
Step Count 6 Switch Count 1492
Page Faults 0
Page Reclaims 102891
Page Swaps 0
Voluntary Context Switches 5048
Involuntary Context Switches 1313
Block Input Operations 785244120
Block Output Operations 817162272
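(For reference: the two inputs total about 2 x 195 GB = 399,360 MB, read in 18:05 = 1,085 seconds, which is roughly 368 MB/s of input; the step also writes a similar volume to WORK, so the total I/O moved is about double that.)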

TomKari
Onyx | Level 15

I recommend that if you're going to try to do this kind of assessment, you keep it as simple as possible.

 

So in your case, that would be:

 

data work.censusmerge;
set have_a;
run;

 

taking out the second dataset and the BY statement. This will just result in a straight copy, which is what you're comparing with.

 

Then with your "dd" or "cp" command, copy a dataset that's the same size (or even copy the actual "have_a.sas7bdat" from the same disk that "have_a" is on to the same disk that SAS WORK is using).

 

Very important: make sure there's no compression involved. That would mess up the comparison.

 

What you should see is that SAS does the same thing at a slightly lower data rate; when I've done this in the past it's been between 50 and 80 percent of the "dd" data rate. That's because adding a layer of processing always slows things down.

 

This is assuming that you're happy with the data transfer rate you're getting with your "dd" command. If that's significantly slower than you expect, then you have a non-SAS issue to diagnose.

 

Once you know that SAS is performing this test with reasonable throughput, you can compare it to other SAS processes. As others have mentioned, SORT isn't a great throughput test, as there's a ton of data transfer and processing in the background that doesn't appear in the final result.

 

 

saivenkat
Obsidian | Level 7

Thank you very much for the detailed response!

 

I would like a few more insights about memory and CPU utilization.

 

If I allow the SAS programs to process a couple of billion records, free memory varies between 0 and 1% until the process completes, but free CPU never drops below 95%.

 

I applied a partitioning technique to the same amount of data using the dataset options FIRSTOBS= and OBS=, and allowed all the partitions to run concurrently. The jobs completed without any issue, and I noticed that both free memory and free CPU stayed below 1% until the process completed.
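For illustration, a minimal sketch of the idea (library, dataset, and split points are placeholders); each slice is submitted as its own concurrent batch SAS session:

data work.part1;
   set mylib.have(firstobs=1 obs=1000000000);    /* first slice              */
   /* ... processing for this slice ... */
run;

data work.part2;
   set mylib.have(firstobs=1000000001);          /* remaining observations   */
   /* ... processing for this slice ... */
run;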

 

Can I consider pushing CPU utilization beyond 95% to be an optimal way of processing the data? Do you foresee any risks of load failures under such conditions?

 

I had to ensure the conditions below to make the partitioning technique successful:

1) No temporary dataset allowed in the SAS program

2) No views exist in the SAS program

3) Distributed the partitions uniformly either by business key or total obs

 

Thanks again!

TomKari
Onyx | Level 15

First question: what effect did the partitioning have on elapsed time? For example, if your job without partitioning ran in 1 hour, and you used four partitions, I could see the total elapsed time after partitioning being anywhere between 15 minutes (the partitioning was very successful) and one hour (the partitioning had no effect whatsoever).

 

Another factor is the nature of the job, whether it's heavily I/O oriented or CPU oriented. That will have a huge impact on how much effect partitioning has.

 

Also, it matters how many processors are in your server, and how heavily it is loaded with other tasks. If you can monitor it with system monitoring tools, you might gain some insight into this.

 

Note that many of the SAS procedures are designed to use as much memory as is available to them, to run as fast as possible. CPU usage, on the other hand, will depend on the nature of the workload.
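If it helps, you can check what limits your session is running with, for example (a sketch; list whichever options you're interested in):

proc options option=(memsize sortsize cpucount) value;
run;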

 

Performance measurement on a modern server environment with a product like SAS is a very interesting process, but can be challenging.

saivenkat
Obsidian | Level 7

The partitioning technique is reducing the elapsed time by 50 to 75% (60 minutes reduced to 15 minutes; in another place, 80 minutes reduced to 25 minutes). Of that elapsed time, most is consumed by overhead processes like the ones below:

1) Preparing the source data (creating formats, sorting the source data, and determining the partition size when partitions are based on business keys, etc.)

2) Consolidating the partitioned target data into a single dataset

3) Cleaning up the intermediate tables

 

We have one processor chip; the CPU details are listed below...

 

cpuinfo: GenuineIntel Intel(R) Xeon(R) CPU E5-4655 v3 @ 2.90GHz
cpuinfo: Hz=3126.335 bogomips=5808.16
cpuinfo: ProcessorChips=1 PhyscalCores=6
cpuinfo: Hyperthreads=2 VirtualCPUs=48
# of CPUs: 48

 

Can you highlight the risks of load failures that we might expect in the future if the CPU utilization goes beyond 95%?

I am mainly interested in whether pushing CPU utilization up to 100% during off-peak hours (for batch loads), when it is the only process being executed, can be considered an optimal way of utilizing the resources.

I had seen issues with work tables and views (fixed by replacing these with permanent tables). SAS had difficulties closing the sessions when it had created work tables for processing, as highlighted in my previous post. I have executed it several times and ensured no errors are thrown. I want to be ready with fixes in advance if any issue can be expected in the future.

 

 

