Re: User define formats slow down performance?

Frank_Boekamp · Posted 10-12-2016 03:44 AM

I tested applying user define formats on a large table in SAS VA, putting UDF's on a number of character variables. The test results:

The table size decreased by factor 3

The time to view reports in SAS VA using this table increased by factor 2 !

This is not what I expected. As in SAS Base I expected a significant improvement in performance.

Am I missing something? Why would performance decrease using UDF's?

LinusH · Posted 10-12-2016 04:57 AM

What version are you using?

Data never sleeps

Kurt_Bremser · Posted 10-12-2016 05:18 AM

VA is based on an in-memory backend server. Therefore the reduction in data size does not have the effect that it has on a conventional server that (mostly) reads from disk.

But the additional application of a format causes additional CPU time spent on the evaluation of the formats. That causes the slowdown.

So I'd declare a rule-of-thumb: for storage and pre-analysis manipulation, use raw values with formats; for analysis in VA, apply the format and use the formatted values.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Frank_Boekamp · Posted 10-12-2016 05:37 AM

I understand IO is no longer in the chain of processing using SAS VA, this is why SAS VA is so much quicker than conventional tools that have to transfer data first in internal memory from staged data.

But: size still matters!

Data still has to be processed bij CPU to execute queries and presenting results. So when using UDF's with SAS VA I still expect an improvement in performance with large tables because much less data has to be processed by CPU on the server. The UDF's only have to be applied on the end result: that is the graphical object in SAS VA on the PC.

We tested this with SAS VA 7.1 (we currently are using SAS VA 7.3).

Kurt_Bremser · Posted 10-12-2016 06:03 AM

On the low level of CPU assembler (or micro-) code, conditional evaluations are the most costly operations.

Applying a format involves lots of those, while piping data through 64- and 128-bit memory buses at GHz speeds takes literally no time at all.

The overhead for format-evaluation in conventional Base SAS environments is completely hidden by the reduced I/O, which is several orders of magnitude slower than memory access and ALU operations.

Believe me, I've had this discussion back when I studied computer science.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Frank_Boekamp · Posted 10-12-2016 07:57 AM

Hi Kurt,

I don not understand the part about format evaluation. We don't use variables with UDF's in any filtering (using where or having logic). As I understand the query on the server is based on the unformatted data and the UDF's are only applied to the (very small) result table. I don't know were the UDF's are applied, it could be on the server side or the client side, only SAS knows. Are the UDF's pre-loaded in memory? only SAS knows. The UDF's we used for testing are very small tables (about 5 to 10 rows each). Why would it take so long to join a smal result table with some small UDF's? There is no logic in that for me.

At the Global Forum 2016 in LV some guys of SAS told me it was a good idea to apply UDF's to improve performance. Maybe we have configured something wrong? I hope someone can shine a light on me.

Cheers,

Frank

Kurt_Bremser · Posted 10-12-2016 08:32 AM

You are using character formats. So you have something like

proc format;
value testfmt
  'A' = 'Value A'
  'B' = 'Value B'
  'C' = 'Value C'
;
run;

When SAS applies this format to a character value, deep down the CPU code looks like that:

- if register1 = (binary code for 'A') (output memory cells holding "Value A')

- else if register1 = (binary code for 'B') ......

and so on. So you see it develops into a series of comparisons that have to be done, and those are costly, in terms of CPU cycles.

Sometimes, when following scalar types, the compiler can optimize that by indexing into an array structure, but when you have distinct values that cannot be used as an index by themselves, that won't work.

So the "simple" (on the surface) process of exchanging the raw value with its formatted counterpart becomes a rather tedious operation (for the CPU), while simply streaming longer strings to the output from a given memory location is a very time-effective one.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

olli · Posted 10-12-2016 10:20 AM

Hi Frank,

I have run into the same issue of user defined formats (particularly those with very high cardinality) slowing down report performance whenever data items with UDFs are being used on the report. I have found a workaround for character data (that I mostly use UDFs for) where I create an additional data item with "[raw]"-modifier to force VA to use unformatted values (i.e. define a new data item with something like the following formula: 'Data Item with UDF'n[Raw]) and then use this new data item everywhere instead of the original. This speeds up the report significantly and for some reason all the values are still shown as formatted according to the specified UDF.

I am using VA 7.1 in a non-distributed environment.

Frank_Boekamp · Posted 10-12-2016 10:55 AM

Hi Olli,

This is great tip I will try for sure.

Thanks!

Frank_Boekamp · Posted 10-12-2016 10:53 AM

Hi Kurt,

Your explanation is clear to me and I can understand that the "simple" (on the surface) process of exchanging the raw value with its formatted counterpart becomes a rather tedious operation (for the CPU).

This process would indeed need lots of CPU cycles if the UDF's are applied on the input table(s) of a query. In that case all needed variables in the (large) input table(s) with UDF's need to be converted applying the UDF's. But: this operation does not need to be executed om the (large) input table(s) of a query. The UDF's only have to be applied to the small output table of the query (that is the result table that will be sent back to the client PC). This is why the results are puzzling me.

Cheers,

Frank

SASKiwi · Posted 10-12-2016 05:56 PM

Do you have access to VA Environment Manager? It may be helpful to check the VA App server performance stats when you are using UDFs versus when you are not. Is CPU usage spiking for example?

Also how many formats are being applied?

Frank_Boekamp · Posted 10-17-2016 08:51 AM

We did some new testing, now on SAS VA 7.3 instead of SAS VA 7.1. We did apply 15 character UDF's. The table has about 43 million rows.

Here are some (hopefull) results:

- The table with the formatted columns shrank about factor 2

- The performance of the report stays about the same

Only logical explanation I can think of for this result is the remark of Kurt Bremser (see this thread):

"piping data through 64- and 128-bit memory buses at GHz speeds takes literally no time at all"

Using new computed columns with [raw] added as Olli suggested didn't make a difference.

So if this all is solid testing my conclusion is that using UDF's can save you a lot of internal memory usage.

Kurt_Bremser · Posted 10-17-2016 09:04 AM

Good to hear (read).

Just today a new thought came to my mind.

What happens when unformatted larger strings are displayed?

The CPU looks for the bytes in memory and moves them to the display engine.

What happens when short strings with larger formatted values are displayed?

The CPU retrieves the short string from memory, evaluates the format, and then moves the longer formatted string (from memory, at least from cache) to the display engine.

So, in the end, as long as the unformatted larger strings don't cause the system to use paging space, the formatted approach doesn't really reduce the amount of work (data shuffling) the CPU has to do (rather the opposite), and with total in-memory processing that's the single bottleneck remaining.

The advantage of formatted strings here is simply that more data for concurrent analysis can be loaded into memory.

This is contrary to the classical approach when reading from disk. Here the advantage of reading less data from disk massively outweighs the penalty of format evaluation in RAM.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Registration is open