1. Aim and Use Case
The aim of our benchmarking is to find the best storage option for a big data analytical repository accessed by CAS. I wanted to determine which solution would be optimal from the cost/performance point of view.
I assume that the primary usage pattern is to load data into CAS memory and process it there. After the necessary calculations (exploration, analysis, modeling, etc.) the dataset is saved for future use. No intermittent or intensive back-and-forth transfers are expected.
2. Setup
In total, seven options and variations will be tested: S3 and NFS on a RAID partition (each with three file formats: .sashdat, .sashdat with compression and .parquet), as well as SingleStore on EBS and on bottomless (S3-backed) storage.
Two test datasets will be used: one simulating an analytical table (wide, with many repeated values) and one simulating a raw input (larger, with more unique observations).
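To illustrate the kind of operations being timed, below is a minimal sketch of the save and load calls for the three file formats; the caslib name ("bench") and table names are placeholders, not the exact benchmark code.

proc casutil;
   /* plain sashdat */
   save casdata="abt_test" incaslib="casuser"
        outcaslib="bench" casout="abt_test.sashdat" replace;
   /* sashdat with compression */
   save casdata="abt_test" incaslib="casuser"
        outcaslib="bench" casout="abt_test_c.sashdat" compress replace;
   /* parquet (the file suffix requests the format) */
   save casdata="abt_test" incaslib="casuser"
        outcaslib="bench" casout="abt_test.parquet" replace;
   /* load a saved file back into CAS memory */
   load casdata="abt_test.sashdat" incaslib="bench"
        outcaslib="casuser" casout="abt_loaded" replace;
quit;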
2.1. Engines
2.1.1. AWS S3
This option relies on the object storage offered within Amazon Web Services: Simple Storage Service (S3). Its benefits include virtually unlimited storage that does not have to be preallocated and resizes dynamically. The cost is also attractive, from roughly $0.023 per GB-month on the standard tier down to about $0.004 on the coldest tiers, and data can easily be moved between tiers to reduce the bill even further. It has to be noted, though, that there is a charge associated with accessing the data, and the colder the storage tier, the higher that charge.
Another benefit is that no additional software is needed, as CAS can access S3 directly.
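As a rough illustration (the caslib and bucket names are placeholders, and the exact options and credential handling depend on the environment and Viya release), an S3-backed caslib can be declared along these lines:

caslib s3bench datasource=(
      srctype="s3",
      bucket="my-benchmark-bucket",   /* placeholder bucket name */
      region="us-east-1"              /* check the expected region format for your release */
   ) subdirs global;
/* Credentials are typically supplied via an authentication domain or the
   AWS credential chain rather than hard-coded in the caslib definition. */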
2.1.2. NFS on RAID disks
An alternative is to use an in-cluster NFS server backed by throughput-optimized EBS volumes combined into a RAID partition. This may result in much higher performance and a slight increase in durability. What is more, if the data is used sparingly, reads and writes may take advantage of the bursting capabilities of ST1 volumes to achieve even better results, limited only by the instance's maximum bandwidth to EBS.
The downsides include an at least 1.5 times higher cost per GB compared to baseline hot storage and, more importantly, much more maintenance, as well as reliance on components whose support policy may be different.
In our test scenario we will be using four ST1 EBS volumes of 5 TB each.
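Once the RAID device is exported over NFS and mounted on the CAS nodes, it is exposed to CAS as an ordinary path-based caslib; a minimal sketch is below (the mount point is an assumption, and a DNFS-type caslib can be used instead when parallel loading across workers is desired):

caslib nfsraid path="/mnt/nfsraid/cas-data"    /* assumed mount point of the NFS share */
   datasource=(srctype="path") subdirs global;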
2.1.3. SAS Viya with SingleStore (S2)
As a third option let us consider SingleStore. It uses its own compression and data access mechanisms but is tightly integrated with SAS CAS for seamless use. It relies on neither sashdat nor parquet, so it will be particularly informative to test this scenario.
It has to be kept in mind that S2 needs its own dedicated hardware as well as licenses. To use unlimited, bottomless storage, SingleStore Premium is required, which is roughly double the cost of the Standard edition. Maintaining SingleStore will be an additional administrative task, although not as time-consuming as the RAID+NFS scenario.
In the test scenario we went contrary to the SAS recommendation of sizing SingleStore at 1/3 of the CAS vCPUs and decided to check the performance at full parity (1:1). Consequently, we have 1 master node (4 vCPU / 32 GB RAM) and 4 leaf nodes (8 vCPU / 64 GB RAM), each with a 5 TB ST1 EBS volume, and 8 partitions for the database. The EBS-based setup will be labeled S2PVC; using S3 via SingleStore (bottomless) will be labeled S2BLESS.
2.2. Test datasets
2.2.1. Analytical Base Table (ABT)
It simulates an input table for machine learning algorithms as well as the exploration tables typically used in reporting. There are a lot of repetitive values in the columns, which may help with compression. The code for generating the dataset is attached below:
/* Helper macros %parallelize, %myrand and %partition are defined elsewhere and not shown in this post. */
%macro bigabtset(outname, rows);
   /* Spread the requested row count across 32 CAS threads; %parallelize presumably
      produces casuser.rozrzucenie with a _step variable (rows per thread). */
   %parallelize(&rows., 32);

   data casuser.abt&outname.;
      set casuser.rozrzucenie;
      do i = 1 to _step;
         segment = byte(rand("integer", 65, 90)); /* random letter A - Z */
         /* four blocks of 100 columns with ~10, ~100, ~1000 and ~10000 distinct values */
         %do i = 1 %to 100;
            var10_&i = %myrand(10);
         %end;
         %do i = 1 %to 100;
            var100_&i = %myrand(100);
         %end;
         %do i = 1 %to 100;
            var1k_&i = %myrand(1000);
         %end;
         %do i = 1 %to 100;
            var10k_&i = %myrand(10000);
         %end;
         output;
      end;
      drop i _step;
   run;

   %partition(abt&outname.);
%mend;
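An invocation for, say, a 100-million-row variant might look as follows (the suffix and row count are purely illustrative):

%bigabtset(100m, 100000000);
/* Produces casuser.abt100m: segment plus 400 generated numeric columns
   (plus anything carried over from the helper table). */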
2.2.2. Fact table
It simulates a dataset that serves as an input fact table for further processing and is loosely based on a test scenario. It has multiple double-precision columns that will not benefit much from compression. The code for generating the dataset is attached below:
/* Fragment of a larger macro (the enclosing %macro statement is not shown);
   the %do loop below only compiles inside a macro. */
data &lib..meters&outname.;
   set casuser.rozrzucenie;
   length numer_licznika $18 dzien 8 kierunek $1; /* meter number, day, direction */
   array h{24} 8 h1-h24; /* array to hold hourly data */
   /* Generate data for each meter */
   do meter = 1 to _step;
      numer_licznika = "000" || put(rand('uniform')*1e15, z15.); /* || instead of + for concatenation */
      /* Generate data for each day */
      do i = 0 to &num_days - 1;
         dzien = &start_date + i;
         /* Generate a row for each direction */
         %let direction_count = %sysfunc(countw(&directions));
         %do d = 1 %to &direction_count;
            kierunek = "%scan(&directions, &d)";
            /* Generate hourly consumption */
            do hour = 1 to 24;
               h{hour} = rand("uniform") * 10;
            end;
            /* Output the row */
            output;
         %end;
      end;
   end;
   drop i meter hour _step;
run;
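The step above relies on several macro variables; purely illustrative values (not the original benchmark settings) could look like this:

/* Hypothetical parameter values - the real benchmark settings are not shown in the post. */
%let lib        = casuser;
%let outname    = test;
%let num_days   = 30;
%let start_date = "01JAN2023"d;  /* resolves to a date literal inside the step */
%let directions = P O;           /* placeholder single-character direction codes */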
3. Results
3.1. Scaling with the number of observations
The charts below show the size of both datasets after compression. The file sizes did not differ depending on the underlying storage engine (RAID/S3).
** NOTE **
The size for S2 is the one reported by the database engine itself. The actual size of the S3 bucket in particular is much greater, up to 500%. After a quick search we assumed that this can be significantly reduced via system settings, but we did not attempt to do so.
The charts below show how the time of read and write operations depends on the number of observations.
It appears safe to conclude that these numbers scale linearly. This is an important observation for the future, as it lets us perform benchmarking more quickly and with fewer resources.
3.2. Compromise between cost and performance
The chart in this subchapter shows each of the proposed technologies on two axes: the vertical one is the cost per gigabyte and the horizontal one is the total time needed to save and load a dataset. The color of each bubble indicates which of the two datasets it represents.
Important assumptions:
The cost for S3 assumes 1TB of daily save and load
The cost of S2 excludes necessary hardware and licenses!
The performance of S2 is measured on a system with CAS vCPU = S2 vCPU.
The S2 performance assumes no logs are stored.
We can immediately see that ABT can be efficiently compressed using parquet and SingleStore. These two options seem to be dominant among the alternatives.
For the Meters dataset the situation is much more nuanced. We can see much worse performance from the NFSRAID engine, with S2BLESS losing the least performance.
3.3. Operation types: saving and loading
The chart below shows the difference between the save and load operations. Please note the difference in scale for both axes.
A clear distinction between the two measured operations is visible, and their relative importance depends on the particular use case. For example, if a file is created once a month and loaded into memory every day, the saving time is not as important. Conversely, if big datasets are created just in case they are needed but are often deleted before ever being loaded, then it is the load time that may be disregarded.
An important observation is that compressing sashdat files greatly increases the time to save but significantly reduces the load time. Parquet, on the other hand, favours save time over load time, but the difference is not as pronounced.
3.4. Fairness check
To make sure all of the tested methods actually place the dataset in CAS memory, ready for quick use, we performed an additional check. After loading, we ran a quick MDSUMMARY action on all of the applicable variables to make sure that there are no outliers. The results are shown below.
As you can see, there are three groups of observations. The first is NFSRAID sashdat, which was a significant outlier, but only with the ABT files. The outlier status persisted regardless of the file size, and I tried correcting it to no avail. It is particularly perplexing that this effect is absent in the METERS dataset, even though both use the same code.
The second group is compressed sashdat, which is a lot slower than the rest of the file types (excluding the previously mentioned outlier). This may have to do with the fact that the data fits comfortably into the nodes' memory; the result might have been different if it did not.
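For reference, the fairness check boils down to running a summary action right after each load. The sketch below uses the related simple.summary action as a stand-in for the MDSUMMARY call used in the benchmark; the caslib, table and variable names are placeholders.

proc cas;
   /* Summarize a handful of the generated columns right after the load;
      in the benchmark this covered all applicable numeric variables. */
   simple.summary /
      table={caslib="casuser", name="abt_test"},
      inputs={"var10_1", "var100_1", "var1k_1"},
      subSet={"MIN", "MAX", "MEAN", "NMISS"},
      casOut={caslib="casuser", name="abt_check", replace=true};
run;
quit;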
4. Conclusions
There are a number of conclusions that we may draw from the experiment:
Both the data size and the time of operations scale linearly in the tested scope.
The parquet file type is not very suitable for storing fact-like data with mostly unique observations.
Compression of sashdat files significantly reduces the time to load a dataset at the expense of save time and extra computation.
The number of saves vs. loads should be considered when choosing the storage type as these may vary greatly.
SingleStore seems to be a viable option for a mix of ABT and fact tables at large data volumes, particularly when it is already present in the system for other reasons; otherwise a careful calculation of at least the license and additional hardware costs should be performed.
5. Further testing
A number of topics also seem interesting as hypotheses for future testing:
When does buying cheaper storage but larger CAS hardware make sense?
Testing SingleStore with the recommended (rather than enlarged) hardware relative to CAS.
Can compression on the filesystem level improve performance for NFSRAID?
How do different RAID types affect performance and cost?
When does burst EBS performance make a difference?
Does CAS node size make a difference? Should we prioritize the number of nodes or their size?
Is linear scaling preserved for files beyond the memory limitations of CAS nodes?