In this Juletip, I will show you how to combine multiple storage volumes into a single logical volume that you can format and mount in Linux.
Using this technique, I achieve a disk I/O throughput of 426-435 MB/s at the file system level using 4 disks, compared to 153-167 MB/s using a single disk.
In my examples, I will use Azure. However, this juletip works just as well on other cloud platforms such as AWS or GCP as well as on premises.
First, we need to attach the disks.
In the Azure Portal, navigate to the VM and click Disks.
This will show the disks attached to your VM.
In this case, I only have the OS disk attached.
We will now attach 5 new disks:
As we can see here, the single 400 GB disk comes with max IOPS of 2300 and max throughput of 150 MB/s.
So that is one way to add 400 GB storage.
We also add 4 disks of 100 GB each. These come with max IOPS of 500 and max throughput of 100 MB/s each.
While the aggregate IOPS of the four 100 GB disks is slightly lower than that of the single big disk (2000 vs. 2300), their aggregate max throughput is significantly higher (400 MB/s vs. 150 MB/s).
This is exactly the tradeoff we are interested in when running a typical SAS workload, because with SAS, raw sequential I/O throughput is typically more important than IOPS.
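As a side note, the disks can also be attached from the command line with the Azure CLI instead of the portal. A minimal sketch, run from your own machine or Cloud Shell, with myResourceGroup and the datadisk names as placeholders:
# Attach 4 new 100 GB data disks to the VM
for i in 1 2 3 4; do
  az vm disk attach --resource-group myResourceGroup --vm-name vm04 \
    --name datadisk$i --new --size-gb 100
done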
Now let’s connect to the VM via SSH and log in as root.
The command ‘lsblk’ lets us see attached disks:
~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 29.9G 0 part /
├─sda14 8:14 0 3M 0 part
└─sda15 8:15 0 124M 0 part /boot/efi
sdb 8:16 0 100G 0 disk
sdc 8:32 0 100G 0 disk
sdd 8:48 0 100G 0 disk
sde 8:64 0 100G 0 disk
Notice the 4 disks of 100 GB.
These are the ones that we will combine into a single logical volume.
Before we can do that, we need to create a partition table on the disks. (Strictly speaking, LVM can use whole unpartitioned disks as well, but a partition table makes it obvious to other tools that the disk is in use.)
We do that using the cfdisk command:
~# cfdisk /dev/sdb
We will just create a single partition utilizing all the space on the disk:
Disk: /dev/sdb
Size: 100 GiB, 107374182400 bytes, 209715200 sectors
Label: gpt, identifier: FBFC3E10-FC1A-8647-96BD-618BBC662E38
Device Start End Sectors Size Type
>> /dev/sdb1 2048 209715166 209713119 100G Linux filesystem
┌─────────────────────────────────────────────────────────────────────────────┐
│Partition UUID: 9C66A040-5826-3147-BDDF-1456609BB07D │
│Partition type: Linux filesystem (0FC63DAF-8483-4772-8E79-3D69D8477DE4) │
└─────────────────────────────────────────────────────────────────────────────┘
[ Delete ] [ Resize ] [ Quit ] [ Type ] [ Help ] [ Write ]
[ Dump ]
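By the way, if you would rather script this step than drive cfdisk interactively, the same one-partition GPT layout can be created non-interactively with sfdisk. A minimal sketch (not what I did above):
~# printf 'label: gpt\n,,L\n' | sfdisk /dev/sdb
Here ',,L' means default start, all available space, and partition type Linux filesystem.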
The lsblk command shows us that the partition has been made:
~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 29.9G 0 part /
├─sda14 8:14 0 3M 0 part
└─sda15 8:15 0 124M 0 part /boot/efi
sdb 8:16 0 100G 0 disk
└─sdb1 8:17 0 100G 0 part
sdc 8:32 0 100G 0 disk
sdd 8:48 0 100G 0 disk
sde 8:64 0 100G 0 disk
We will now use sfdisk to copy the partition table onto the remaining 3 disks:
~# for disk in /dev/sd[c-e]; do sfdisk -d /dev/sdb | sfdisk --force $disk; done
Checking that no-one is using this disk right now ... OK
Disk /dev/sdc: 100 GiB, 107374182400 bytes, 209715200 sectors
Disk model: Virtual Disk
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new GPT disklabel (GUID: FBFC3E10-FC1A-8647-96BD-618BBC662E38).
/dev/sdc1: Created a new partition 1 of type 'Linux filesystem' and of size 100 GiB.
/dev/sdc2: Done.
New situation:
Disklabel type: gpt
Disk identifier: FBFC3E10-FC1A-8647-96BD-618BBC662E38
Device Start End Sectors Size Type
/dev/sdc1 2048 209715166 209713119 100G Linux filesystem
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
Checking that no-one is using this disk right now ... OK
Disk /dev/sdd: 100 GiB, 107374182400 bytes, 209715200 sectors
Disk model: Virtual Disk
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new GPT disklabel (GUID: FBFC3E10-FC1A-8647-96BD-618BBC662E38).
/dev/sdd1: Created a new partition 1 of type 'Linux filesystem' and of size 100 GiB.
/dev/sdd2: Done.
New situation:
Disklabel type: gpt
Disk identifier: FBFC3E10-FC1A-8647-96BD-618BBC662E38
Device Start End Sectors Size Type
/dev/sdd1 2048 209715166 209713119 100G Linux filesystem
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
Checking that no-one is using this disk right now ... OK
Disk /dev/sde: 100 GiB, 107374182400 bytes, 209715200 sectors
Disk model: Virtual Disk
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new GPT disklabel (GUID: FBFC3E10-FC1A-8647-96BD-618BBC662E38).
/dev/sde1: Created a new partition 1 of type 'Linux filesystem' and of size 100 GiB.
/dev/sde2: Done.
New situation:
Disklabel type: gpt
Disk identifier: FBFC3E10-FC1A-8647-96BD-618BBC662E38
Device Start End Sectors Size Type
/dev/sde1 2048 209715166 209713119 100G Linux filesystem
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
We will verify using lsblk:
~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 29.9G 0 part /
├─sda14 8:14 0 3M 0 part
└─sda15 8:15 0 124M 0 part /boot/efi
sdb 8:16 0 100G 0 disk
└─sdb1 8:17 0 100G 0 part
sdc 8:32 0 100G 0 disk
└─sdc1 8:33 0 100G 0 part
sdd 8:48 0 100G 0 disk
└─sdd1 8:49 0 100G 0 part
sde 8:64 0 100G 0 disk
└─sde1 8:65 0 100G 0 part
Now we need to make these partitions available for LVM. We do that by adding them as physical volumes:
~# pvcreate /dev/sd[b-e]1
Physical volume "/dev/sdb1" successfully created.
Physical volume "/dev/sdc1" successfully created.
Physical volume "/dev/sdd1" successfully created.
Physical volume "/dev/sde1" successfully created.
We can list available physical volumes with the ‘pvs’ command:
~# pvs
PV VG Fmt Attr PSize PFree
/dev/sdb1 lvm2 --- <100.00g <100.00g
/dev/sdc1 lvm2 --- <100.00g <100.00g
/dev/sdd1 lvm2 --- <100.00g <100.00g
/dev/sde1 lvm2 --- <100.00g <100.00g
Our logical volume must reside in a volume group, so we will now create a volume group containing the newly created physical volumes:
~# vgcreate vg1 /dev/sd[b-e]1
Volume group "vg1" successfully created
We can list volume groups using the ‘vgs’ command:
~# vgs
VG #PV #LV #SN Attr VSize VFree
vg1 4 0 0 wz--n- 399.98g 399.98g
We are now ready to create the actual logical volume. We will create it with 4 stripes, each with a stripe size of 256 KiB:
~# lvcreate -i 4 -I 256k -n lv1 -L 399G vg1
Logical volume "lv1" created.
We can list logical volumes using the ‘lvs’ command:
~# lvs
LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lv1 vg1 -wi-a----- 399.00g
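Note that asking for a fixed -L 399G leaves a sliver of the 399.98 GiB volume group unused. If you prefer the logical volume to consume all free extents, lvcreate also accepts a percentage instead of a fixed size:
~# lvcreate -i 4 -I 256k -n lv1 -l 100%FREE vg1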
Before we can mount and use the logical volume, we need to format it with a file system. I am using the XFS file system:
~# mkfs.xfs /dev/vg1/lv1
meta-data=/dev/vg1/lv1 isize=512 agcount=16, agsize=6537152 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1 bigtime=0
data = bsize=4096 blocks=104594432, imaxpct=25
= sunit=64 swidth=256 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=51071, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Discarding blocks...Done.
Notice in the mkfs.xfs output that the stripe geometry was detected automatically from LVM: sunit=64 blocks of 4 KiB equals our 256 KiB stripe size, and swidth=256 blocks equals the full 1 MiB stripe width across the 4 disks.
We are now ready to mount and use the logical volume:
~# mount /dev/vg1/lv1 /saswork/
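Note that this assumes the mount point /saswork already exists. If it does not, create it first, and consider an /etc/fstab entry so the volume is mounted again after a reboot. A minimal sketch:
~# mkdir -p /saswork
~# mount /dev/vg1/lv1 /saswork
~# echo '/dev/vg1/lv1 /saswork xfs defaults,nofail 0 2' >> /etc/fstab
The nofail option keeps the boot from hanging if the disks are ever detached.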
Test write speed:
~# time head -c 200G /dev/zero | pv > /saswork/zero.dat
200GiB 0:07:50 [ 435MiB/s] [ <=> ]
real 7m50.776s
user 0m13.978s
sys 3m38.363s
We just wrote 200 GB of data with a throughput of 435 MB/s. Not bad.
Let’s see how fast we can read the file.
First, we drop all caches to ensure that we actually read the file from the storage layer and not just from the file system cache in RAM.
~# sync ; echo 3 > /proc/sys/vm/drop_caches
Then we measure read throughput:
~# time pv /saswork/zero.dat > /dev/null
200GiB 0:08:00 [ 426MiB/s] [=================================>] 100%
real 8m0.308s
user 0m0.861s
sys 1m16.904s
We read the 200 GB file back at 426 MB/s.
For comparison, on a single storage volume I was able to read and write at 153 MB/s and 167 MB/s, respectively (terminal log below).
The table below summarizes the results:
Throughput                              Read       Write
Single volume                           153 MB/s   167 MB/s
Logical volume striped over 4 volumes   426 MB/s   435 MB/s
Speed increase                          178 %      160 %
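By the way, head and pv give a quick-and-dirty sequential number. For more controlled measurements, fio supports direct I/O and configurable block sizes. A minimal sequential write test might look like this (assuming fio is installed; the job and file names are arbitrary):
~# fio --name=seqwrite --filename=/saswork/fio.dat --rw=write --bs=1M \
     --size=10G --direct=1 --numjobs=1 --group_reporting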
Terminal log of the single-volume test below:
(root@vm04) (2022-11-22 14:50:39) [0]
~# lvremove vg1
Do you really want to remove active logical volume vg1/lv1? [y/n]: y
Logical volume "lv1" successfully removed
(root@vm04) (2022-11-22 14:50:50) [0]
~# vgremove vg1
Volume group "vg1" successfully removed
(root@vm04) (2022-11-22 14:50:55) [0]
~# pvremove /dev/sd[b-e]1
Labels on physical volume "/dev/sdb1" successfully wiped.
Labels on physical volume "/dev/sdc1" successfully wiped.
Labels on physical volume "/dev/sdd1" successfully wiped.
Labels on physical volume "/dev/sde1" successfully wiped.
(root@vm04) (2022-11-22 14:51:13) [0]
~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 29.9G 0 part /
├─sda14 8:14 0 3M 0 part
└─sda15 8:15 0 124M 0 part /boot/efi
sdb 8:16 0 100G 0 disk
└─sdb1 8:17 0 100G 0 part
sdc 8:32 0 100G 0 disk
└─sdc1 8:33 0 100G 0 part
sdd 8:48 0 100G 0 disk
└─sdd1 8:49 0 100G 0 part
sde 8:64 0 100G 0 disk
└─sde1 8:65 0 100G 0 part
(root@vm04) (2022-11-22 14:51:19) [0]
~# mkfs.xfs /dev/sdc1
meta-data=/dev/sdc1 isize=512 agcount=4, agsize=6553535 blks
= sectsz=4096 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1 bigtime=0
data = bsize=4096 blocks=26214139, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=12799, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Discarding blocks...Done.
(root@vm04) (2022-11-22 14:51:31) [0]
~# mount /dev/sdc1 /saswork/
(root@vm04) (2022-11-22 14:51:38) [0]
~# time head -c 90G /dev/zero | pv > /saswork/zero.dat
90.0GiB 0:09:09 [ 167MiB/s] [ <=>]
real 9m9.727s
user 0m7.200s
sys 1m53.040s
(root@vm04) (2022-11-22 15:01:05) [0]
~# sync ; echo 3 > /proc/sys/vm/drop_caches
(root@vm04) (2022-11-22 15:01:23) [0]
~# time pv /saswork/zero.dat > /dev/null
90.0GiB 0:10:01 [ 153MiB/s] [================================================================================>] 100%
real 10m1.976s
user 0m0.522s
sys 0m38.955s