ThomasDamgaard
SAS Employee

 

In this Juletip (Christmas tip), I will show you how to combine multiple storage volumes into a single logical volume that you can format and mount in Linux.

Using this technique, I achieve a disk I/O throughput of around 425-435 MB/s at the file system level using 4 disks, compared to around 150-170 MB/s using a single disk.

 

In my examples, I will use Azure. However, the technique works just as well on other cloud platforms such as AWS or GCP, as well as on premises.

First, we need to attach the disks.

 

In the Azure Portal, navigate to the VM and then click Disks.

[Screenshot: the VM's Disks blade in the Azure Portal]

 

 

This shows the disks attached to your VM.

 

[Screenshot: list of disks currently attached to the VM]

 

In this case, I only have the OS disk attached.

 

We will now attach 5 new disks:

 

[Screenshot: adding the new data disks to the VM]

As we can see here, the single 400 GB disk comes with a max IOPS of 2300 and a max throughput of 150 MB/s.

So that is one way to add 400 GB of storage.

We also add 4 disks of 100 GB each. These come with a max IOPS of 500 and a max throughput of 100 MB/s each.

While the aggregate IOPS of the 100 GB disks is slightly lower than that of the single big disk (2000 vs 2300), the aggregate max throughput is significantly higher (400 MB/s vs 150 MB/s).

This is exactly the tradeoff we are interested in for a typical SAS workload, because with SAS, raw throughput on sequential I/O is typically more important than IOPS.
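
If you prefer scripting over the Portal, the same data disks can be attached with the Azure CLI. A minimal sketch, assuming placeholder names for the resource group, VM, and disks (adjust these to your environment):

~# for i in 1 2 3 4; do
     az vm disk attach --resource-group my-rg --vm-name my-sas-vm \
       --name "datadisk-$i" --new --size-gb 100
   done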

 

Now let's connect to the VM via SSH and log in as root.

 

The command ‘lsblk’ lets us see attached disks:

 

 

~# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda       8:0    0   30G  0 disk
├─sda1    8:1    0 29.9G  0 part /
├─sda14   8:14   0    3M  0 part
└─sda15   8:15   0  124M  0 part /boot/efi
sdb       8:16   0  100G  0 disk
sdc       8:32   0  100G  0 disk
sdd       8:48   0  100G  0 disk
sde       8:64   0  100G  0 disk

 

 

 

Notice the 4 disks of 100 GB.

 

These are the ones that we will combine into a single logical volume.

Before we can do that, we need to create a partition table on the disks.

We do that using the cfdisk command:

 

 

 ~# cfdisk /dev/sdb

 

 

 

We will just create a single partition utilizing all the space on the disk:

 

 

                                 Disk: /dev/sdb
              Size: 100 GiB, 107374182400 bytes, 209715200 sectors
          Label: gpt, identifier: FBFC3E10-FC1A-8647-96BD-618BBC662E38

    Device           Start          End      Sectors     Size Type
>>  /dev/sdb1         2048    209715166    209713119     100G Linux filesystem



 ┌─────────────────────────────────────────────────────────────────────────────┐
 │Partition UUID: 9C66A040-5826-3147-BDDF-1456609BB07D                         │
 │Partition type: Linux filesystem (0FC63DAF-8483-4772-8E79-3D69D8477DE4)      │
 └─────────────────────────────────────────────────────────────────────────────┘
     [ Delete ]  [ Resize ]  [  Quit  ]  [  Type  ]  [  Help  ]  [  Write ]
     [  Dump  ]
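
As a side note, the same single full-disk partition can also be created non-interactively, which is handy when scripting this across many disks. A minimal sketch using sfdisk's script input, where ',,L' means default start, default size (the rest of the disk), and partition type Linux filesystem:

~# printf 'label: gpt\n,,L\n' | sfdisk /dev/sdb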

The lsblk command shows that the partition has been created:

 

 

 

~# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda       8:0    0   30G  0 disk
├─sda1    8:1    0 29.9G  0 part /
├─sda14   8:14   0    3M  0 part
└─sda15   8:15   0  124M  0 part /boot/efi
sdb       8:16   0  100G  0 disk
└─sdb1    8:17   0  100G  0 part
sdc       8:32   0  100G  0 disk
sdd       8:48   0  100G  0 disk
sde       8:64   0  100G  0 disk

 

 

 

 

We will now use sfdisk to copy the partition table onto the remaining 3 disks:

 

 

 

~# for disk in /dev/sd[c-e]; do sfdisk -d /dev/sdb | sfdisk --force $disk; done
Checking that no-one is using this disk right now ... OK

Disk /dev/sdc: 100 GiB, 107374182400 bytes, 209715200 sectors
Disk model: Virtual Disk
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new GPT disklabel (GUID: FBFC3E10-FC1A-8647-96BD-618BBC662E38).
/dev/sdc1: Created a new partition 1 of type 'Linux filesystem' and of size 100 GiB.
/dev/sdc2: Done.

New situation:
Disklabel type: gpt
Disk identifier: FBFC3E10-FC1A-8647-96BD-618BBC662E38

Device     Start       End   Sectors  Size Type
/dev/sdc1   2048 209715166 209713119  100G Linux filesystem

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
Checking that no-one is using this disk right now ... OK

Disk /dev/sdd: 100 GiB, 107374182400 bytes, 209715200 sectors
Disk model: Virtual Disk
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new GPT disklabel (GUID: FBFC3E10-FC1A-8647-96BD-618BBC662E38).
/dev/sdd1: Created a new partition 1 of type 'Linux filesystem' and of size 100 GiB.
/dev/sdd2: Done.

New situation:
Disklabel type: gpt
Disk identifier: FBFC3E10-FC1A-8647-96BD-618BBC662E38

Device     Start       End   Sectors  Size Type
/dev/sdd1   2048 209715166 209713119  100G Linux filesystem

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
Checking that no-one is using this disk right now ... OK

Disk /dev/sde: 100 GiB, 107374182400 bytes, 209715200 sectors
Disk model: Virtual Disk
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Script header accepted.
>>> Created a new GPT disklabel (GUID: FBFC3E10-FC1A-8647-96BD-618BBC662E38).
/dev/sde1: Created a new partition 1 of type 'Linux filesystem' and of size 100 GiB.
/dev/sde2: Done.

New situation:
Disklabel type: gpt
Disk identifier: FBFC3E10-FC1A-8647-96BD-618BBC662E38

Device     Start       End   Sectors  Size Type
/dev/sde1   2048 209715166 209713119  100G Linux filesystem

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
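
Note in passing that dumping and replaying the partition table like this also copies the disk identifier, which is why the copied disks report the same GPT GUID as /dev/sdb above. For LVM striping this is harmless, but if you want unique identifiers you could randomize them afterwards, for example with sgdisk from the gdisk package (an optional extra step, not required for this setup):

~# sgdisk -G /dev/sdc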

 

 

We will verify using lsblk:

 

 

 

~# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda       8:0    0   30G  0 disk
├─sda1    8:1    0 29.9G  0 part /
├─sda14   8:14   0    3M  0 part
└─sda15   8:15   0  124M  0 part /boot/efi
sdb       8:16   0  100G  0 disk
└─sdb1    8:17   0  100G  0 part
sdc       8:32   0  100G  0 disk
└─sdc1    8:33   0  100G  0 part
sdd       8:48   0  100G  0 disk
└─sdd1    8:49   0  100G  0 part
sde       8:64   0  100G  0 disk
└─sde1    8:65   0  100G  0 part

 

 

 

 

Now we need to make these partitions available to LVM. We do that by initializing them as physical volumes:

 

 

~# pvcreate /dev/sd[b-e]1
  Physical volume "/dev/sdb1" successfully created.
  Physical volume "/dev/sdc1" successfully created.
  Physical volume "/dev/sdd1" successfully created.
  Physical volume "/dev/sde1" successfully created.

 

 

 

We can list available physical volumes with the ‘pvs’ command:

 

 

~# pvs
  PV         VG Fmt  Attr PSize    PFree
  /dev/sdb1     lvm2 ---  <100.00g <100.00g
  /dev/sdc1     lvm2 ---  <100.00g <100.00g
  /dev/sdd1     lvm2 ---  <100.00g <100.00g
  /dev/sde1     lvm2 ---  <100.00g <100.00g

 

 

 

Our logical volume must reside in a volume group, so we now create a volume group containing the newly created physical volumes:

 

 

 

~# vgcreate vg1 /dev/sd[b-e]1

  Volume group "vg1" successfully created

 

 

 

 

We can list volume groups using the ‘vgs’ command:

 

 

 

~# vgs

  VG  #PV #LV #SN Attr   VSize   VFree

  vg1   4   0   0 wz--n- 399.98g 399.98g

 

 

 

We are now ready to create the actual logical volume. We will create it with 4 stripes, each with a stripe size of 256 KiB:

 

 

~# lvcreate -i 4 -I 256k -n lv1 -L 399G vg1

  Logical volume "lv1" created.

 

 

 

 

We can list logical volumes using the ‘lvs’ command:

 

 

~# lvs
  LV   VG  Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  lv1  vg1 -wi-a----- 399.00g
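
To double-check that the logical volume really is striped across 4 physical volumes with a 256 KiB stripe size, you can ask lvs for the stripe layout as extra columns; a quick sketch (exact field names may vary slightly between LVM versions):

~# lvs -o +stripes,stripe_size vg1/lv1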

 

 

 

Before we can mount and use the logical volume, we need to format it with a file system. I am using the XFS file system:

 

 

~# mkfs.xfs /dev/vg1/lv1

meta-data=/dev/vg1/lv1           isize=512    agcount=16, agsize=6537152 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0
data     =                       bsize=4096   blocks=104594432, imaxpct=25
         =                       sunit=64     swidth=256 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=51071, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
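
Note that mkfs.xfs picked up the stripe geometry from the logical volume automatically: sunit=64 blocks of 4096 bytes matches the 256 KiB stripe size, and swidth=256 blocks matches the full 4-disk stripe width, so file system allocations are aligned with the underlying stripes.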

We are now ready to mount and use the logical volume:

 

 

 ~# mount /dev/vg1/lv1 /saswork/
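
If the /saswork mount point does not exist yet, create it with 'mkdir -p /saswork' before mounting. To make the mount persistent across reboots, you can also add an /etc/fstab entry; a minimal sketch (the noatime option is just a common choice, adjust to taste):

~# mkdir -p /saswork
~# echo '/dev/vg1/lv1  /saswork  xfs  defaults,noatime  0 0' >> /etc/fstab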

 

 

 

Test write speed:

 

 

~# time head -c 200G /dev/zero | pv > /saswork/zero.dat
 200GiB 0:07:50 [ 435MiB/s] [                           <=>        ]

real    7m50.776s
user    0m13.978s
sys     3m38.363s

 

 

 

We just wrote 200 GB of data with a throughput of 435 MB/s. Not bad.

Let’s see how fast we can read the file.

First, we drop all caches to ensure that we actually read the file from the storage layer and not just from the file system cache in RAM.

 

 

~# sync ; echo 3 > /proc/sys/vm/drop_caches
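
For reference, writing 1 to drop_caches frees the page cache, 2 frees reclaimable dentries and inodes, and 3 frees both; we use 3 to make sure no part of the test file is still cached in RAM.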

 

 

 

Then we measure read throughput:

 

 

~# time pv /saswork/zero.dat > /dev/null
 200GiB 0:08:00 [ 426MiB/s] [=================================>] 100%


real    8m0.308s
user    0m0.861s
sys     1m16.904s

 

 

 

We read the 200 GB file at 426 MB/s.
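
The head/pv pipeline gives a quick sequential throughput number. If you want a more controlled benchmark, a tool such as fio can run an equivalent sequential test with direct I/O; a minimal sketch with assumed job parameters (file name, size, and block size are placeholders, and --rw=read gives the corresponding read test):

~# fio --name=seqwrite --filename=/saswork/fio.dat --rw=write --bs=1M \
     --size=50G --ioengine=libaio --direct=1 --numjobs=1 --group_reporting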

 

For comparison, on a single storage volume I was able to read and write at 153 MB/s and 167 MB/s, respectively (data below).

The table below summarizes the results:

 

Throughput                               Read       Write
Single volume                            153 MB/s   167 MB/s
Logical volume striped over 4 volumes    426 MB/s   435 MB/s
Speed increase                           178 %      160 %

 

 

Terminal log for the single-volume comparison below:

 

(root@vm04) (2022-11-22 14:50:39) [0]
~# lvremove vg1
Do you really want to remove active logical volume vg1/lv1? [y/n]: y
  Logical volume "lv1" successfully removed
(root@vm04) (2022-11-22 14:50:50) [0]
~# vgremove vg1
  Volume group "vg1" successfully removed
(root@vm04) (2022-11-22 14:50:55) [0]
~# pvremove /dev/sd[b-e]1
  Labels on physical volume "/dev/sdb1" successfully wiped.
  Labels on physical volume "/dev/sdc1" successfully wiped.
  Labels on physical volume "/dev/sdd1" successfully wiped.
  Labels on physical volume "/dev/sde1" successfully wiped.
(root@vm04) (2022-11-22 14:51:13) [0]
~# lsblk
NAME    MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda       8:0    0   30G  0 disk
├─sda1    8:1    0 29.9G  0 part /
├─sda14   8:14   0    3M  0 part
└─sda15   8:15   0  124M  0 part /boot/efi
sdb       8:16   0  100G  0 disk
└─sdb1    8:17   0  100G  0 part
sdc       8:32   0  100G  0 disk
└─sdc1    8:33   0  100G  0 part
sdd       8:48   0  100G  0 disk
└─sdd1    8:49   0  100G  0 part
sde       8:64   0  100G  0 disk
└─sde1    8:65   0  100G  0 part
(root@vm04) (2022-11-22 14:51:19) [0]
~# mkfs.xfs /dev/sdc1
meta-data=/dev/sdc1              isize=512    agcount=4, agsize=6553535 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=0
data     =                       bsize=4096   blocks=26214139, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=12799, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
(root@vm04) (2022-11-22 14:51:31) [0]
~# mount /dev/sdc1 /saswork/
(root@vm04) (2022-11-22 14:51:38) [0]
~# time head -c 90G /dev/zero | pv > /saswork/zero.dat
90.0GiB 0:09:09 [ 167MiB/s] [       <=>]

real    9m9.727s
user    0m7.200s
sys     1m53.040s
(root@vm04) (2022-11-22 15:01:05) [0]
~# sync ; echo 3 > /proc/sys/vm/drop_caches
(root@vm04) (2022-11-22 15:01:23) [0]
~# time pv /saswork/zero.dat > /dev/null
90.0GiB 0:10:01 [ 153MiB/s] [================================================================================>] 100%

real    10m1.976s
user    0m0.522s
sys     0m38.955s
