I wanted to share an issue (I may be misunderstanding some concepts) that I'm facing with some benchmarks I'm running against XFS setups. We are about to migrate a service to a new instance and would like to get the maximum number of IOPS possible.
We have a Gitolite instance that currently runs on a 500GB io1 volume (25K IOPS). We would like to move this service to a new instance, and I was considering improving the underlying filesystem layout at the same time.
At the moment the filesystem on the instance is XFS on top of LVM on that single volume.
I have been doing some benchmarks on moving the service to an instance with:
- 8 volumes of 50GB – 2,500 IOPS each
These 8 volumes are combined into the same LVM volume group in a striped configuration. The commands I used to create this striped setup are:
## Create the LVM PVs:
$ pvcreate /dev/nvme(12345678)n1
## Create the volume group:
$ vgcreate test_vol /dev/nvme(12345678)n1
## Create the stripe configuration:
$ lvcreate --extents 100%FREE --stripes 8 --stripesize 256 --name test test_vol
## XFS-format the new logical volume:
$ mkfs.xfs /dev/mapper/test_vol-test -f
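To double-check that the stripe geometry ended up as intended, something along these lines should work (the lvs output fields and the /test mount point are assumptions on my side):

## Show the LV segments with stripe count and stripe size:
$ lvs --segments -o +stripe_size,devices test_vol
## After mounting, confirm mkfs.xfs picked up the stripe geometry (sunit/swidth):
$ xfs_info /test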
And that should be it. Now, benchmarks.
Running this fio test over the striped logical volume:
fio --name=randwrite --ioengine=libaio --iodepth=2 --rw=randwrite --bs=4k --size=400G --numjobs=8 --runtime=300 --group_reporting --filename=/test/testfile --fallocate=none
Shows the following report:
Jobs: 8 (f=8): [w(8)][100.0%][w=137MiB/s][w=35.1k IOPS][eta 00m:00s]
randwrite: (groupid=0, jobs=8): err= 0: pid=627615: Wed Nov 25 13:15:33 2020
  write: IOPS=23.0k, BW=93.7MiB/s (98.2MB/s)(27.4GiB/300035msec); 0 zone resets
    slat (usec): min=2, max=132220, avg=141.07, stdev=2149.78
    clat (usec): min=3, max=132226, avg=143.46, stdev=2150.25
Which is not bad at all, but executing the very same fio benchmark on another instance with a single 500GB volume (25K IOPS) shows:
Jobs: 8 (f=8): [w(8)][100.0%][w=217MiB/s][w=55.6k IOPS][eta 00m:00s]
randwrite: (groupid=0, jobs=8): err= 0: pid=11335: Wed Nov 25 12:54:57 2020
  write: IOPS=48.2k, BW=188MiB/s (198MB/s)(55.2GiB/300027msec); 0 zone resets
    slat (usec): min=2, max=235750, avg=130.69, stdev=1861.69
Which is a far better result than the striped setup.
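For reference on the test parameters: with --numjobs=8 and --iodepth=2, both runs keep at most 16 writes in flight in total. A deeper-queue variant of the same command (only --iodepth changed, and the value of 16 is just an arbitrary example) would look like:

$ fio --name=randwrite --ioengine=libaio --iodepth=16 --rw=randwrite --bs=4k --size=400G --numjobs=8 --runtime=300 --group_reporting --filename=/test/testfile --fallocate=none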
We are going to use this instance to host an internal Git server, so I was assuming that a striped setup would perform much better than an instance with a single volume, but these benchmarks show that the best setup (in terms of IOPS/bandwidth) is the one with the single disk.
Am I assuming anything wrong? Would the striped setup work better for random writes (i.e. not run out of IOPS)?