In this article I would like to compare the performance of the currently popular file systems BTRFS, XFS and EXT4 for typical daily developer workloads: no random writes within a single file, and lots of small files in a build directory. What is the best choice? Does it change with different hardware: SATA SSD, NVMe SSD, mechanical HDD?
BTRFS also comes with important features such as compression. However, file-system compression is in most cases pointless nowadays: it saves little space and does not improve I/O performance, because most files on a PC/laptop are already compressed (images, PDFs, media, xls/doc, HDF, ccache, read-only databases, ...). I think only a very large build directory really benefits from file-system compression. A git repository already compresses its own data, so if the .git directory is much larger than the source tree, compressing the whole source directory does not make much sense either. Fortunately, btrfs is smart enough to decide which files are worth compressing (see the mount-option sketch below).
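For reference, compression in btrfs is enabled per mount; with plain `compress=` (unlike `compress-force=`) the file system applies a heuristic and skips data that does not compress well. A minimal sketch, with device and mount point as placeholders:

```sh
# let btrfs decide per file whether compression is worthwhile (heuristic)
mount -o compress=zstd:1 /dev/sdX1 /mnt/build

# or force compression of everything, bypassing the heuristic
mount -o compress-force=zstd:1 /dev/sdX1 /mnt/build
```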
The result is rather surprising: the performance varies a lot depending on the hardware and on the kind of data, and the worst choice can be any of the candidates. The best choice, however, stays fairly consistent.
- write test: `cp -a <src> <dest> && sync`. Before each write test I delete the old items in the mount point and `fstrim` it.
- tar read test: `tar -c <dest>/<data> | pv -f > /dev/null`
- cp read test: `cp -a <dest> /tmp/<test root>`. Before each read test the partition is remounted to drop the system cache (a sketch of one full round follows this list).
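A rough sketch of one benchmark round, assuming the file system under test is mounted at /mnt/test and the RAM disk at /tmp/test-root (paths and device names are placeholders, not the exact script used for the numbers below):

```sh
DEV=/dev/sdX1          # partition under test (placeholder)
MNT=/mnt/test          # its mount point
SRC=~/builds/ffmpeg    # data set to copy

# clean up leftovers from the previous round and trim the device
rm -rf "$MNT/data" && fstrim "$MNT"

# write test: copy and wait until everything is flushed to disk
time (cp -a "$SRC" "$MNT/data" && sync)

# remount to drop the page cache before the first read test
umount "$MNT" && mount "$DEV" "$MNT"

# tar read test: stream the tree through pv and discard it
tar -c "$MNT/data" | pv -f > /dev/null

# remount again, then cp read test into the RAM disk
umount "$MNT" && mount "$DEV" "$MNT"
cp -a "$MNT/data" /tmp/test-root/
```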
For a simple benchmark I used several ffmpeg and libboost build dirs as examples. First, the compression statistics for the ffmpeg build dir:
LZO

```
Processed 1541 files, 20427 regular extents (20427 refs), 542 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       47%      1.2G         2.6G         2.6G
none       100%      306M         306M         306M
lzo         41%      970M         2.3G         2.3G
```

ZLIB

```
Processed 1541 files, 20399 regular extents (20399 refs), 632 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       36%      964M         2.6G         2.6G
none       100%      294M         294M         294M
zlib        28%      670M         2.3G         2.3G
```

ZSTD:1

```
Processed 1541 files, 20399 regular extents (20399 refs), 632 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       34%      917M         2.6G         2.6G
none       100%      294M         294M         294M
zstd        26%      622M         2.3G         2.3G
```

ZSTD:3

```
Processed 1541 files, 20399 regular extents (20399 refs), 632 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       34%      917M         2.6G         2.6G
none       100%      294M         294M         294M
zstd        26%      622M         2.3G         2.3G
```
The libboost build dir (149k files / 2 GiB) contains lots of small files. The apparent size reported by du is 1.6 GB, but the actual on-disk size is 2.0 GB, implying that many files are smaller than the 4 KiB block size:
LZO

```
Processed 148537 files, 60482 regular extents (60482 refs), 92581 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       37%      647M         1.6G         1.6G
none       100%      4.3M         4.3M         4.3M
lzo         37%      643M         1.6G         1.6G
```

ZLIB

```
Processed 148537 files, 59862 regular extents (59862 refs), 93201 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       24%      417M         1.6G         1.6G
none       100%      228K         228K         228K
zlib        24%      416M         1.6G         1.6G
```

ZSTD:1

```
Processed 148537 files, 59862 regular extents (59862 refs), 93201 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       22%      392M         1.6G         1.6G
none       100%      228K         228K         228K
zstd        22%      392M         1.6G         1.6G
```

ZSTD:3

```
Processed 148537 files, 59862 regular extents (59862 refs), 93201 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       22%      392M         1.6G         1.6G
none       100%      228K         228K         228K
zstd        22%      392M         1.6G         1.6G
```
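The tables above match the output format of the compsize tool; on a btrfs mount, figures like these can be gathered along these lines (the path is a placeholder):

```sh
# on-disk vs. uncompressed size per compression type
sudo compsize /mnt/test/data

# apparent size vs. allocated size (the 1.6 GB vs 2.0 GB check above)
du -sh --apparent-size /mnt/test/data
du -sh /mnt/test/data
```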
So ZSTD compresses the data pretty well. LZO compresses less, but its ~50% ratio is still decent. I also noticed that ZLIB puts huge pressure on the CPU and is very slow, so I dropped it from the following tests.
I also tested the write/read speed via a simple `cp -a <source> <dest>` on an Intel Core i7-9750H machine.
Write test to each file system on the same USB SSD (the source files are already cached in memory):
| dest | time (sec) |
|---|---|
| xfs | 1.1 |
| btrfs | 0.97 |
| btrfs zstd:1 | 0.94 |
We can see the speeds are almost the same. This indicates that the write-side penalty of zstd is quite large: it actually writes only about 1/3 of the data to disk, yet the elapsed time hardly improves.
Read test from the above file systems to a RAM disk; the system cache is dropped before each copy command.
| source | time (sec) | peak speed (M/s) |
|---|---|---|
| xfs | 7.7 | 366.1 |
| btrfs | 6.7 | 379.0 |
| btrfs zstd:1 | 2.9 | 338.9 |
The peak speed is obtained via iostat. With zstd:1 the raw device throughput is a bit lower, but the drop is much smaller than in the write test. The high compression ratio more than compensates and yields the fastest overall read: about 130% faster than the non-compressed case, since only roughly 1/3 of the original size has to be read.
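The peak-speed column can be captured by running iostat next to the copy; a minimal sketch (interval and flags are just one reasonable choice):

```sh
# terminal 1: extended per-device statistics in MB/s, refreshed every second
iostat -xm 1

# terminal 2: the actual read test
cp -a /mnt/test/data /tmp/test-root/
```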
I did not test other combinations or zlib, since zlib is well known to be slower than zstd:1.
However, this test has a flaw: I did not sync after cp, so the data may not have been fully written to disk by the time the cp command returned. In the following, more thorough tests I use `cp xxx && sync` as the write-speed test.
Partition layout of the USB SATA SSD under test:

```
/dev/sdd:480103981056B:scsi:512:512:gpt:INTEL SSDSC2BW480A4:;
1:17408B:1073741823B:1073724416B:free;
1:1073741824B:27917287423B:26843545600B:ext4:usb ssd test ext4:;
3:27917287424B:54760833023B:26843545600B:btrfs:usb ssd test btrfs:;
5:54760833024B:81604378623B:26843545600B:xfs:usb ssd test xfs:;
```
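For reference, the three test partitions can be formatted with the stock mkfs tools; a sketch using the partition numbers from the parted listing above (not necessarily the exact commands used here):

```sh
mkfs.ext4     /dev/sdd1   # "usb ssd test ext4"
mkfs.btrfs -f /dev/sdd3   # "usb ssd test btrfs"
mkfs.xfs   -f /dev/sdd5   # "usb ssd test xfs"
```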
This disk has about 300 MiB/s sequential write speed and 500 MiB/s sequential read speed.
Writing 1.4k files totalling 2.7 GiB gave similar write speeds on all file systems as writing a single big file. Non-compressed btrfs has slightly lower write and read performance than xfs and ext4. Compressed btrfs performs much better: zstd:3 is the best with 55% more write speed and 133% more read speed, followed by zstd:1 with a 35%/133% increase. LZO improves less but still gives 43% more write speed and 79% more read speed.
| index | file system | disk usage | write time (s) | write MiB/s | write rate (rel. ext4) | tar to file read MiB/s | tar read rate (rel. ext4) | cp read MiB/s | cp read rate (rel. ext4) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | btrfs.none | 0.000147705 | 9.40651 | 283.958 | 0.965568 | 365.276 | 1.07789 | 344.891 | 0.958941 |
| 5 | xfs.0 | 0.112771 | 9.38198 | 284.701 | 0.968093 | 341.466 | 1.00763 | 363.048 | 1.00942 |
| 2 | ext4.0 | 1.40477e-06 | 9.2048 | 290.181 | 0.986728 | 339.391 | 1.00151 | 360.402 | 1.00207 |
| 3 | xfs.1 | 0.00823492 | 9.1085 | 293.249 | 0.997159 | 338.185 | 0.997949 | 363.603 | 1.01097 |
| 6 | ext4.1 | 0.106864 | 8.96366 | 297.987 | 1.01327 | 338.369 | 0.998491 | 358.914 | 0.997932 |
| 1 | btrfs.zstd-1 | 0.104891 | 6.71203 | 397.95 | 1.35319 | 253.149 | 0.747018 | 838.609 | 2.33168 |
| 4 | btrfs.lzo | 0.141149 | 6.34291 | 421.109 | 1.43193 | 267.076 | 0.788114 | 643.236 | 1.78846 |
| 7 | btrfs.zstd-3 | 0.191489 | 5.83131 | 458.054 | 1.55756 | 250.175 | 0.738242 | 853.062 | 2.37187 |
Interestingly, the tar read test shows a completely different pattern: non-compressed btrfs gives the best result, while all compressed btrfs variants are about 30% slower.
Writing the 149k-file / 2 GB data set is much slower on every file system: roughly 30% of the normal write speed and 10% of the normal read speed. XFS seems to perform worst when many small files are involved. btrfs in general performs better here (>18% write and 21% read improvement over ext4). Unlike the 1.4k-file test, any of the three compression modes can come out on top from run to run, giving about 35% more write speed than ext4, and the non-compressed run is close behind the best one.
The read speeds differ, though. All three compressed modes read at almost the same speed, about 60% faster than ext4, while the non-compressed one is a bit slower at only 22% above ext4. In this case the tar read speeds follow the same trend as the cp read speeds, although tar is about 20% faster than cp on compressed btrfs.
| index | file system | disk usage | write time (s) | write MiB/s | write rate (rel. ext4) | tar to file read MiB/s | tar read rate (rel. ext4) | cp read MiB/s | cp read rate (rel. ext4) |
|---|---|---|---|---|---|---|---|---|---|
| 4 | xfs.0 | 0.0879229 | 13.7472 | 118.552 | 0.914611 | 47.178 | 0.962071 | 47.7707 | 0.978131 |
| 1 | xfs.1 | 0.00823614 | 13.5625 | 120.166 | 0.927067 | 47.6865 | 0.97244 | 47.3504 | 0.969526 |
| 5 | ext4.0 | 0.079824 | 12.6361 | 128.977 | 0.995037 | 49.383 | 1.00704 | 49.2813 | 1.00906 |
| 2 | ext4.1 | 1.40477e-06 | 12.5113 | 130.263 | 1.00496 | 48.693 | 0.992965 | 48.3962 | 0.990938 |
| 6 | btrfs.lzo | 0.106342 | 11.269 | 144.622 | 1.11574 | 88.2133 | 1.79888 | 77.5018 | 1.58689 |
| 7 | btrfs.zstd-3 | 0.143783 | 10.3497 | 157.469 | 1.21485 | 85.4793 | 1.74312 | 80.4497 | 1.64725 |
| 3 | btrfs.none | 0.0261482 | 10.1429 | 160.68 | 1.23962 | 64.454 | 1.31437 | 59.4803 | 1.21789 |
| 0 | btrfs.zstd-1 | 0.000147705 | 9.30068 | 175.23 | 1.35187 | 92.5516 | 1.88734 | 79.568 | 1.6292 |
A second run of the same many-small-files test:

| index | file system | disk usage | write time (s) | write MiB/s | write rate (rel. ext4) | tar to file read MiB/s | tar read rate (rel. ext4) | cp read MiB/s | cp read rate (rel. ext4) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | xfs.1 | 0.00823492 | 14.5515 | 111.999 | 0.873386 | 48.1493 | 0.992781 | 49.1477 | 1.02168 |
| 5 | xfs.0 | 0.0879229 | 13.7274 | 118.723 | 0.925814 | 46.7736 | 0.964414 | 48.3584 | 1.00527 |
| 2 | ext4.1 | 0.079824 | 12.9215 | 126.128 | 0.98356 | 48.5128 | 1.00027 | 48.6915 | 1.0122 |
| 0 | ext4.0 | 1.40477e-06 | 12.5035 | 130.344 | 1.01644 | 48.4862 | 0.999726 | 47.518 | 0.987803 |
| 6 | btrfs.zstd-1 | 0.117458 | 12.0904 | 134.798 | 1.05117 | 92.4985 | 1.90721 | 79.8751 | 1.66044 |
| 7 | btrfs.zstd-3 | 0.143814 | 10.358 | 157.342 | 1.22697 | 85.3323 | 1.75945 | 80.1055 | 1.66523 |
| 3 | btrfs.none | 0.000147705 | 9.4264 | 172.893 | 1.34824 | 65.216 | 1.34467 | 58.5836 | 1.21784 |
| 4 | btrfs.lzo | 0.0801518 | 9.07413 | 179.605 | 1.40058 | 86.4693 | 1.78289 | 76.8859 | 1.5983 |
I then tested on an almost fresh NVMe SSD:
```
/dev/nvme0n1:512110190592B:nvme:512:512:gpt:INTEL SSDPEKNW512G8:;
6:296470183936B:350157275135B:53687091200B:xfs:linux-nvme-data:;
7:350157275136B:377000820735B:26843545600B:btrfs:btrfs test:;
8:377000820736B:403844366335B:26843545600B:ext4:ext4test:;
```
For the 2.7 GiB ffmpeg build dir containing 1.5k files, the speeds are all above 500 MB/s, close to the single-large-file copy speed (~600-700 MB/s). I ran the test several times with randomly shuffled test orders; typical results are shown in the following table. LZO always has the fastest write speed, even though its compression ratio is only 47%. ZSTD:1 compresses much better (34%) but seems to pay a big penalty for it. Unlike lzo/zstd, ZLIB showed very high CPU usage and was very slow in the preliminary tests, so I did not include it in the repeated runs. The ext4 and xfs speeds from this test are somewhat unstable, sometimes dropping to 100-200 MB/s, but in most cases they are around 500-700 MB/s.
| file system | disk usage | write time (s) | write MiB/s | tar to file read MiB/s | cp read MiB/s |
|---|---|---|---|---|---|
| btrfs.zstd-1 | 0.401214 | 3.14575 | 849.1 | 489.592 | 1305.49 |
| btrfs.lzo | 0.437475 | 1.91317 | 1396.14 | 589.98 | 1280.35 |
| btrfs.none | 0.487829 | 3.29185 | 811.416 | 1216.99 | 1153.71 |
| ext4.0 | 0.40917 | 4.48444 | 595.628 | 1093.44 | 1213.11 |
| ext4.1 | 0.516032 | 4.4532 | 599.806 | 1100.46 | 1186.14 |
| xfs.0 | 0.0076032 | 3.88385 | 687.734 | 1190.83 | 1292.62 |
| xfs.1 | 0.0598712 | 5.31797 | 502.27 | 1182.8 | 1271.12 |
For the 2.0 GiB boost build dir containing 149k files, the I/O speeds are far lower, around 200 MB/s in all cases. btrfs.lzo and non-compressed btrfs are only slightly faster than ext4 and xfs during writing, but zstd:1 is much slower than the rest, at only 60-70% of the non-compressed btrfs speed. The read speed of all btrfs runs is about 20% higher than ext4 and xfs, which may indicate that btrfs handles small files better. The compression ratio does not affect the I/O much here because the bottleneck is the number of files rather than their size, even though LZO and ZSTD:1 achieve 37% and 22% ratios respectively.
| file system | disk usage | write time (s) | write MiB/s | tar to file read MiB/s | cp read MiB/s |
|---|---|---|---|---|---|
| btrfs.lzo | 0.401216 | 8.13217 | 200.408 | 163.573 | 124.486 |
| btrfs.none | 0.437983 | 7.60417 | 214.324 | 146.083 | 120.017 |
| btrfs.zstd-1 | 0.518031 | 13.0737 | 124.659 | 153.742 | 121.545 |
| ext4.0 | 0.40917 | 8.51435 | 191.413 | 96.7245 | 107.22 |
| ext4.1 | 0.488992 | 8.64948 | 188.422 | 96.3132 | 107.352 |
| xfs.0 | 0.0076032 | 8.57081 | 190.152 | 110.243 | 102.137 |
| xfs.1 | 0.0474472 | 8.57814 | 189.989 | 110.808 | 100.598 |
An interesting observation is that the tar read speed and the cp read speed differ considerably in both the 1.5k/2.7 GiB and the 149k/2 GiB tests, but in opposite directions. In the 1.5k/2.7 GiB case, LZO/ZSTD:1 show a ~50% drop in the tar read compared both to their own cp reads and to the tar reads of the other file systems. This indicates the file-reading patterns of tar and cp differ in some fundamental way and are worth further study.
In the 149k/2 GiB case the compressed file systems show no drop in the tar tests; instead, the tar tests on btrfs are ~25% faster than the cp tests. A possible reason is that creating a large number of files, even on a RAM disk, is still costly.
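For completeness, the RAM disk used as the cp target is just a tmpfs mount, along these lines (size and path are placeholders):

```sh
# RAM-backed target directory for the cp read test
mkdir -p /tmp/test-root
mount -t tmpfs -o size=4G tmpfs /tmp/test-root
```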
Finally, the partition layout of the mechanical HDD (a 5 TB Seagate One Touch):

```
/dev/sdc:5000981077504B:scsi:512:4096:gpt:Seagate One Touch HDD:;
1:17408B:107374182399B:107374164992B:free;
1:107374182400B:134217727999B:26843545600B:ext4:speed test ext4:;
2:134217728000B:161061273599B:26843545600B:xfs:speed test xfs:;
3:161061273600B:187904819199B:26843545600B:btrfs:speed test btrfs:;
```
The HDD tests were run with the benchmark script, e.g.:

```
./fs-user-benchmark.py --no-fstrim --source ~/.cget/cache/builds/ffmpeg -t '{"root_path":"/media/dracula/hddxfstest/test", "postfix":"0"}' -t '{"root_path":"/media/dracula/hddext4test/test", "postfix":"0"}' -t '{"root_path":"/media/dracula/hddxfstest/test", "postfix":"1"}' -t '{"root_path":"/media/dracula/hddext4test/test", "postfix":"1"}' -t '{"root_path":"/media/dracula/hddbtrfstest/test", "compress_type":"none"}' -t '{"root_path":"/media/dracula/hddbtrfstest/test", "compress_type":"lzo"}' -t '{"root_path":"/media/dracula/hddbtrfstest/test", "compress_type":"zstd:1"}' -t '{"root_path":"/media/dracula/hddbtrfstest/test", "compress_type":"zstd:3"}'
```
Here are typical results for the 1.4k-file / 2.7 GB build folder. btrfs is generally faster than xfs and ext4, and ext4 is the slowest. zstd and lzo have a huge advantage over non-compressed btrfs, because here the bottleneck is the raw disk I/O, so writing less data matters a lot. ZSTD:3 puts some extra pressure on the CPU but does not offer much advantage over ZSTD:1.
| file system | disk usage | write time (s) | write MiB/s | tar to file read MiB/s | cp read MiB/s |
|---|---|---|---|---|---|
| ext4.1 | 0.106865 | 32.4621 | 82.2824 | 76.1464 | 83.3907 |
| xfs.1 | 0.112776 | 25.8475 | 103.339 | 98.3266 | 99.6622 |
| ext4.0 | 0.213728 | 29.8597 | 89.4537 | 72.7791 | 79.1887 |
| xfs.0 | 0.217307 | 25.2035 | 105.98 | 97.814 | 99.3223 |
| btrfs.lzo | 0.000147705 | 15.1674 | 176.105 | 193.557 | 195.507 |
| btrfs.none | 0.0505009 | 22.912 | 116.579 | 116.823 | 118.195 |
| btrfs.zstd-1 | 0.155232 | 11.5432 | 231.397 | 183.332 | 251.207 |
| btrfs.zstd-3 | 0.191588 | 11.2101 | 238.273 | 170.086 | 297.061 |
Writing the 149k/2 GB data set to the mechanical HDD is extremely slow for xfs and for non-compressed btrfs. The non-compressed btrfs speed drops to about 1 MB/s, which makes it almost unusable; zstd is roughly 100x faster than the non-compressed variant. ext4 is about 5x faster than xfs and 45x faster than non-compressed btrfs. lzo only reaches about 50% of the zstd:1 speed here, and zstd:3 is about 10% slower than zstd:1 as well.
Reading from the HDD is even slower. btrfs in general is much faster than ext4, and xfs is the slowest at only about 7-8 MiB/s.
| file system | disk usage | write time (s) | write MiB/s | tar to file read MiB/s | cp read MiB/s |
|---|---|---|---|---|---|
| xfs.0 | 0.112778 | 171.254 | 9.51661 | 8.53608 | 7.72498 |
| btrfs.zstd-1 | 0.000147705 | 13.399 | 121.632 | 34.2228 | 30.4638 |
| ext4.0 | 0.106865 | 36.2998 | 44.8971 | 7.45604 | 11.5033 |
| xfs.1 | 0.192459 | 189.117 | 8.6177 | 8.44728 | 7.53305 |
| btrfs.none | 0.0261714 | 1394.19 | 1.16896 | 12.4196 | 20.2212 |
| ext4.1 | 0.186687 | 69.3157 | 23.5121 | 7.20049 | 11.8862 |
| btrfs.lzo | 0.10633 | 26.8329 | 60.7373 | 19.6596 | 23.9906 |
| btrfs.zstd-3 | 0.143793 | 14.9673 | 108.888 | 32.3755 | 22.2033 |
For normal build dirs:
| hardware | best | 2nd | 3rd | worst |
|---|---|---|---|---|
| USB3-SATA-SSD | btrfs zstd:1 | btrfs zstd:3 | btrfs lzo | btrfs |
| NVME-SSD | btrfs lzo | btrfs zstd:1 | btrfs | xfs |
| HDD | btrfs zstd:3 | btrfs zstd:1 | btrfs lzo | ext4 |
For build dirs containing a large number of small files:
| hardware | best | 2nd | 3rd | worst |
|---|---|---|---|---|
| USB3-SATA-SSD | btrfs zstd:3 | btrfs lzo | btrfs zstd:1 | xfs |
| NVME-SSD | btrfs lzo | btrfs | ext4 | btrfs zstd:1 |
| HDD | btrfs zstd:1 | btrfs zstd:3 | btrfs lzo | btrfs |
- SATA-SSD: I recommend btrfs zstd:3, but any of these file systems seems OK, though XFS is the worst choice.
- NVME-SSD: stick to btrfs lzo; it is much faster than the rest, although the others are still usable.
- HDD: I recommend btrfs zstd:1. Non-compressed btrfs should definitely be avoided here, because it is nearly unusable: about 1 MiB/s in the many-small-files case. (An example fstab entry for enabling compression follows this list.)
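To make one of these recommendations permanent, the compression option goes into /etc/fstab; a sketch with a placeholder UUID and mount point:

```
# btrfs build partition compressed with zstd level 1
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /home/user/build  btrfs  compress=zstd:1  0 0
```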
Although the many-small-files case performs badly in general, the SSDs still do far better than the HDD. If I had to work on an HDD, I would be very careful about where to put small files, and definitely not put them on non-compressed btrfs, even though btrfs claims to inline small files^1^.
zstd:1 still carries a significant I/O penalty on a modern PC system: compression is much slower than decompression. The tests on the relatively slow SATA SSD show no obvious penalty because the bottleneck is still the disk I/O, but on the NVMe SSD the penalty becomes much more pronounced.
On the other hand, if the drive is much slower, such as an HDD with less than 100 MB/s, the benefit of writing and reading less data is far larger than on SSDs. The degree of benefit ultimately depends on the information density of the data: if zstd/lzo can reach a reasonable compression ratio (<50%), it may be worth turning compression on for slow storage devices.
LZO gives mild compression, around 50% for most of my build dirs, and requires far less CPU than ZSTD:1. Thus on fast internal SSDs, LZO is preferred.