sqlite – why not sqlite3 supports asynchronous IO

With PRAGMA synchronous=FULL, I can make sure the data will be persisted to disk at every operation. This is expected to slow down the execution. I suppose sqlite implements this behavior using wirte(2) + fsync(2).


However, the latency of fsync(2) depends on the underlying disk. For example, in some of my NVMe SSD (i.e., Samsung-980pro, WD-SN850), this latency is long:
experiment result with fio
This result is obtained by

fio --filename={/dev/nvme0n1, /dev/nvme1n1, /dev/sda} --runtime=15 --size=400G --direct=1 --buffered=0 --rw=randwrite --bs=4k --numjobs=16 --iodepth=1 --group_reporting --name=write --ioengine=sync --fsync={1, 0} --sync={0, 1}

As shown, these two NVMe SSD has longer latency than SATA SSD (Samsung-860evo). One possible reason for this may be NVMe SSD accelerate IO whose io-depth is large (e.g., 32) but when using the synchronous io, the io-depth will degenerate to exactly 1.


Given this observation, I find when I use PRAGMA synchronous=FULL, some workload (fillrandom in db_bench_sqlite3) runs slower in NVMe SSD (WD-SN850) than in SATA SSD (Intel-S4510)
experiment result with db_bench_sqlite3
Focus on the light-blue bar VS. dark-green bar when synchronous=FULL/EXTRA.
This behavior is more severe in Samsung-980pro (not drawn in the figure)

Possible Solution

when I change the fio parameter from synchronous io to libaio, something changes. Results. As shown, when using O_SYNC and write_aio(), NVMe SSDs become faster than SATA SSD as expected (not understand why write_aio() + fsync() is still problematic). So one possible solution for this problem is to make Sqlite3 support asynchronous IO like libaio (io_setup(2), io_destroy(2), io_submit(2), io_cancel(2), io_getevents(2)) or something new like io_uring.


The asynchronous IO is more friendly to the new NVMe storage devices. But why not SQLite3 does not support it.

There must be reasons why SQLite developers do not reply, I just want to figure it out.