Bad cqe data from IO uring errors persist after disabling direct IO. #1478

Open
opened 2026-03-01 13:17:17 +00:00 by WyvernDotRed · 5 comments

Short

After setting rocksdb_optimize_for_spinning_disks = true and/or rocksdb_direct_io = false, on OpenSUSE Leap 15.6 I continue seeing
PosixRandomAccessFile::MultiRead: Bad cqe data from IO uring - 0xd5d5d5d5d5d5d5d5
My impression is that disabling direct IO should stop these IO issues, but it does not.

Details

As discussed at https://matrix.to/#/!da26JtAjE6APGLnX8ncWsvc-skF2KQZ9Nw_MbNpYD2k/$ntX9mEofZPCaFHi2cPG3R4e3xvRSO3QzNT_bWZ0LnAc and https://matrix.to/#/!da26JtAjE6APGLnX8ncWsvc-skF2KQZ9Nw_MbNpYD2k/$UZNWRxGfB4Ml0_6Uqm-km2x6FXQEysR4q3LtmM28dDs
Most likely due to BTRFS's CoW and FS compression, I am getting
PosixRandomAccessFile::MultiRead: Bad cqe data from IO uring - 0xd5d5d5d5d5d5d5d5
errors from RocksDB.
This seems to additionally cause
WARN conduwuit_api::client::sync::v3::joined: timeline for newly joined room is empty
errors, resembling #779

In an attempt to fix this, I set the rocksdb_optimize_for_spinning_disks = true and rocksdb_direct_io = false config options.
At this point I have tried three permutations of these options and have confirmed with !admin server show-config that the settings actually apply.
Additionally, I have used chattr +C and, when transferring existing data, cp --reflink=never to disable CoW on the database files.
Now I am also testing btrfs property set <file> compression none in a VM.
In all cases, the Bad cqe data errors persist, with the exact same address logged.
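
For reference, this is roughly how the two options from above would sit in the config file (a sketch assuming the usual TOML layout with a [global] table; key names as quoted in this issue, defaults per the description here):

```toml
[global]
# Both of these were toggled while debugging.
# The defaults are: direct IO enabled, spinning-disk optimisation disabled.
rocksdb_direct_io = false
rocksdb_optimize_for_spinning_disks = true
```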

Since discussing in the Continuwuity room I have been able to reproduce the issue in my testing VM.
Federation makes it spam these errors, but they still pop up with a single testing account.
Starting with a fresh database after these changes also has no effect, other than clearing the accumulated brokenness.

Still To-Do

I am in the process of migrating my Ansible playbooks to Leap 16.0 and will soon migrate my server over to that release.
As Leap 15.6 uses the Linux 6.4 kernel, an IO uring bug in this older release could plausibly cause these issues.
Leap 16.0 currently ships Linux 6.12; if that is still too old, I can investigate running Tumbleweed, which ships 6.19 at the moment.
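
A quick way to check which kernel a host is actually running before and after such an upgrade (the 6.12 threshold below is just this report's data point, not an official minimum):

```shell
#!/bin/sh
# Print the running kernel version and compare it against 6.12,
# the version Leap 16.0 ships with according to this report.
kernel="$(uname -r)"
major="${kernel%%.*}"
rest="${kernel#*.}"
minor="${rest%%.*}"
minor="${minor%%[!0-9]*}"   # strip suffixes like "-default" or "-generic"
if [ "$major" -gt 6 ] || { [ "$major" -eq 6 ] && [ "$minor" -ge 12 ]; }; then
    echo "kernel $kernel: 6.12 or newer"
else
    echo "kernel $kernel: older than 6.12, io_uring bugs are plausible"
fi
```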

On the filesystem side, I am considering testing an ISO file containing an Ext4 partition mounted for the database next.
For my actual server, I can install a PCIe to M.2 adaptor and dedicate a spare M.2 SSD to the database.
If all else fails I can deploy Synapse as well, as I do not expect PostgreSQL to break the way RocksDB did here.

In Closing

I created this issue just to have something in the system for further investigation.
For now, self-hosting a Matrix server can wait, and I will not demand fixes from a community project either.
The issue might be with RocksDB or with the older kernel and dependencies on Leap 15.6; I will update once I have my webserver config ported to 16.0.
This next week I will be busy with uni assignments, though, so I do not know when I will get around to this.

Thanks to the team for making this amazing software.


My free oracle arm instance running ubuntu 22.04 has the same issue:

Mar 12 15:40:55 saneke.eu matrix-conduit[571649]: PosixRandomAccessFile::MultiRead: Bad cqe data from IO uring - 0xd5d5d5d5d5d5d5d5

and it's apparently running on ext4:

# cat /etc/fstab
LABEL=cloudimg-rootfs   /        ext4   defaults        0 1
LABEL=UEFI      /boot/efi       vfat    umask=0077      0 1

My previous installation was broken; I am not sure if it was the whole server or just my user account. In any case, I deleted the database, spun up a new server under the same domain, and created the same user. I logged in with FluffyChat and joined !da26JtAjE6APGLnX8ncWsvc-skF2KQZ9Nw_MbNpYD2k, then logged in with Element Web, verified it, and saw the issue pop up again. The error message is the very same as on the previous installation, and apparently the same as the one in this issue.

If you need me to test something you can ping @nacorid:vengeful.eu in any of the rooms you see me in, or mention me here, obviously.

Author

I have managed to make Continuwuity work on my homeserver, and, as far as I can tell, in my pre-deployment testing VM as well.
My homeserver has been running for over a day while federating with a few servers, including continuwuity.org, without any uring database errors.
As I changed all the variables at once, I will list what I changed:

  • Reverted to the default settings: rocksdb_direct_io enabled and rocksdb_optimize_for_spinning_disks disabled.
  • Upgraded Continuwuity from v0.5.5 to v0.5.6.
  • Upgraded the OS from OpenSUSE Leap 15.6 to Leap 16.0.
    This is a kernel upgrade from Linux 6.4 to 6.12.
  • Added a dedicated NVMe SSD to my server, represented by a separate VirtIO disk in the testing VM.
    It is formatted as Ext4 (on the server, directly on the disk without a partition table) and mounted with the default 0 0 fstab options.
    The main filesystem that produced the errors is a BTRFS RAID1 pool across SATA SSDs without an MBR or partition table.
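
The dedicated-disk mount described above would look roughly like this in /etc/fstab (a sketch; the device name and mount point are illustrative assumptions, and the actual setup puts the filesystem on the whole disk without a partition table):

```
# whole-disk Ext4 filesystem dedicated to the database, default options
/dev/nvme0n1  /var/lib/conduwuit  ext4  defaults  0 0
```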

One or a combination of these factors made it work on my ancient x86-64-v2, BIOS-only hardware.
I suspect the OS update and/or the move off the BTRFS RAID1 pool was the fix, but the other factors could matter too.

Owner

I think it's highly likely that the OS upgrade fixed it - especially as the Ubuntu 22.04 instance on ext4 has the same issue. @nacorid perhaps you can try upgrading your Ubuntu version, or generally moving to a newer kernel?


I was running Linux 6.8, after upgrading to Ubuntu 24.04 I'm now running 6.17 and the error is gone.

Owner

OK, this can probably be added to the troubleshooting guide then.
