Bad cqe data from IO uring errors persist after disabling direct IO. #1478
Reference: continuwuation/continuwuity#1478
Short
After setting `rocksdb_optimize_for_spinning_disks = true` and/or `rocksdb_direct_io = false`, on openSUSE Leap 15.6 I continue seeing `PosixRandomAccessFile::MultiRead: Bad cqe data from IO uring - 0xd5d5d5d5d5d5d5d5`. My impression is that disabling direct IO should stop these IO issues, but it does not.
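For reference, the relevant part of my config looks roughly like this (a sketch; I am assuming the options sit under the `[global]` table as in the example config shipped with continuwuity, and the comments are my own reading of the options):

```toml
[global]
# Avoid O_DIRECT entirely, which should take io_uring direct reads
# out of the picture — but the Bad cqe errors persist regardless.
rocksdb_direct_io = false

# Tune RocksDB for non-SSD access patterns.
rocksdb_optimize_for_spinning_disks = true
```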
Details
As discussed at https://matrix.to/#/!da26JtAjE6APGLnX8ncWsvc-skF2KQZ9Nw_MbNpYD2k/$ntX9mEofZPCaFHi2cPG3R4e3xvRSO3QzNT_bWZ0LnAc and https://matrix.to/#/!da26JtAjE6APGLnX8ncWsvc-skF2KQZ9Nw_MbNpYD2k/$UZNWRxGfB4Ml0_6Uqm-km2x6FXQEysR4q3LtmM28dDs
Most likely due to Btrfs's CoW and filesystem compression, I am getting `PosixRandomAccessFile::MultiRead: Bad cqe data from IO uring - 0xd5d5d5d5d5d5d5d5` errors from RocksDB.
This additionally seems to cause `WARN conduwuit_api::client::sync::v3::joined: timeline for newly joined room is empty` errors, resembling #779.
In an attempt to fix this, I set the `rocksdb_optimize_for_spinning_disks = true` and `rocksdb_direct_io = false` config options. At this point I have tried the three permutations of these and have confirmed with `!admin server show-config` that the settings apply. Additionally, I have used `chattr +C` (and `cp --reflink=never` when transferring existing data) to disable CoW on the database files. Now I am also testing `btrfs property set <file> compression none` in a VM. In all cases, the Bad cqe data errors persist, with the exact same address logged.
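For anyone retracing these steps, the Btrfs-side mitigations look roughly like this (a sketch; the paths are examples, and note that `chattr +C` only takes effect for files created after the flag is set on the directory, not retroactively):

```shell
# Disable CoW on the (still empty) database directory; files created
# inside it afterwards inherit the No_COW attribute.
chattr +C /var/lib/continuwuity/db

# When moving an existing database in, force full copies so the new
# files actually honor the No_COW attribute.
cp -a --reflink=never /old/db/. /var/lib/continuwuity/db/

# Disable Btrfs compression per file (what I am now testing in the VM).
btrfs property set /var/lib/continuwuity/db/somefile.sst compression none

# Verify the attribute stuck ('C' should appear in the flag column).
lsattr /var/lib/continuwuity/db
```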
Since discussing in the Continuwuity room I have been able to reproduce the issue in my testing VM.
Federation makes it spam these errors, but they still pop up with a single testing account.
Starting with a fresh database after these changes also has no effect, other than clearing the accumulated brokenness.
Still To-Do
I am in the process of migrating my Ansible playbooks to Leap 16.0 and will soon migrate my server over to that release.
As Leap 15.6 uses the Linux 6.4 kernel, an IO uring bug in this older release can plausibly cause these issues.
Leap 16.0 currently ships with Linux 6.12; if that is still too old, I can investigate running Tumbleweed, which ships 6.19 at the moment.
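To sanity-check which of my hosts fall below a given kernel threshold, I use a small helper like this (hypothetical; the 6.12 threshold simply mirrors the Leap 16.0 kernel mentioned above, not a confirmed fixed version):

```shell
# kernel_at_least MAJOR MINOR [RELEASE]
# Succeeds (exit 0) if the kernel release string (default: uname -r)
# is at least MAJOR.MINOR.
kernel_at_least() {
    rel="${3:-$(uname -r)}"
    major="${rel%%.*}"            # text before the first dot
    rest="${rel#*.}"
    minor="${rest%%[.-]*}"        # text up to the next dot or dash
    [ "$major" -gt "$1" ] || { [ "$major" -eq "$1" ] && [ "$minor" -ge "$2" ]; }
}

# Leap 15.6 ships 6.4, which is below 6.12:
kernel_at_least 6 12 "6.4.0-150600.23.25-default" && echo "new enough" || echo "potentially affected"
```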
On the filesystem side, I am considering testing a disk image file containing an Ext4 filesystem, mounted for the database, next.
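That experiment can be sketched like this (hypothetical paths and size; mounting requires root, and everything stored inside the loop-mounted Ext4 filesystem bypasses Btrfs CoW and compression entirely):

```shell
# Create an empty image file on the Btrfs volume and disable CoW on it
# while it still has no data extents, so Btrfs does not fragment it.
touch /var/lib/continuwuity.img
chattr +C /var/lib/continuwuity.img
truncate -s 20G /var/lib/continuwuity.img   # sparse 20 GiB image

# Format it as Ext4 and loop-mount it where the database lives.
mkfs.ext4 /var/lib/continuwuity.img
mount -o loop /var/lib/continuwuity.img /var/lib/continuwuity
```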
For my actual server, I can install a PCIe to M.2 adaptor and dedicate a spare M.2 SSD to the database.
If all else fails I can deploy Synapse instead, as I do not expect PostgreSQL to explode the way RocksDB did here.
In Closing
I created this issue just to have something in the system for further investigation.
For now, self-hosting a Matrix server can wait for me, and I will not demand fixes from a community project either.
The issue might be with RocksDB, or with the older kernel and dependencies on Leap 15.6; I will update once I have my webserver config ported to 16.0.
But this coming week I will be busy with uni assignments, so I do not know when I will get around to this.
Thanks to the team for making this amazing software.