FR: More helpful INFO logging during database open stage to detect corrupted db deadlock #1610

Closed
opened 2026-04-04 19:06:14 +00:00 by jinnatar · 4 comments

The problem

If the RocksDB database is corrupted, the database open can deadlock and never complete. This is extremely confusing for the admin, because the last log message they see is:

2026-04-04T18:58:42.815404Z INFO conduwuit::server: 0.5.6 (2c72338) server_name=miselli.fi database_path="/var/lib/continuwuity" log_levels=info

The next line should be:

2026-04-04T18:59:35.365336Z INFO main:start:open: conduwuit_database::engine::open: Opened database. columns=95 sequence=13075351 time=52.34851955s

Without that line, you're left in the dark about what's going on.

Impact

Extreme admin confusion. The service is "up", but TCP connections to all endpoints fail and no errors are emitted in the log. Also, your Matrix homeserver is now down, so you can't go ask the chat what the hell is going on. :-)

Proposed solution

Be slightly more verbose in startup logging by emitting a new INFO line before the database open starts, something like:

main:start:open: conduwuit_database::engine::open: Starting database open (be patient, but if this hangs for too long your rocksdb may be corrupted, see https://continuwuity.org/troubleshooting#database-corruption)

Obviously it would be nicer if db corruption didn't just deadlock startup in the first place, but that's a hard problem to fix, and verbosity would at least make it easier to figure out how screwed you are.
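
To make the idea concrete, here is a rough sketch of what I mean. It is not continuwuity's actual code: `open_with_progress`, `LogSink`, and the warning threshold are hypothetical names made up for illustration, and the log sink is a plain `Vec<String>` so the example needs only the standard library (the real thing would presumably use the existing logging macros). It announces the open, has a watchdog thread warn if the open is still running after a threshold, and reports the elapsed time on success.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical log sink; stands in for real logging macros so the sketch needs only std.
pub type LogSink = Arc<Mutex<Vec<String>>>;

fn log(sink: &LogSink, line: String) {
    sink.lock().unwrap().push(line);
}

/// Hypothetical wrapper around a blocking database open:
/// logs before the open, warns if it runs past `warn_after`, logs when it completes.
pub fn open_with_progress<T, F>(sink: &LogSink, warn_after: Duration, open: F) -> T
where
    F: FnOnce() -> T,
{
    log(
        sink,
        "INFO Starting database open (be patient; if this hangs, the database may be corrupted)"
            .to_string(),
    );

    // The watchdog wakes on whichever comes first: the timeout or the "done" signal.
    let (done_tx, done_rx) = mpsc::channel::<()>();
    let watchdog_sink = Arc::clone(sink);
    let watchdog = thread::spawn(move || {
        if let Err(mpsc::RecvTimeoutError::Timeout) = done_rx.recv_timeout(warn_after) {
            log(
                &watchdog_sink,
                format!("WARN Database open still running after {:?}", warn_after),
            );
        }
    });

    let started = Instant::now();
    let result = open(); // may block for a long time, or forever on a deadlock
    let _ = done_tx.send(()); // ignored if the watchdog already timed out and exited
    let _ = watchdog.join();

    log(sink, format!("INFO Opened database. time={:?}", started.elapsed()));
    result
}
```

Even if the open never returns, the first line and the watchdog warning still reach the log, which is the whole point: the admin sees that startup is stuck in the database open rather than seeing nothing at all.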

nex self-assigned this 2026-04-05 18:05:28 +00:00
Owner

I think I wrote some additional logs for the startup process that may help identify when stuff gets stuck. I'll see if I can find them again soon.

Owner

Actually, does #1494 not suit your needs?

Author

Seems plausible it might help indeed. I guess I'll see once that hits a release and rocksdb dies the next time. :-)

Owner

I'll close this as resolved then. Feel free to re-open after the next release if you think the issue still applies!

nex closed this issue 2026-04-05 20:02:11 +00:00

Reference
continuwuation/continuwuity#1610