fix: Don't drop transactions with more than one PDU in a room #1711

Merged
nex merged 5 commits from nex/fix/federation-txn-pdu-sorting into main 2026-04-27 22:15:52 +00:00
Owner

#1428 introduced logic that sorts incoming events before processing them when a remote server sends multiple PDUs for the same room in a single transaction. This was intended to ensure events are processed in the correct order, reducing the chance of incorrect soft-fails and similar race-related behaviours. However, it misinterpreted how the topological sorting function handles partial graphs, and as a result dropped all PDUs instead of sorting them. This pull request fixes that by changing how graph edges are interpreted, and also allows the origin server timestamp to act as a tiebreaker.
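As a rough illustration of the idea (not the code in this pull request), the sketch below sorts a batch with Kahn's algorithm, counts only prev events that are themselves part of the batch as graph edges, and breaks ties by origin server timestamp. The `Pdu` struct and `sort_batch` function are hypothetical, simplified stand-ins; the real implementation uses the project's own event types and sorting function.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap, HashSet};

/// Hypothetical, simplified stand-in for a PDU: only the fields needed for ordering.
#[derive(Clone, Debug)]
pub struct Pdu {
    pub event_id: String,
    pub prev_events: Vec<String>,
    pub origin_server_ts: u64,
}

/// Orders the PDUs of one room so that every event comes after any of its
/// prev events that are also in the batch. Events that are ready at the same
/// time are emitted by ascending timestamp, then by event ID.
pub fn sort_batch(pdus: &[Pdu]) -> Vec<String> {
    let in_batch: HashSet<&str> = pdus.iter().map(|p| p.event_id.as_str()).collect();

    let mut pending: HashMap<&str, usize> = HashMap::new(); // unprocessed in-batch parents
    let mut children: HashMap<&str, Vec<&str>> = HashMap::new();
    let mut ts: HashMap<&str, u64> = HashMap::new();

    for pdu in pdus {
        let id = pdu.event_id.as_str();
        ts.insert(id, pdu.origin_server_ts);
        // Assumption of this sketch: prev events outside the batch are not
        // edges. Treating them as edges would leave the node waiting forever.
        let local: HashSet<&str> = pdu
            .prev_events
            .iter()
            .map(String::as_str)
            .filter(|p| in_batch.contains(p))
            .collect();
        pending.insert(id, local.len());
        for parent in local {
            children.entry(parent).or_default().push(id);
        }
    }

    // Min-heap keyed on (timestamp, event ID) for a deterministic tiebreak.
    let mut ready: BinaryHeap<Reverse<(u64, &str)>> = pending
        .iter()
        .filter(|(_, n)| **n == 0)
        .map(|(id, _)| Reverse((ts[id], *id)))
        .collect();

    let mut sorted = Vec::with_capacity(pdus.len());
    while let Some(Reverse((_, id))) = ready.pop() {
        sorted.push(id.to_owned());
        for child in children.get(id).into_iter().flatten() {
            let n = pending.get_mut(child).expect("child is in the batch");
            *n -= 1;
            if *n == 0 {
                ready.push(Reverse((ts[child], *child)));
            }
        }
    }
    sorted
}
```

For a batch where `A` and `C` both reference an event outside the batch and `B` references `A`, every event is returned: `C` and `A` are ordered by timestamp, and `B` follows `A`.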

Pull request checklist:

  • This pull request targets the main branch, and the branch is named something other than
    main.
  • I have written an appropriate pull request title and my description is clear.
  • I understand I am responsible for the contents of this pull request.
  • I have followed the contributing guidelines:
      • My contribution follows the code style, if applicable.
      • I ran pre-commit checks before opening/drafting this pull request.
      • I have tested my contribution (or proof-read it for documentation-only changes) myself, if applicable. This includes ensuring code compiles.
      • My commit messages follow the commit message format and are descriptive.
fix: Don't consider out-of-scope nodes as prev events before sorting incoming events
Some checks failed
Auto Labeler / Apply labels based on changed files (pull_request_target) Successful in 31s
Checks / Prek / Check changed files (pull_request) Successful in 31s
Documentation / Build and Deploy Documentation (pull_request) Successful in 1m17s
Checks / Prek / Pre-commit & Formatting (pull_request) Failing after 1m18s
Checks / Changelog / Check changelog is added (pull_request_target) Failing after 31s
Checks / Prek / Clippy and Cargo Tests (pull_request) Has been cancelled
0482bea74d
feat: Assert that no events were dropped during sorting
Some checks failed
Checks / Prek / Check changed files (pull_request) Successful in 7s
Checks / Changelog / Check changelog is added (pull_request_target) Failing after 34s
Checks / Prek / Pre-commit & Formatting (pull_request) Failing after 1m20s
Documentation / Build and Deploy Documentation (pull_request) Successful in 1m33s
Checks / Prek / Clippy and Cargo Tests (pull_request) Failing after 5m6s
3de89df273
style: Simplify build_local_dag return
Some checks failed
Checks / Changelog / Check changelog is added (pull_request_target) Successful in 8s
Checks / Prek / Check changed files (pull_request) Successful in 35s
Documentation / Build and Deploy Documentation (pull_request) Successful in 1m15s
Checks / Prek / Pre-commit & Formatting (pull_request) Failing after 2m38s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 8m30s
197b489416
nex requested review from Owners 2026-04-27 22:02:32 +00:00
Jade approved these changes 2026-04-27 22:12:54 +00:00
@@ -305,2 +326,4 @@
Ok((int!(0), MilliSecondsSinceUnixEpoch(ts)))
})
.await
.inspect(|sorted| assert_eq!(
Owner

debug_assert probably but otherwise lgtm

Author
Owner

I was gonna argue that we want the failsafe in release too but realistically it shouldn't happen anymore so a debug assert is probably wisest

nex marked this conversation as resolved
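To illustrate the trade-off discussed in this thread, here is a minimal sketch of a length check attached with `Result::inspect` (stable since Rust 1.76). `checked_sort` and its inner `sort` closure are hypothetical stand-ins, not the project's functions. `debug_assert_eq!` is only evaluated when debug assertions are enabled, which by default excludes release builds, whereas `assert_eq!` would panic in release builds too.

```rust
/// Hypothetical helper: sorts the input and checks that nothing was dropped.
pub fn checked_sort(input: &[u64]) -> Result<Vec<u64>, String> {
    let expected = input.len();

    // Stand-in for a fallible sorting step.
    let sort = |items: &[u64]| -> Result<Vec<u64>, String> {
        let mut v = items.to_vec();
        v.sort_unstable();
        Ok(v)
    };

    sort(input).inspect(|sorted| {
        // Skipped when debug assertions are disabled (the default for release builds).
        debug_assert_eq!(sorted.len(), expected, "sorting must not drop events");
    })
}
```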
style: Use debug assert instead of a normal assert
Some checks failed
Checks / Changelog / Check changelog is added (pull_request_target) Successful in 8s
Checks / Prek / Pre-commit & Formatting (pull_request) Has been cancelled
Checks / Prek / Check changed files (pull_request) Has been cancelled
Documentation / Build and Deploy Documentation (pull_request) Has been cancelled
Checks / Prek / Clippy and Cargo Tests (pull_request) Has been cancelled
6abec29364
style: Clippy conflicts with cargo fmt, apparently
All checks were successful
Checks / Changelog / Check changelog is added (pull_request_target) Successful in 8s
Checks / Prek / Check changed files (pull_request) Successful in 6s
Checks / Prek / Pre-commit & Formatting (pull_request) Successful in 1m25s
Documentation / Build and Deploy Documentation (pull_request) Successful in 1m33s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 8m34s
a0132a236e
nex merged commit f3fb218652 into main 2026-04-27 22:15:52 +00:00
nex deleted branch nex/fix/federation-txn-pdu-sorting 2026-04-27 22:15:52 +00:00
Reference
continuwuation/continuwuity!1711