fix: Don't drop transactions with more than one PDU in a room #1711
Reference
continuwuation/continuwuity!1711
#1428 introduced logic that sorts incoming events before processing them when a remote server sends multiple PDUs for the same room in a single transaction. The intent was to process events in the correct order, reducing the chance of incorrect soft-fails and similar race-related behaviours. However, that change misinterpreted how the topological sorting function handles partial graphs, so it dropped all of the PDUs instead of sorting them. This pull request fixes that by changing how graph edges are interpreted, and also allows the origin server timestamp to act as a tiebreaker.
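To illustrate the idea (not the actual patch), here is a minimal, self-contained sketch of sorting a partial event graph. It assumes one plausible reading of the fix: edges to `prev_events` that are not part of the transaction are ignored rather than treated as unsatisfiable dependencies, and ties between ready events are broken by origin server timestamp, then event ID. The `Pdu` struct and `sort_batch` function are hypothetical names made up for this example and do not correspond to the project's real types or its topological sorting function.

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashMap, HashSet};

/// Hypothetical, simplified stand-in for a PDU (not the project's real type).
#[derive(Debug, Clone)]
pub struct Pdu {
    pub event_id: String,
    pub prev_events: Vec<String>,
    pub origin_server_ts: u64,
}

/// Returns the event IDs of `pdus` ordered so that every event comes after
/// any of its `prev_events` that are also in the batch (Kahn's algorithm).
/// Events that are ready at the same time are ordered by timestamp, then ID.
pub fn sort_batch(pdus: &[Pdu]) -> Vec<String> {
    let in_batch: HashSet<&str> = pdus.iter().map(|p| p.event_id.as_str()).collect();
    let ts: HashMap<&str, u64> = pdus
        .iter()
        .map(|p| (p.event_id.as_str(), p.origin_server_ts))
        .collect();

    let mut indegree: HashMap<&str, usize> = HashMap::new();
    let mut children: HashMap<&str, Vec<&str>> = HashMap::new();
    for pdu in pdus {
        let id = pdu.event_id.as_str();
        if indegree.contains_key(id) {
            continue; // skip duplicate event IDs within the batch
        }
        // Assumption: only edges that point at events inside this batch count.
        // Parents we were not sent are ignored instead of blocking the event.
        let parents: HashSet<&str> = pdu
            .prev_events
            .iter()
            .map(String::as_str)
            .filter(|p| in_batch.contains(p) && *p != id)
            .collect();
        indegree.insert(id, parents.len());
        for parent in parents {
            children.entry(parent).or_default().push(id);
        }
    }

    // Min-heap of events with no unprocessed in-batch parents.
    let mut ready: BinaryHeap<Reverse<(u64, &str)>> = indegree
        .iter()
        .filter(|(_, d)| **d == 0)
        .map(|(id, _)| Reverse((ts[id], *id)))
        .collect();

    let mut sorted = Vec::with_capacity(pdus.len());
    while let Some(Reverse((_, id))) = ready.pop() {
        sorted.push(id.to_owned());
        for child in children.get(id).into_iter().flatten() {
            let d = indegree.get_mut(child).expect("child is in the batch");
            *d -= 1;
            if *d == 0 {
                ready.push(Reverse((ts[child], *child)));
            }
        }
    }
    // Events that are part of a cycle never become ready and are omitted.
    sorted
}
```

In this sketch, a batch whose events all reference parents outside the transaction still comes back in full, ordered by timestamp, which is the behaviour the description says the earlier change failed to provide.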
Pull request checklist:
[…] main branch, and the branch is named something other than main.
[…] myself, if applicable. This includes ensuring code compiles.
@@ -305,2 +326,4 @@
Ok((int!(0), MilliSecondsSinceUnixEpoch(ts)))
})
.await
.inspect(|sorted| assert_eq!(
debug_assert probably but otherwise lgtm
I was gonna argue that we want the failsafe in release too but realistically it shouldn't happen anymore so a debug assert is probably wisest