Question about propagating URLs with errors inside the federation service #1769
I was troubleshooting a federation issue between my continuwuity server and another server. Everything looked fine from the federation tester, but the root cause turned out to be only detectable from my own system. (The issue was that my firewall config was too restrictive and didn't allow outbound connections on port 8448, which surprisingly didn't cause me issues for over a month, but that's neither here nor there...)
The piece of information that would have helped me debug this 100x faster is knowing the fully resolved destination URL.
`!admin debug ping` is a great tool, but it doesn't show you the URL when things go wrong. Seeing this would have made me immediately realize that my `curl` test from the box was not hitting the same URL; it worked fine on standard HTTPS port 443 ;)

I traced the chain of function calls that would execute, down to the point of the error as follows:

1. `conduwuit_admin::debug::commands::ping` function
2. `Service::send_unauthenticated_request` in the `conduwuit_service::sending` module
3. `conduwuit_service::federation::execute::execute_unauthenticated`
4. `conduwuit_service::federation::execute::execute_on` (actual destination resolved here, via the resolver `get_actual_dest` call)
5. `conduwuit_service::federation::execute::perform`
6. `conduwuit_service::federation::execute::handle_error`

The error that I saw from the debug ping was as follows:
The message is not very helpful because `handle_error` explicitly scrubs the URL out:

I'm not quite sure what the solution is here, which is why I'm opening an issue. The docstring for `without_url` says that it's for security to prevent leaking secrets. This seems plausible on its face, but raises a few follow-up questions/observations:

[…] `debug_error!`)?

I'd like to hear thoughts from the maintainers before working up a PR on whether this is an acceptable change or if we should go another path (and what that path might be).
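To make the discussion concrete, here is a minimal sketch of one possible direction: emit the full error, URL included, to a debug-level sink before scrubbing it, so callers still only see the sanitized version. Everything in it is an assumption for illustration. `RequestError`, its `Display` format, the `debug_log` callback, and the example URL are hypothetical stand-ins, not the real continuwuity or HTTP client types.

```rust
use std::fmt;

// Hypothetical stand-in for an HTTP client error that may carry a URL.
// This is NOT the real error type used by continuwuity.
#[derive(Debug, Clone)]
struct RequestError {
    message: String,
    url: Option<String>,
}

impl RequestError {
    // Consuming scrubber, mirroring the `without_url` style discussed above.
    fn without_url(mut self) -> Self {
        self.url = None;
        self
    }
}

impl fmt::Display for RequestError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.url {
            Some(url) => write!(f, "{} for url ({})", self.message, url),
            None => write!(f, "{}", self.message),
        }
    }
}

// Hypothetical handler: record the unscrubbed error at debug level first,
// then hand the scrubbed error back to the caller.
fn handle_error(err: RequestError, debug_log: &mut dyn FnMut(String)) -> RequestError {
    debug_log(format!("federation request failed: {err}"));
    err.without_url()
}

fn main() {
    let err = RequestError {
        message: "error sending request".to_string(),
        // Example URL only; the host is made up.
        url: Some("https://matrix.example.org:8448/_matrix/federation/v1/version".to_string()),
    };

    let mut debug_lines: Vec<String> = Vec::new();
    let scrubbed = handle_error(err, &mut |line: String| debug_lines.push(line));

    println!("debug log: {}", debug_lines[0]);
    println!("returned:  {scrubbed}");
}
```

Whether a debug-level log is an acceptable place for the URL is exactly the open question here; the sketch only shows that the two concerns (operator visibility and scrubbed user-facing errors) do not have to conflict.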
I would also suggest that the `mut`-consuming `with_X` and `without_X` methods are not particularly idiomatic Rust. This style reduces the utility of the methods compared to taking `&self` and returning a copy with the field update. The Rust compiler is very good at ensuring this doesn't copy the whole struct every time ;) Cleaning that up would let us have nice, sanitized logs without mutating the underlying error (or doing an explicit clone every time).

This can be fetched with `!admin debug resolve-true-destination`, as well as from external federation testers.

I did stumble upon `!admin debug resolve-true-destination` eventually, and that was one of the puzzle pieces that eventually led to me discovering the issue. I would have still identified the issue much faster if the debug ping output had more detail.

I don't understand the pointer to external federation testers. Both of our servers show green on all federation testers that I'm aware of. This particular network pathology (firewall in front of continuwuity blocking outbound on 8448) is not detectable except from inside the network. All external servers can still send inbound requests, and continuwuity could send outbound on 443, but not 8448.
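For reference, the consuming versus borrowing method styles mentioned earlier in this comment thread can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `FederationError` and the `sanitized` method name are hypothetical, and note that the borrowing variant here still clones the `message` string.

```rust
// Hypothetical error type for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct FederationError {
    message: String,
    url: Option<String>,
}

impl FederationError {
    // Consuming style: takes ownership, so the original (with its URL)
    // is no longer available to the caller afterwards.
    fn without_url(mut self) -> Self {
        self.url = None;
        self
    }

    // Borrowing style (hypothetical name): leaves the original untouched
    // and returns a scrubbed copy.
    fn sanitized(&self) -> Self {
        Self {
            message: self.message.clone(),
            url: None,
        }
    }
}

fn main() {
    let err = FederationError {
        message: "connection timed out".to_string(),
        url: Some("https://matrix.example.org:8448/".to_string()),
    };

    // Borrowing: both the full error and the scrubbed copy are usable.
    let for_users = err.sanitized();
    println!("internal: {err:?}");
    println!("external: {for_users:?}");

    // Consuming: `err` is moved here and cannot be used again.
    let scrubbed = err.without_url();
    println!("consumed: {scrubbed:?}");
}
```

With the borrowing form, a log statement can print the full error while the scrubbed copy is what gets returned, without an explicit `clone()` at every call site.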
I'm saying you can use the external federation testers against the remote servers in order to fetch their true destinations. However, I get your point that the debug ping command could be more detailed.
(Ideally, you shouldn't block any outbound port for Matrix anyway, as there are some servers running on other nonstandard ports.)