perf: Improve presence performance with application-layer cache #1534

Open
gamesguru wants to merge 4 commits from gamesguru/continuwuity:guru/backports/perf/presence-startup-costs-naive-loop-fix into main
Contributor

Tested part of this with a bunch of other things a week or two back.

A very easy optimization that should drastically reduce network activity and disk I/O.

perf: impl basic presence cache to avoid DB churn
chore: document tech debt/TODOs for presence optimizations
chore: add method for presence: `clear_cache()` for service mgr to call with rest on cache clear
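
As a rough illustration of what a cache with a `clear_cache()` hook in front of the database can look like, here is a minimal read-through sketch. It is not the code from this pull request: `FakeDb`, `CachedPresence`, `get_presence()` and `db_reads()` are hypothetical names, presence is modelled as a plain `String`, and a `Mutex<HashMap>` stands in for the Moka cache used in the actual diff.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for the database layer. It counts reads so the
/// effect of the cache can be observed.
struct FakeDb {
    rows: HashMap<String, String>,
    reads: AtomicU64,
}

impl FakeDb {
    fn new(rows: HashMap<String, String>) -> Self {
        Self { rows, reads: AtomicU64::new(0) }
    }

    fn get(&self, user_id: &str) -> Option<String> {
        self.reads.fetch_add(1, Ordering::Relaxed);
        self.rows.get(user_id).cloned()
    }
}

/// Read-through cache in front of `FakeDb` (illustrative only).
struct CachedPresence {
    db: FakeDb,
    cache: Mutex<HashMap<String, String>>,
}

impl CachedPresence {
    fn new(db: FakeDb) -> Self {
        Self { db, cache: Mutex::new(HashMap::new()) }
    }

    /// Serve from memory when possible; fall back to the database on a miss.
    fn get_presence(&self, user_id: &str) -> Option<String> {
        // The guard is a temporary, so the lock is released when this statement ends.
        let cached = self.cache.lock().unwrap().get(user_id).cloned();
        if cached.is_some() {
            return cached;
        }
        let value = self.db.get(user_id)?;
        self.cache.lock().unwrap().insert(user_id.to_owned(), value.clone());
        Some(value)
    }

    /// Drop every cached entry; later reads go back to the database.
    fn clear_cache(&self) {
        self.cache.lock().unwrap().clear();
    }

    /// Number of reads that actually reached the database.
    fn db_reads(&self) -> u64 {
        self.db.reads.load(Ordering::Relaxed)
    }
}
```

In this sketch, repeated `get_presence()` calls for the same user reach the database once, and calling `clear_cache()` forces the next read to hit it again. Misses for unknown users are not cached here, which is a simplification.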


Pull request checklist:

  • [x] This pull request targets the main branch, and the branch is named something other than
    main.
  • [x] I have written an appropriate pull request title and my description is clear.
  • [x] I understand I am responsible for the contents of this pull request.
  • I have followed the contributing guidelines:
      • [x] My contribution follows the code style, if applicable.
      • [x] I ran pre-commit checks before opening/drafting this pull request.
      • [ ] I have tested my contribution (or proof-read it for documentation-only changes)
        myself, if applicable. This includes ensuring code compiles.
      • [x] My commit messages follow the commit message format and are descriptive.
      • [ ] I have written a news fragment for this PR, if applicable.
perf: address presence timer/DB inefficiencies
Some checks failed
Documentation / Build and Deploy Documentation (pull_request) Has been cancelled
Checks / Prek / Pre-commit & Formatting (pull_request) Has been cancelled
Checks / Prek / Clippy and Cargo Tests (pull_request) Has been cancelled
5a87474708
perf: impl basic presence cache to avoid DB churn

chore: document tech debt/TODOs for presence optimizations

chore: add method for presence: `clear_cache()` for service mgr to call with rest on cache clear
batch presence updates per server, not room. bundle events and local users under batch requests per a smart policy.
Some checks failed
Documentation / Build and Deploy Documentation (pull_request) Has been skipped
Checks / Prek / Pre-commit & Formatting (pull_request) Successful in 3m21s
Update flake hashes / update-flake-hashes (pull_request) Successful in 57s
Checks / Prek / Clippy and Cargo Tests (pull_request) Failing after 21m25s
60d37b038d
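
The idea in the commit above, batching presence updates per destination server rather than per room, can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `batch_by_server`, `user_rooms` and `room_servers` are hypothetical names, and `room_servers` is assumed to list remote servers only.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Groups users whose presence changed by the remote server that should hear
/// about them, so each server gets one batch instead of one update per room.
fn batch_by_server(
    changed_users: &[&str],
    user_rooms: &BTreeMap<&str, Vec<&str>>,   // user -> rooms the user is joined to
    room_servers: &BTreeMap<&str, Vec<&str>>, // room -> remote servers in that room
) -> BTreeMap<String, BTreeSet<String>> {
    let mut batches: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for &user in changed_users {
        for room in user_rooms.get(user).into_iter().flatten() {
            for server in room_servers.get(*room).into_iter().flatten() {
                // The set deduplicates users who share several rooms with a server.
                batches
                    .entry((*server).to_owned())
                    .or_default()
                    .insert(user.to_owned());
            }
        }
    }
    batches
}
```

A user who shares several rooms with the same server appears once in that server's batch, which is where the reduction in outgoing traffic would come from.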
simplify my refactor of service manager startup logic [performance]

update TODO regarding presence cache and DB reads [low-priority]
nex requested changes 2026-03-13 21:35:45 +00:00
nex left a comment
Owner

The changes to `src/service/services.rs` seem unnecessary, otherwise I see no issue with this.
nex changed title from perf: address presence timer/DB inefficiencies to perf: Improve presence performance with application-layer cache 2026-03-13 21:39:35 +00:00
Jade left a comment
Owner

I've only reviewed down to `src/service/presence/data.rs`
@ -342,1 +342,4 @@
[workspace.dependencies.moka]
version = "0.12"
default-features = false
Owner
Pending response: https://matrix.to/#/!ksTlboXVgcyWjv5GrlEeKyQuJ8ZCprnwQx2b6-BQ44Q/$-gNHwsWoHouljwZl11tL0BOYy0rKBOg4qB0tMEOG0ls?via=gingershaped.computer&via=matrix.org&via=continuwuity.org
@ -343,0 +343,4 @@
[workspace.dependencies.moka]
version = "0.12"
default-features = false
features = ["sync"]
Owner

Introducing blocking code in an async context is not allowed. Please see the Moka docs for their async implementation.
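
For reference, Moka exposes its async cache behind the `future` Cargo feature rather than `sync`. A dependency entry along these lines would presumably be the starting point if the review comment above were followed; this is a sketch, not a change taken from the PR:

```toml
[workspace.dependencies.moka]
version = "0.12"
default-features = false
features = ["future"]
```

With that feature enabled the cache type is `moka::future::Cache`, and in the 0.12 series its `get` and `insert` methods are `async` and need to be awaited.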

@ -15,2 +16,4 @@
presenceid_presence: Arc<Map>,
userid_presenceid: Arc<Map>,
cache: MokaCache<OwnedUserId, (u64, Presence)>,
locks: MutexMap<OwnedUserId, ()>,
Owner

I am very wary of this - RocksDB should already read from memory on systems with enough spare RAM, and introducing more in-memory locks opens the door to performance or deadlock issues.

I would prefer if there was at least some menchmarking or comparison to show if this additional complexity is worth it.
Contributor

@Jade wrote in #1534 (comment):

> I would prefer if there was at least some menchmarking or comparison to show if this additional complexity is worth it.

s/menchmarking/benchmarking
@ -130,6 +160,10 @@ impl Data {
}
pub(super) async fn remove_presence(&self, user_id: &UserId) {
let _lock = self.locks.lock(user_id).await;
Owner

This lock is held to the end of the scope, but that is implicit. Either use an explicit drop or a closure to make sure the scope of the lock is clear. IIRC this is in the code style guide.
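
The two options mentioned in this comment can be sketched as below. This is an illustration only: `Store`, `with_entries` and both `remove_*` methods are hypothetical, and a synchronous `std::sync::Mutex` stands in for the async per-user lock in the diff. The scoping idea is the same either way.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

struct Store {
    entries: Mutex<HashMap<String, u64>>,
}

impl Store {
    /// Option 1: an explicit `drop` marks exactly where the lock is released.
    fn remove_with_drop(&self, key: &str) -> Option<u64> {
        let mut guard = self.entries.lock().unwrap();
        let removed = guard.remove(key);
        drop(guard); // lock released here, before any follow-up work
        removed
    }

    /// Helper for option 2: the lock lives only for the duration of `f`.
    fn with_entries<R>(&self, f: impl FnOnce(&mut HashMap<String, u64>) -> R) -> R {
        let mut guard = self.entries.lock().unwrap();
        f(&mut guard)
    }

    /// Option 2: a closure makes the critical section visible at the call site.
    fn remove_with_closure(&self, key: &str) -> Option<u64> {
        self.with_entries(|entries| entries.remove(key))
    }
}
```

Both variants behave the same as a bare `let _lock = ...;` at the top of the function; the difference is that a reader can see where the critical section ends without reasoning about drop order.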
@ -186,1 +222,4 @@
}
#[cfg(test)]
mod tests {
Owner

What are you even testing? Most of these are intrinsic properties of lower levels of the system, and shouldn't be here.

Author
Contributor

I'm actually going to review all the comments before even testing.

I'm not as optimistic as nexy here. I think this could cause problems if the presence dates have a bug in them... idk, worst case, all of a sudden your whole timeline gets borked. I've seen weird stuff on the MSC3030 branch.
This pull request has changes requested by an official reviewer.
This branch is out-of-date with the base branch

Checkout

From your project repository, check out a new branch and test the changes.
git fetch -u guru/backports/perf/presence-startup-costs-naive-loop-fix:gamesguru-guru/backports/perf/presence-startup-costs-naive-loop-fix
git switch gamesguru-guru/backports/perf/presence-startup-costs-naive-loop-fix
4 participants
Reference
continuwuation/continuwuity!1534