docs: Add performance tuning guide #1498

Open
stratself wants to merge 16 commits from stratself/continuwuity:stratself/docs-perf-tuning into main
Contributor

This PR adds a page for performance tuning. As it is WIP, feedback is highly appreciated. If you have any other perf tuning tips, please share :)

Will update `_meta.json` later after future rebasing. DNS tuning will be another page and in another PR.

Some todos:

- [ ] ~~Add section for maxperf builds~~ can be in a future PR
- [ ] ~~Sysctl tunings should be recommended, and link to good guides on it. Find good guides.~~

Some questions to answer, ideally with real-world examples:

- Should `*_capacity_mb` params be documented? Or is tweaking the modifier good enough
- Any helpful gains from `rocksdb_parallelism_threads` and `rocksdb_direct_io` tunings?
- Does `sender_workers` help with anything?

Pull request checklist:

- [x] This pull request targets the `main` branch, and the branch is named something other than `main`.
- [x] I have written an appropriate pull request title and my description is clear.
- [x] I understand I am responsible for the contents of this pull request.
- I have followed the contributing guidelines:
  - [ ] My contribution follows the code style, if applicable.
  - [x] I ran pre-commit checks before opening/drafting this pull request.
  - [x] I have tested my contribution (or proof-read it for documentation-only changes) myself, if applicable. This includes ensuring code compiles.
  - [x] My commit messages follow the commit message format and are descriptive.
  - [ ] I have written a news fragment for this PR, if applicable.
Contributor

Remind me to review this when you're done writing up the other sections pretty please
Henry-Hiles requested changes 2026-03-06 17:09:06 +00:00
Dismissed
@ -0,0 +48,4 @@
allow_incoming_typing = false
```
Presence is also considered expensive and is disabled by default. For reference, you can also disable them manually as follows:
Contributor

Isn't only outgoing presence off by default?
Author
Contributor

Separated `outgoing_*` from `incoming_*` and `local_*` stuff, highlighting the former as being more important.
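A hedged sketch of what the separated settings discussed here might look like (option names are assumed to match the example config; verify against `conduwuit-example.toml` before use):

```toml
### in continuwuity.toml ###

# Most impactful: stop presence updates from being sent over federation
allow_outgoing_presence = false

# Less important: incoming and local presence handling
allow_incoming_presence = false
allow_local_presence = false
```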
nex requested changes 2026-03-06 17:20:55 +00:00
nex left a comment

Provisional notes, feel free to disregard if unwanted
@ -0,0 +1,80 @@
# Performance tuning
While Continuwuity's default config parameters are optimized for a small instance, they would likely need additional modifications to smoothly run in a larger context. This is especially true for homeservers with many users and/or are joined in many large federated rooms, and will increasingly be the case as the Matrix network expands.
Owner

This isn't necessarily true. You will only need to tune the default parameters if your hardware configuration does not scale with your server size (works in both directions - throwing a supercomputer at a tiny instance will waste resources, while throwing a potato at a moderately sized server will make it choke itself).
Author
Contributor

Changed introduction to frame it as a "get more juice out of your orange" situation.
@ -0,0 +19,4 @@
## Changing database compression algorithm
:::warning
This step should be done **before** starting Continuwuity for the first time
Owner

`should` -> `MUST` - afaik the algo *can't* be changed after init. Also worth mentioning it can't be reversed.
stratself marked this conversation as resolved
@ -0,0 +27,4 @@
```toml
### in continuwuity.toml ###
rocksdb_compression_algo = "lz4"
rocksdb_wal_compression = "none"
Owner

I'm pretty sure this *significantly* increases storage usage.
Owner

> Should `*_capacity_mb` params be documented? Or is tweaking the modifier good enough

Tweaking the modifier is fine for a guide. Ideally we don't get too nitty-gritty on tuning in the docs otherwise people *will* break their stuff and make it our problem.

> Does `sender_workers` help with anything?

Since it already scales to the number of available CPUs, not really, especially since senders aren't particularly CPU heavy (in comparison to state resolution etc).
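To illustrate the distinction in this exchange, a minimal sketch (values are illustrative only; the individual `*_capacity_mb` parameters are deliberately left as a comment since the thread advises against documenting them):

```toml
### in continuwuity.toml ###

# Preferred for a guide: scale all caches with a single knob
cache_capacity_modifier = 2.0

# Nitty-gritty alternative: individual *_capacity_mb parameters would
# pin specific caches instead, which is easy to get wrong.

# sender_workers already scales to available CPUs, so it is usually
# best left at its default.
```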
@ -0,0 +5,4 @@
This page aims to outline various performance tweaks for Continuwuity and their effects. As always, your mileage may vary according to your setup's specifics. If you have further discussions or recommendations, please share them in the community rooms.
## DNS tuning (recommended)
Contributor

A separate DNS tuning guide sounds good, I would still include a paragraph on why that is needed in general in this document. Something like:

"Matrix homeservers conduct MANY DNS queries, sometimes 10s of thousands within a few minutes. Normal tools, such as systemd-resolved, are not designed for this load, and upstream DNS providers will often rate-limit you for making so many queries."
Author
Contributor

DNS tuning guide now in #1601
stratself force-pushed stratself/docs-perf-tuning from 2404ddada0
All checks were successful
Documentation / Build and Deploy Documentation (pull_request) Has been skipped
Update flake hashes / update-flake-hashes (pull_request) Successful in 1m43s
Checks / Prek / Pre-commit & Formatting (pull_request) Successful in 3m2s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 12m31s
to 2728e81503
All checks were successful
Documentation / Build and Deploy Documentation (pull_request) Has been skipped
Checks / Prek / Pre-commit & Formatting (pull_request) Successful in 2m58s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 8m51s
2026-03-10 11:27:19 +00:00
Compare
Author
Contributor

I have added a section on bottommost compression, although I don't really enable it myself and know nobody to compare its benefits against. Please check and see if it works according to your knowledge and experience. If it is too problematic, it can be removed.

I realized the file is `.md`. It will be changed to `.mdx` once everything is stabilized (or when the maintainers want me to).
@ -0,0 +14,4 @@
If you have unused memory to spare, consider increasing the `cache_capacity_modifier` value to a larger number, as to allow more data to be stored in hot memory. This would _significantly_ speed up many intensive operations such as state resolutions, and also results in decreased CPU usage and disk I/O. Start with a baseline of `cache_capacity_modifier = 2.0` and tune up until you find a satisfactory RAM usage.
On the other hand, if your system doesn't have a lot of RAM, consider decreasing the cache capacity modifier to something smaller than `1.0` to avoid low-memory issues (at the cost of higher load on disk/CPU). The recommendation also works if your system has very few RAM compared to the number of cores, as cache capacities tend to scale according to the latter.
Owner

suggest:
very little RAM compared to the number of CPU cores, as cache capacities tend to scale according to number of cores.
stratself marked this conversation as resolved
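The two tuning directions described in the quoted passage could be sketched as follows (values are illustrative starting points, not recommendations):

```toml
### in continuwuity.toml ###

# Plenty of spare RAM: cache more aggressively and tune up from here
cache_capacity_modifier = 2.0

# Memory-constrained systems (or many cores with little RAM):
# shrink the caches below the default instead
#cache_capacity_modifier = 0.5
```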
@ -0,0 +28,4 @@
allow_incoming_typing = false
```
Outgoing presence updates is also considered expensive and is disabled by default(`allow_local_presence = false`).
Owner

what is meant with allow local presence reference here? outgoing presence is now off by default, but local presence is enabled -> https://forgejo.ellis.link/continuwuation/continuwuity/src/branch/main/conduwuit-example.toml#L1137-L1166
Author
Contributor

That was supposed to be `allow_outgoing_presence` mb
stratself marked this conversation as resolved
stratself force-pushed stratself/docs-perf-tuning from 5b67ab66be
Some checks failed
Documentation / Build and Deploy Documentation (pull_request) Has been skipped
Check Changelog / Check for changelog (pull_request_target) Successful in 11s
Checks / Prek / Pre-commit & Formatting (pull_request) Failing after 2m57s
Update flake hashes / update-flake-hashes (pull_request) Successful in 2m50s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 14m17s
to 58c2cd02c5
Some checks failed
Check Changelog / Check for changelog (pull_request_target) Successful in 14s
Documentation / Build and Deploy Documentation (pull_request) Has been skipped
Checks / Prek / Pre-commit & Formatting (pull_request) Failing after 3m7s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 14m36s
2026-03-31 20:01:30 +00:00
Compare
stratself changed title from WIP: docs: Add performance tuning guide to docs: Add performance tuning guide 2026-03-31 20:26:02 +00:00
Author
Contributor
Preview: https://muc.muoi.me/advanced/performance.html
docs(perf): Rewrite notary tuning stuff and add section on HTTP/3
Some checks failed
Documentation / Build and Deploy Documentation (pull_request) Has been skipped
Check Changelog / Check for changelog (pull_request_target) Successful in 11s
Checks / Prek / Pre-commit & Formatting (pull_request) Failing after 3m2s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 12m52s
6ac46ced73
Some more wording fixes too
chore(docs): Add performance tuning navigation
Some checks failed
Documentation / Build and Deploy Documentation (pull_request) Has been skipped
Check Changelog / Check for changelog (pull_request_target) Successful in 11s
Checks / Prek / Pre-commit & Formatting (pull_request) Failing after 3m7s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 13m41s
6d308f1723
Henry-Hiles requested changes 2026-04-04 14:13:13 +00:00
Dismissed
@ -0,0 +10,4 @@
## Cache capacities
If you have unused memory to spare, consider increasing the `cache_capacity_modifier` value to a larger number, as to allow more data to be stored in hot memory. This would _**significantly**_ speed up many intensive operations such as state resolutions, and also results in decreased CPU usage and disk I/O. Start with a baseline of `cache_capacity_modifier = 2.0` and tune up until you find a satisfactory RAM usage.
Contributor
```diff
- This would _**significantly**_ speed
+ This *significantly* speeds
```
stratself marked this conversation as resolved
@ -0,0 +16,4 @@
## Disabling some features
You can disable outgoing **typing notifications** and **read markers** to reduce strain on the CPU and network.
Contributor

IMO it's worth mentioning incoming and local presence here too.
Author
Contributor

I did mention presence in the next paragraph or so, where I mentioned how to disable incoming/local of everything. If you think the structure can be improved lemme know how
Author
Contributor

Resolved in https://forgejo.ellis.link/continuwuation/continuwuity/pulls/1498#issuecomment-27351
stratself marked this conversation as resolved
@ -0,0 +25,4 @@
allow_outgoing_typing = false
```
Outgoing presence updates is also considered expensive and has been disabled by default(`allow_outgoing_presence = false`).
Contributor

s/is/are
s/has/have
stratself marked this conversation as resolved
@ -0,0 +102,4 @@
### Using UNIX sockets
If your homeserver and the reverse proxy lives on the same machine, you may consider exposing Continuwuity on a UNIX socket instead of a port. This would reduce TCP overhead between the two programs.
Contributor
```diff
- If your homeserver and the reverse proxy lives on the same machine, you may consider exposing Continuwuity on a UNIX socket instead of a port. This would reduce TCP overhead between the two programs.
+ If your homeserver and reverse proxy lives on the same machine, you may wish to expose Continuwuity on a UNIX socket instead of a port. This reduces TCP overhead between the two programs.
```
stratself marked this conversation as resolved
@ -0,0 +143,4 @@
### Serving .well-knowns manually
Instead of [reverse proxying .well-knowns](./delegation#serving-with-a-reverse-proxy), you can serve them directly as manual files at the reverse proxy. This could decrease _some_ network request handling for Continuwuity.
Contributor

Hmm is this worth it? Not sure. Let's get feedback from others on this.
Author
Contributor

Moved to real delegation docs at #1626
stratself marked this conversation as resolved
@ -0,0 +182,4 @@
HTTP/3 support is mostly beneficial for faster Client-Server connections, especially in browser-based applications like Element or Cinny. Continuwuity includes experimental _outbound_ HTTP/3 support in its Docker images, so connections between Continuwuity servers can benefit from this too.
### Increasing file descriptors
Contributor

Do we want to recommend touching `limits.conf` or `sysctl.conf`? I'd ask for further feedback from others on this, but personally I'd lean to not recommending this.
Author
Contributor

[Traefik compose files](https://continuwuity.org/deploying/docker#with-traefik-included) suggested it is necessary, but I removed it in #1594 in this [commit](https://forgejo.ellis.link/continuwuation/continuwuity/commit/d61ed33f54e1fdfcfb50714ed2f3921a637438f4). In normal use cases I rarely see file descriptors > 1024, but I'm a single-user instance so...

I think I'll remove it + sysctls entirely, as these have ambiguous effects, and can be discussed in Techtopic or something
stratself marked this conversation as resolved
fix: Grammar + wording for perftuning page from feedback
Some checks failed
Documentation / Build and Deploy Documentation (pull_request) Has been skipped
Check Changelog / Check for changelog (pull_request_target) Successful in 10s
Checks / Prek / Pre-commit & Formatting (pull_request) Failing after 2m52s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 12m46s
bc25223bc3
@ -0,0 +1,183 @@
# Performance tuning
While Continuwuity's default config parameters are generally optimised, additional modifications can be made to better utilise your server resources. This is especially helpful for homeservers with many users and/or are joined in many large federated rooms, and will increasingly be the case as the Matrix network expands.
Contributor

The second clause of the first sentence reads poorly. Drop "utilise" for "use" and consider rephrasing, especially since "additional modifications" doesn't really make sense here. Consider "you can adjust them to make better use of your server resources."

For the second sentence, try "This is especially helpful for homeservers that are in many large federated rooms or have many users, and it will become increasingly necessary as the Matrix network expands."
Author
Contributor

let's continue on https://forgejo.ellis.link/continuwuation/continuwuity/pulls/1498#issuecomment-27406
stratself marked this conversation as resolved
@ -0,0 +10,4 @@
## Cache capacities
If you have unused memory to spare, consider increasing the `cache_capacity_modifier` value to a larger number, as to allow more data to be stored in hot memory. This *significantly* speed up many intensive operations such as state resolutions, and also results in decreased CPU usage and disk I/O. Start with a baseline of `cache_capacity_modifier = 2.0` and tune up until you find a satisfactory RAM usage.
Contributor

Replace ", as to" with "to". Replace "speed" with "speeds". Properly fence the "such as state resolutions" clause with a preceding comma. Drop "also". Consider "and decreases CPU usage and disk I/O". "baseline" should be a single word. Consider "and tune up until you are satisfied with RAM usage", since "find a satisfactory RAM usage" reads strangely.

Replace ", as to" with "to". Replace "speed" with "speeds". Properly fence the "such as state resolutions" clause with a preceding comma. Drop "also". Consider "and decreases CPU usage and disk I/O". "baseline" should be a single word. Consider "and tune up until you are satisfied with RAM usage", since "find a satisfactory RAM usage" reads strangely.
stratself marked this conversation as resolved
@ -0,0 +12,4 @@
If you have unused memory to spare, consider increasing the `cache_capacity_modifier` value to a larger number, as to allow more data to be stored in hot memory. This *significantly* speed up many intensive operations such as state resolutions, and also results in decreased CPU usage and disk I/O. Start with a baseline of `cache_capacity_modifier = 2.0` and tune up until you find a satisfactory RAM usage.
On the other hand, if your system doesn't have a lot of RAM, consider decreasing the cache capacity modifier to something smaller than `1.0` to avoid low-memory issues (at the cost of higher load on disk/CPU). The recommendation also works if your system has very little RAM compared to the number of CPU cores, as cache capacities tend to scale according to number of cores.
Contributor

Replace "The recommendation" with "This recommendation". I'm going to need you to elaborate on the last part in order to make a recommendation for it. What constitutes a small ratio of RAM to CPU cores? I assume this scaling is linear?

Replace "The recommendation" with "This recommendation". I'm going to need you to elaborate on the last part in order to make a recommendation for it. What constitutes a small ratio of RAM to CPU cores? I assume this scaling is linear?
stratself marked this conversation as resolved
@ -0,0 +60,4 @@
### Changing the compression algorithm
For reduced CPU usage at a tradeoff of increased storage space, consider deploying Continuwuity with the faster and less intensive `lz4` algorithm instead of `zstd` for rocksdb, and disable WAL compression entirely:
Contributor

I would personally like to see data on the differences made by this change before including this. I am concerned about advising people to save a negligible amount of processing time if it makes a great difference to storage use.
Owner

not sure I'd recommend no compression. that will be huge. lz4 for both maybe.
Author
Contributor

WAL compression does not support lz4, I've been using `none` rather than zstd
@ -0,0 +68,4 @@
rocksdb_wal_compression = "none"
```
The tweak can especially be helpful if you have an older or less performant CPU (e.g. a Raspberry Pi) and disk space to spare.
Contributor

Replace "The tweak" with "This tweak".

Replace "The tweak" with "This tweak".
stratself marked this conversation as resolved
@ -0,0 +72,4 @@
### Increasing bottommost layer compression (`zstd` only)
The bottommost layer of the database usually contains old and read-only data, and hence is a suitable place for further compression. In Continuwuity, this is possible by setting `rocksdb_bottommost_compression = true` and tuning `rocksdb_bottommost_compression_level` to a more compact level than the default one used in `rocksdb_compression_level`. The tweak comes at a cost of some increased CPU usage, but would prevent your database from growing too large especially in the long run.
Contributor

Replace "and hence is" with "so it is". Replace "The tweak" with "This tweak". Consider "This tweak comes at the cost of increased CPU usage, but it may prevent your database from growing too large, especially on the long run."

Replace "and hence is" with "so it is". Replace "The tweak" with "This tweak". Consider "This tweak comes at the cost of increased CPU usage, but it may prevent your database from growing too large, especially on the long run."
stratself marked this conversation as resolved
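A hedged example of the bottommost-compression tweak discussed above (zstd only; the level value is illustrative, and the option names are assumed to match the example config):

```toml
### in continuwuity.toml ###
rocksdb_compression_algo = "zstd"

# Compress the bottommost (oldest, read-mostly) layer more aggressively
rocksdb_bottommost_compression = true
# More compact than the default used by rocksdb_compression_level
rocksdb_bottommost_compression_level = 22
```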
@ -0,0 +86,4 @@
For `lz4` users, the default level (`-1`) is already the most compact. You can only further decrease it to favor compression speed over ratio.
Consult these documentations for more information on compression tuning and levels:
Contributor

Replace "documentations" with "documents". "Documentation" does not have a plural.

Replace "documentations" with "documents". "Documentation" does not have a plural.
stratself marked this conversation as resolved
@ -0,0 +102,4 @@
### Using UNIX sockets
If your homeserver and reverse proxy lives on the same machine, you may wish to expose Continuwuity on a UNIX socket instead of a port. This reduces TCP overhead between the two programs.
Contributor

Replace "lives" with "live" since you're talking about multiple entities ("it lives" vs "they live").

Replace "lives" with "live" since you're talking about multiple entities ("it lives" vs "they live").
stratself marked this conversation as resolved
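For reference, the UNIX-socket setup described in this hunk could look like the following (option names and path are assumptions; check the example config before use):

```toml
### in continuwuity.toml ###
# Listen on a UNIX socket instead of a TCP port
unix_socket_path = "/run/continuwuity/continuwuity.sock"
unix_socket_perms = 660
```

The reverse proxy would then target the socket, e.g. in nginx: `proxy_pass http://unix:/run/continuwuity/continuwuity.sock;`.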
@ -0,0 +131,4 @@
### Tuning your trusted servers
Trusted servers are queried sequentially starting from the first entry of `trusted_servers`. If you have multiple notaries configured, put the faster ones first:
Contributor

You refer to trusted servers twice in the same sentence here. Consider "Trusted servers are queried sequentially in the order they are listed."
stratself marked this conversation as resolved
@ -0,0 +137,4 @@
trusted_servers = ["fastest.example.com","faster.example.com","matrix.org"]
```
Avoid using `matrix.org` as your primary notary, as it tends to be quite slow. If you need suggestions for trusted servers, ask in the Continuwuity main room.
Contributor

It would be better to have a list here, otherwise you're opening the main room up to people asking for notaries on a regular basis.
Author
Contributor

Depends on #1208. Status quo is to have people ask
Author
Contributor
See https://forgejo.ellis.link/continuwuation/continuwuity/pulls/1498#issuecomment-27405
stratself marked this conversation as resolved
docs(perf): Grammar/wording edits from feedback
Some checks failed
Documentation / Build and Deploy Documentation (pull_request) Has been skipped
Check Changelog / Check for changelog (pull_request_target) Successful in 10s
Checks / Prek / Pre-commit & Formatting (pull_request) Failing after 3m7s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 13m16s
1c359a0c14
Also combined the introductory paragraphs into one
nex requested changes 2026-04-05 13:16:21 +00:00
nex left a comment

Looking good, just some technical comments and phrasing suggestions

Looking good, just some technical comments and phrasing suggestions
@ -0,0 +1,181 @@
# Performance tuning
This page aims to outline various performance tweaks for Continuwuity and their effects. These adjustments are especially helpful for homeservers that are in many large federated rooms or have many users, and it will become increasingly necessary as the Matrix network expands. As always, your mileage may vary according to your setup's specifics. If you have further discussions or recommendations, please share them in the community rooms.
Owner

I feel a mention that the default configuration usually scales appropriately with your hardware so most people won't need this
Owner

"usually" being "in most standard configurations", the outliers being abnormal configurations like 64 cores + 2GiB RAM, or 512GiB RAM with a dual core Celeron

"usually" being "in most standard configurations", the outliers being abnormal configurations like 64 cores + 2GiB RAM, or 512GiB RAM with a dual core Celeron
Author
Contributor

Added example abnormal RAM configuration in the caching section. Will find a way to include the first comment.
Author
Contributor

Incorporated the first comment.
stratself marked this conversation as resolved
@ -0,0 +10,4 @@
If you have unused memory to spare, consider increasing the `cache_capacity_modifier` value to a larger number to allow more data to be stored in hot memory. This *significantly* speed up many intensive operations (such as state resolutions) and decreases CPU usage and disk I/O. Start with a baseline of `cache_capacity_modifier = 2.0` and tune up until you are satisfied with RAM usage.
On the other hand, if your system doesn't have a lot of RAM, consider decreasing the cache capacity modifier to something smaller than `1.0` to avoid low-memory issues (at the cost of higher load on disk/CPU). This recommendation also works if your system has very little RAM compared to the number of CPU cores, as cache capacities tend to scale according to number of cores.
Owner
- as cache capacities tend to scale according to number of cores.
+ as cache capacities scale according to the number of available cores.
stratself marked this conversation as resolved
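To make the thread above concrete, a minimal config sketch (assuming the `[global]` section layout of the conduwuit-lineage example config; verify the option name against your version's example config):

```toml
[global]
# Baseline suggested above: double the default cache capacities, then
# raise further until RAM usage sits where you want it.
cache_capacity_modifier = 2.0

# On RAM-starved hosts, or many cores with little RAM (caches scale
# with core count), go the other way instead:
# cache_capacity_modifier = 0.5
```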
@ -0,0 +14,4 @@
## Disabling some features
You can disable outgoing **typing notifications** and **read markers** to reduce strain on the CPU and network.
Owner

When?

- reduce strain on the CPU and network. 
+ reduce strain on the CPU and network when actively participating in rooms. 
stratself marked this conversation as resolved
@ -0,0 +42,4 @@
allow_outgoing_typing = false
allow_incoming_typing = false
# disabling presence updates entirely
Owner

Local typing and read receipts have virtually no performance impact since they're just db rows. Local presence only has an impact at all because poor spec choices mean clients will ping it every few seconds even when the user isn't present, and this cycles the sync loops of every user they share a room with.
Generally speaking, disabling local typing and read receipts will have no noticeable impact, definitely not one that outweighs the UX cost.
Author
Contributor

I will only recommend disabling all kinds of presence

nex marked this conversation as resolved
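Following the resolution above (disable presence outright, leave local typing and read receipts alone), a hedged sketch of the resulting config (option names assumed from the conduwuit-lineage config; check your version's example config):

```toml
[global]
# Presence is the one worth disabling: clients ping it every few
# seconds, cycling the sync loop of every user sharing a room with them.
allow_local_presence = false
allow_incoming_presence = false
allow_outgoing_presence = false

# Local typing/read receipts are just db rows; disabling them buys
# essentially nothing and costs UX, so they are left enabled here.
```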
@ -0,0 +53,4 @@
## Tuning database compression
:::warning
These steps MUST be done **before** starting Continuwuity for the first time, as database compressions are irreversible.
Owner
- as database compressions are irreversible.
+ as the compression algorithm cannot be changed after the database is created.
Author
Contributor

For the record, I have yet to find definite proof that this is the case. I also swapped the algo and compression level after the fact and found no problem. Maybe c10y only uses these vars on database creation.

I'll still include the line as to avoid potential footguns for other users, but do hope someone can fully confirm what's up.

Owner

https://github.com/facebook/rocksdb/wiki/Compression: the setting is applied at creation time to each SST, so you can have a collection of compression modes in use at once. When a new level for a column family is written out (rebuild or compaction), it uses the default compression setting at that time, which may be different. This gentle rebuild-over-time strategy is why you don't see an immediate benefit from a compression settings change unless you force a global compaction (which essentially just rebuilds all the CFs).
Owner

interesting. Last time I changed my compression algo with active data, i returned to fires. Might've just been a me thing though 🤷‍♀️

Author
Contributor

If that is the case, I am banking on removing the warning and adding `!admin query raw compact` as a viable instruction (I did run it in my previous algo hotswap, and it shrunk the db by a third). Or at least changing the warning to SHOULD, rather than MUST. Let me know of a good way to write this one.
Owner

https://mintlify.wiki/facebook/rocksdb/advanced/compression#per-level-compression this is a very pretty way to read their config settings

Let me know if you find anything saying it's not true or you tried it and it failed, I would love to know. I thought you could change from ex lz4 default to zstd etc it just takes "rewriting the level" to make it real.

Owner
also this is very recent and very cool https://rocksdb.org/blog/2025/10/08/parallel-compression-revamp.html
Author
Contributor

I took the liberty of retrofitting the warning popup, saying that db compression algos should be tuned without active data, due to the lack of real-world success stories. I hope this strikes a fair level of recommendation for official docs without being too incorrect.
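Summarising the thread as a hedged sketch: set the algorithm up front where possible, and force a compaction if changing it on live data (option name assumed from the conduwuit-lineage config; verify against your version's example config):

```toml
[global]
# Applied per-SST at write time; existing SSTs keep their old algorithm
# until rebuilt by compaction, so a change is not reflected immediately.
rocksdb_compression_algo = "zstd"
```

After changing the algorithm on an existing database, a global compaction (e.g. `!admin query raw compact` in the admin room, as mentioned above) rewrites the column families so the new setting actually takes effect, per the RocksDB compression wiki linked in this thread.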
@ -0,0 +100,4 @@
### Using UNIX sockets
If your homeserver and reverse proxy live on the same machine, you may wish to expose Continuwuity on a UNIX socket instead of a port. This reduces TCP overhead between the two programs.
Owner

nit: `reduces` -> `removes the` (unix sockets have zero TCP overhead since they aren't TCP)
stratself marked this conversation as resolved
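A hedged sketch of the socket setup discussed above (option names assumed from the conduwuit-lineage config; adjust the path and permissions to match your reverse proxy's user):

```toml
[global]
# Listen on a UNIX socket instead of TCP; this removes TCP overhead
# entirely when the reverse proxy runs on the same machine.
unix_socket_path = "/run/continuwuity/continuwuity.sock"
unix_socket_perms = 660
```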
@ -0,0 +135,4 @@
trusted_servers = ["fastest.example.com","faster.example.com","matrix.org"]
```
Avoid using `matrix.org` as your primary notary, as it tends to be quite slow. If you need suggestions for trusted servers, ask in the Continuwuity main room.
Owner

Mixing technical terminology ("notary") with the layman's terms ("trusted server(s)") isn't a good idea, should probably consistently choose one or the other.

Also, people shouldn't ask for suggestions, they should put in servers that they trust. The homeservers of project maintainers are likely a good hint (since they're already trusting us to write the software they're using anyway, but please also don't just tell people to trust maintainer homeservers outright), but it should probably be emphasised that trusted servers are trusted, and if they lie to you, may be able to cause irreversible harm to your deployment in one way or another.

Author
Contributor

Removed "notary". Included some actual servers that maintainers recommended over the months, and a notice to vet them. I can change the servers as needed before merging, and make the notice an admonition block for extra poppiness. Not the best solution so let me know of any changes needed

Edit: I made the trusted_server vetting into a large blue infobox

@ -0,0 +178,4 @@
Consider enabling the newer **HTTP/3** protocol for inbound connections to Continuwuity. In Caddy this is allowed by default, and you'd need to expose port :443/**udp** on your firewall.
HTTP/3 support is mostly beneficial for faster Client-Server connections, especially in browser-based applications like Element or Cinny. Continuwuity includes experimental _outbound_ HTTP/3 support in its Docker images, so connections between Continuwuity servers can benefit from this too.
Owner

Why? (include some detail on the benefits of HTTP and how it applies to matrix, something like lower latency and connection overhead or whatever the actual benefits of HTTP3 are I forget)

Author
Contributor

Included some general stuff, mostly to do with clients on unstable networks plus faster connection establishment. Mobile clients [don't support it yet](https://github.com/matrix-org/matrix-rust-sdk/issues/6343) but I don't wanna mention the current status quo too much
stratself marked this conversation as resolved
docs(perf): Various changes from feedback
Some checks failed
7cf74983d1
* Only recommend turning off all presences
* Add example trusted_servers and notice to vet them
* Add explained benefits of HTTP/3
* Some grammar nits
docs(perf): Remove section on .well-known
Some checks failed
7aa6d0305c
It is not a clear performance gain, and should be added
later in delegation.mdx
stratself force-pushed stratself/docs-perf-tuning from 7aa6d0305c to 8cf2c19f86
2026-04-08 07:47:04 +00:00
stratself force-pushed stratself/docs-perf-tuning from 8cf2c19f86 to 2de48990b1
2026-04-11 16:08:01 +00:00
Henry-Hiles requested changes 2026-04-13 14:25:39 +00:00
Dismissed
@ -0,0 +8,4 @@
## Cache capacities
If you have unused memory to spare, consider increasing the `cache_capacity_modifier` value to a larger number to allow more data to be stored in hot memory. This *significantly* speed up many intensive operations (such as state resolutions) and decreases CPU usage and disk I/O. Start with a baseline of `cache_capacity_modifier = 2.0` and tune up until you are satisfied with RAM usage.
Contributor
- If you have unused memory to spare
+ If you have memory to spare

s/speed/speeds

stratself marked this conversation as resolved
@ -0,0 +128,4 @@
### Enable HTTP/3 on your reverse proxy
Consider enabling the newer **HTTP/3** protocol for inbound connections to Continuwuity. In Caddy HTTP/3 is allowed by default, and you'd need to expose port :443/**udp** on your firewall.
Contributor

s/and you'd need to/but you must

stratself marked this conversation as resolved
stratself force-pushed stratself/docs-perf-tuning from 2de48990b1 to 4998266d17
2026-04-14 16:46:21 +00:00
Author
Contributor

Done wording fixes and rebased to main. Edited `_meta.json` to absorb changes from merged #1601.
stratself force-pushed stratself/docs-perf-tuning from 4998266d17 to d24b653af2
All checks were successful
2026-04-16 10:04:56 +00:00
docs(perf): Improve introduction wording
Some checks failed
f5341201a2
This pull request is blocked because it's outdated.
This branch is out-of-date with the base branch
Reference
continuwuation/continuwuity!1498