fix: do not send bot user-agent to APIs #1581

Open
31a wants to merge 1 commit from 31a/continuwuity:pr-useragent into main
Contributor

currently continuwuity sends (bot; +https://continuwuity.org) in its user-agent in every request (except URL previews). this can lead to protocol requests (especially those for /.well-known/) being blocked by spam filters, and sending it in protocol requests serves no purpose.

this PR changes the user-agent back to continuwuity/version for all API requests (which is everything excluding external-redirect media downloads and URL previews)

this PR also updates the documentation to show the correct value for the default user-agent for URL previews.

Fixes: #1580
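
A minimal sketch of the split described above, assuming the user-agent strings are cached in `OnceLock` statics in the same style as the `user_agent_media()` accessor shown in the review hunks. `WEBSITE`, `name()` and `version_ua()` are stand-ins for the project's real helpers, the version string is a placeholder, and the `user_agent_bot()` accessor name is hypothetical; which call sites use which accessor follows the description, not the actual diff.

```rust
use std::sync::OnceLock;

// Stand-ins for the project's real constant and helpers (assumed values).
const WEBSITE: &str = "https://continuwuity.org";
fn name() -> &'static str { "continuwuity" }
fn version_ua() -> &'static str { "0.0.0" } // placeholder, not a real version

static USER_AGENT: OnceLock<String> = OnceLock::new();
static USER_AGENT_MEDIA: OnceLock<String> = OnceLock::new();
static USER_AGENT_BOT: OnceLock<String> = OnceLock::new();

/// Plain user-agent for API (protocol) requests, e.g. `/.well-known/` lookups.
pub fn user_agent() -> &'static str { USER_AGENT.get_or_init(init_user_agent) }

/// User-agent for URL previews.
pub fn user_agent_media() -> &'static str { USER_AGENT_MEDIA.get_or_init(init_user_agent_media) }

/// User-agent for external-redirect media downloads (hypothetical accessor name).
pub fn user_agent_bot() -> &'static str { USER_AGENT_BOT.get_or_init(init_user_agent_bot) }

// No "(bot; ...)" suffix on protocol traffic.
fn init_user_agent() -> String { format!("{}/{}", name(), version_ua()) }

fn init_user_agent_media() -> String {
    format!("{}/{} (embedbot; facebookexternalhit/1.1; +{WEBSITE})", name(), version_ua())
}

fn init_user_agent_bot() -> String { format!("{}/{} (bot; +{WEBSITE})", name(), version_ua()) }
```

Each string is built once on first use and then handed out as a `&'static str`, so choosing a different user-agent per request type adds no per-request formatting cost.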

Pull request checklist:

  • [x] This pull request targets the main branch, and the branch is named something other than main.
  • [x] I have written an appropriate pull request title and my description is clear.
  • [x] I understand I am responsible for the contents of this pull request.
  • I have followed the contributing guidelines:
      • [x] My contribution follows the code style, if applicable.
      • [x] I ran pre-commit checks before opening/drafting this pull request.
      • [x] I have tested my contribution (or proof-read it for documentation-only changes) myself, if applicable. This includes ensuring code compiles.
      • [x] My commit messages follow the commit message format and are descriptive.
      • [ ] I have written a news fragment for this PR, if applicable.
fix: do not send bot user-agent to APIs
All checks were successful
Check Changelog / Check for changelog (pull_request_target) Successful in 10s
Documentation / Build and Deploy Documentation (pull_request) Has been skipped
Checks / Prek / Pre-commit & Formatting (pull_request) Successful in 3m2s
Checks / Prek / Clippy and Cargo Tests (pull_request) Successful in 19m46s
43ce518050
@@ -38,3 +42,4 @@
format!("{}/{} (embedbot; facebookexternalhit/1.1; +{WEBSITE})", name(), version_ua())
}
fn init_user_agent_bot() -> String { format!("{}/{} (bot; +{WEBSITE})", name(), version_ua()) }
Owner

What - URL previews already have their own UA directly above this

Author
Contributor

but external media queries do not. whether downloading random images directly with a user-agent for previews will break anything depends on how exactly facebook uses facebookexternalhit, but 31a decided to just not change the current behavior

Owner

External media is just a federation endpoint.


Please add a changelog fragment to changelog.d/ describing your changes.

@@ -32,4 +36,3 @@
#[inline]
pub fn user_agent_media() -> &'static str { USER_AGENT_MEDIA.get_or_init(init_user_agent_media) }
fn init_user_agent() -> String { format!("{}/{} (bot; +{WEBSITE})", name(), version_ua()) }
Owner

Is there a reason to not include the website?

Owner

> this can lead to protocol requests [...] being blocked by spam filters

This is not a reason to change metadata - remotes should be configured to not block legitimate traffic.

Author
Contributor

@nex wrote in #1581 (comment):

> > this can lead to protocol requests [...] being blocked by spam filters
>
> This is not a reason to change metadata - remotes should be configured to not block legitimate traffic.

31a actually agrees with that, but everyone else seems not to

First-time contributor

@nex
There are currently no server implementations with the string "bot" in their user-agent, and this is also not the common web convention. Furthermore, I'm not sure what bot would use S2S, so what makes the traffic legitimate in that case?

Owner

I'm not sure why other servers are relevant here? Automated programs utilising "bot" is also a common web convention. The bot is the server.

Owner

bot means automated, which generally describes all federation traffic

First-time contributor

That becomes relevant if a user wants to filter bots for whatever reason: Continuwuity would be the only server implementation filtered in that case. Furthermore, I have never seen a server described as a "bot" either.

Owner

The server is literally the bot, that is what the server is, an automated program, potentially making autonomous requests. Aka a bot.
Again, remotes should not filter legitimate traffic. This is not a continuwuity problem.
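
For illustration only, the kind of remote-side rule the thread is arguing about can be sketched as a substring match on the user-agent. The `naive_bot_filter_blocks` function is hypothetical and does not describe any particular spam filter; real filters vary widely and this sketch only shows why a `(bot; ...)` suffix can trip a simple rule while a bare `continuwuity/version` string does not.

```rust
/// Hypothetical filter rule: block any request whose user-agent contains
/// "bot", ignoring ASCII case. Not taken from any real spam filter.
fn naive_bot_filter_blocks(user_agent: &str) -> bool {
    user_agent.to_ascii_lowercase().contains("bot")
}
```

Under this assumed rule, `continuwuity/0.0.0 (bot; +https://continuwuity.org)` is blocked and `continuwuity/0.0.0` passes; the URL preview user-agent is blocked too, because `embedbot` also contains `bot`.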

This pull request is blocked because it's outdated.
This branch is out-of-date with the base branch
5 participants
Reference
continuwuation/continuwuity!1581