'limited' flag is set on sync v3 room joins #839

Open
opened 2025-05-25 00:53:34 +00:00 by Jade · 8 comments
Owner

https://forgejo.ellis.link/continuwuation/continuwuity/src/branch/main/src/api/client/sync/v3.rs#L873-L878

Here, joined_since_last_sync is true if the user is joining the room for the "first time" from the point of view of this sync (e.g., on an initial sync, or when the join happened after the since token of the previous sync). The specification describes the limited flag as indicating that the server has omitted some events from the current batch because a large number of events occurred since the last sync, creating a "gap" that the client can backfill.

Setting limited to true just because joined_since_last_sync is true might be a misinterpretation, given that clients already have a pagination token here.
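
To make the two readings concrete, here is a minimal sketch in Rust. It is not the code at the linked lines of v3.rs: the struct, its field names other than joined_since_last_sync, and both function names are assumptions made up for illustration.

```rust
/// Hypothetical inputs to the decision. Apart from `joined_since_last_sync`,
/// these names are illustrative and are not taken from continuwuity.
struct TimelineInput {
    /// True on an initial sync, or if the user joined after the `since` token.
    joined_since_last_sync: bool,
    /// Number of timeline events that occurred since the `since` token.
    events_since_last_sync: usize,
    /// Maximum number of timeline events returned per room (assumed).
    timeline_limit: usize,
}

/// The behaviour as described in this issue: a fresh join alone sets `limited`.
fn limited_as_described(input: &TimelineInput) -> bool {
    input.joined_since_last_sync || input.events_since_last_sync > input.timeline_limit
}

/// A narrower reading: `limited` only when events were actually omitted.
fn limited_strict(input: &TimelineInput) -> bool {
    input.events_since_last_sync > input.timeline_limit
}
```

The two functions differ only for a fresh join with few or no new events, which is the case this issue asks about.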

Investigate client behaviour?

Related to #779 ?

Owner

Even if this is related to 799, I'm not sure fixing this would fix that too: 799's issue is that the room state never makes it down sync. Definitely worth looking into, though.

Owner

Okay, what I'm gathering is that our sync watcher is just a bit borky overall.

Owner

To add some additional context, my main server (nexy7574.co.uk) has been using near 100% of its allocated CPU for several months straight now. I assumed this was just due to being in far too many rooms, and a consequence of federation traffic. Today, I took a poke at my reverse proxy logs to see if it was someone deliberately spamming me, and yes, it was someone deliberately spamming me. It was me, spamming /sync 30 times a second, per session, across about 5 sessions.

Things I noticed while fixing:

  • In one of the rooms, a user stuck in a syncspam loop had been state reset out of the room, even on the origin server. I had to manually re-issue their membership state event as leave for it to stop spamming sync.
  • There was a room that was itself the victim of some particularly bad state resets. I dispatched a /leave from the user that was stuck in it; however, that leave doesn't appear to have been reflected in the room. It's possible that the state assumed the sender was never in there.
  • Other users on the same server are in these problematic rooms with no problems.

I suspect the server is using stale membership events to determine whether a user is in a room, then realising while generating the sync payload that, based on the current state, they aren't actually in it, and then just getting confused and returning a limited timeline. This sounds somewhat related to the aforementioned joined_since_last_sync.

Owner
{
  "next_batch": "123",
  "join": {
    "!bad-room:nexy7574.co.uk": {
      "timeline": {
        "limited": true,
        "events": []
      }
    }
  }
}

This is what each sync response looked like, for future reference. The next_batch never actually incremented (apart from when expected)
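
As a rough illustration of how such a response could be recognised, here is a minimal Rust sketch. The types and the `is_stuck` helper are hypothetical, are not part of continuwuity or any client SDK, and model only the fields shown in the response above.

```rust
/// Minimal, hypothetical view of one joined room's timeline.
struct RoomTimeline {
    limited: bool,
    event_count: usize,
}

/// Minimal, hypothetical view of a sync response.
struct SyncResponse {
    next_batch: String,
    joined: Vec<RoomTimeline>,
}

/// Returns true if the response made no progress: the token did not advance
/// past `since`, and at least one joined room has a limited timeline with no
/// events in it.
fn is_stuck(since: &str, resp: &SyncResponse) -> bool {
    resp.next_batch == since
        && resp.joined.iter().any(|room| room.limited && room.event_count == 0)
}
```

A client that syncs again immediately after a response like this will loop at whatever rate the server can answer, which matches the request rate seen in the proxy logs.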

Owner

I also had this issue with some matrix dot org room. I used the force-leave-room command to fix it, and the room with limited: true and no events no longer appears in my Element sync replies.

Owner

I think that this is correct and spec-compliant behavior. Synapse sets limited on new room joins, and a newly joined room will necessarily have a limited timeline because there's a large gap between the m.room.create event and the m.room.member event of the joining member.

Owner

@ginger wrote in #839 (comment):

I think that this is correct and spec-compliant behavior. Synapse sets limited on new room joins, and a newly joined room will necessarily have a limited timeline because there's a large gap between the m.room.create event and the m.room.member event of the joining member.

It is, when the limited timeline is only sent down sync once. The effect of this issue in particular is that each call to /sync returns this same limited timeline, which results in sync rates in the kHz range for really eager clients.

Owner

@nex wrote in #839 (comment):

@ginger wrote in #839 (comment):

I think that this is correct and spec-compliant behavior. Synapse sets limited on new room joins, and a newly joined room will necessarily have a limited timeline because there's a large gap between the m.room.create event and the m.room.member event of the joining member.

It is, when the limited timeline is only sent down sync once. The effect of this issue in particular is that each call to /sync returns this same limited timeline, which results in sync rates in the kHz range for really eager clients.

I believe the expected behavior here is for clients to trigger a backfill through /messages. If clients aren't doing that we could start automatically backfilling after joining a room.
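
For reference, a minimal Rust sketch of the request a client would build for that backfill, assuming it pages backwards from the room timeline's prev_batch token. The helper names and the example token are made up for illustration; the endpoint path and the from, dir and limit query parameters are those of the client-server API's /messages endpoint.

```rust
/// Percent-encodes every byte except RFC 3986 unreserved characters.
fn percent_encode(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for byte in input.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Builds the path and query for a backwards `/messages` request that starts
/// at the timeline's `prev_batch` token. Hypothetical helper.
fn messages_backfill_path(room_id: &str, prev_batch: &str, limit: u32) -> String {
    format!(
        "/_matrix/client/v3/rooms/{}/messages?from={}&dir=b&limit={}",
        percent_encode(room_id),
        percent_encode(prev_batch),
        limit
    )
}
```

For example, `messages_backfill_path("!bad-room:nexy7574.co.uk", "t1-2", 20)` yields `/_matrix/client/v3/rooms/%21bad-room%3Anexy7574.co.uk/messages?from=t1-2&dir=b&limit=20`, where `t1-2` stands in for whatever token the server returned.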
