Releases: matrix-org/sliding-sync

v0.99.19

28 Jun 15:27

Improvements

  • Fix a critical issue which can cause undecryptable messages if the server admin drops/deletes the sliding sync proxy database. This would not only lose encryption keys in the database at that time, but would also prevent new encryption keys being delivered to clients for a potentially long period of time afterwards. Upon updating to this version of the proxy, new encryption keys will be delivered to clients correctly again. See #448 for more information.

What's Changed

  • Add grafana dashboard for SS proxy by @wrjlewis in #428
  • Protect against dropped databases by @kegsay in #449
  • Update Landing.md - add info on how to use extra_well_known_client_content in homeserver.yaml by @jessebot in #443

Full Changelog: v0.99.18...v0.99.19

v0.99.18

23 May 10:18

Improvements

  • Startup times are reduced by around 20-40% by replacing SELECT DISTINCT with more efficient Postgres queries.
  • Slow room queries (determined by the "performance degraded" message) are now significantly faster.
  • Device list updates are now stored in a dedicated table rather than serialised as CBOR in the device data table. The data is migrated on the next startup, which may take a few minutes. This significantly improves performance across the entire stack: reduced DB load, fewer open DB connections, reduced CPU usage on the pollers and reduced memory usage for API requests, as shown in the before/after Grafana graph:
    [Screenshot: before/after Grafana graph, 2024-05-23]

Bug fixes

  • Ensure the since param in sync v2 is URL encoded.
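
The fix above can be illustrated with a minimal sketch. The helper name, the base URL, the endpoint path and the timeout value are assumptions for illustration, not the proxy's actual code; the point is only that an opaque since token must be percent-escaped before it is placed in a query string.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildSyncV2URL is a hypothetical helper, not taken from the proxy.
// It builds a sync v2 request URL and lets url.Values escape the
// opaque since token.
func buildSyncV2URL(baseURL, since string) string {
	query := url.Values{}
	query.Set("timeout", "30000") // assumed long-poll timeout in milliseconds
	if since != "" {
		// Set stores the raw value; Encode percent-escapes it below.
		query.Set("since", since)
	}
	return baseURL + "/_matrix/client/v3/sync?" + query.Encode()
}

func main() {
	// A made-up token containing characters that are unsafe in a query string.
	fmt.Println(buildSyncV2URL("https://hs.example", "s72_5+x/y=&z"))
}
```

Without escaping, a token containing characters such as + or & would be misread by the homeserver, since + decodes to a space and & starts a new parameter.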

What's Changed

  • Refactor device data by @kegsay in #437
  • Optimize getting the latest events in each room by @S7evinK in #436
  • fix: urlencode since opaque string by @gnieto in #439
  • e2ee extension: ensure null is not sent when we mean [] by @kegsay in #440
  • Optimize getting the latest events by type per room by @S7evinK in #441
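
The null versus [] fix in #440 reflects a common Go pitfall: a nil slice marshals to JSON null, while an empty slice marshals to []. The sketch below shows the difference. The struct and helper names are hypothetical and the field names are borrowed from the Matrix device_lists shape; the fields actually touched by the PR may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// deviceLists is a hypothetical stand-in for part of an E2EE extension response.
type deviceLists struct {
	Changed []string `json:"changed"`
	Left    []string `json:"left"`
}

// nonNil swaps a nil slice for an empty one so it marshals as [] rather than null.
func nonNil(s []string) []string {
	if s == nil {
		return []string{}
	}
	return s
}

// marshalDeviceLists serialises the lists, guaranteeing arrays in the output.
func marshalDeviceLists(changed, left []string) (string, error) {
	b, err := json.Marshal(deviceLists{Changed: nonNil(changed), Left: nonNil(left)})
	return string(b), err
}

func main() {
	// Zero-value struct: both slices are nil and are emitted as null.
	raw, _ := json.Marshal(deviceLists{})
	fmt.Println(string(raw))

	// Normalised: nil becomes [] before marshalling.
	fixed, _ := marshalDeviceLists(nil, []string{"@alice:example.org"})
	fmt.Println(fixed)
}
```

Clients that expect an array can fail to parse null, so normalising before marshalling is a cheap safeguard.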

Full Changelog: v0.99.17...v0.99.18

v0.99.17

10 May 13:33

As of this version, docker images are now signed.

Bug fixes

  • Fix two bugs which could cause device list updates to not be delivered to the client. This could cause problems decrypting events in E2EE rooms for newly logged in users.

What's Changed

  • README: document prebuild binaries by @csett86 in #425
  • Sign releases by @kegsay in #433
  • Ensure device list updates are robust to race conditions and network failures by @kegsay in #432

Full Changelog: v0.99.16...v0.99.17

v0.99.16

26 Apr 14:18

Bug fixes

  • Implement MSC4102 which fixes a bug around threaded read receipts which could cause a room to appear as permanently unread.
  • Fixed a bug where Sentry-reported values would be incorrect under high concurrency.
  • Fixed an issue which could cause duplicate account data events to be sent.

Improvements

  • Ensure clients cannot send maliciously large values for timeline_limit.
  • Improved OTLP spans.
  • Fixed race conditions in some tests.
  • Filter out MSC4115 fields.
  • Clean the state snapshots table every hour. When combined with AUTOVACUUM and a VACUUM FULL, this should reduce database sizes significantly (to between 10% and 30% of the original size).
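
Capping timeline_limit, as described in the first item of this list, amounts to clamping a client-supplied number on the server. The following is a minimal sketch; the function name and the cap of 50 are assumptions for illustration, and the value the proxy enforces may differ.

```go
package main

import "fmt"

// maxTimelineLimit is an assumed cap chosen for illustration only.
const maxTimelineLimit = 50

// clampTimelineLimit forces a client-supplied limit into [0, maxTimelineLimit].
func clampTimelineLimit(requested int64) int64 {
	switch {
	case requested < 0:
		return 0
	case requested > maxTimelineLimit:
		return maxTimelineLimit
	default:
		return requested
	}
}

func main() {
	for _, n := range []int64{-1, 10, 1_000_000} {
		fmt.Println(n, "->", clampTimelineLimit(n))
	}
}
```

Clamping rather than rejecting the request keeps well-behaved clients working while bounding the work a single request can cause.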

What's Changed

  • Implement MSC4102 by @kegsay in #401
  • Add a sensible timeline_limit cap by @kegsay in #402
  • Add more spans for live updates; change when we early return from buffered events by @kegsay in #405
  • bugfix: ensure sentry username/id values are correct by @kegsay in #406
  • Fixes the spacing for the ENV list in README.md doc by @maxkratz in #395
  • CGO disabled to allow running on older distros by @csett86 in #393
  • Fix race conditions in tests by @kegsay in #410
  • bugfix: don't send dupe room account data by @kegsay in #418
  • Remove MSC4115 unsigned.membership field by @kegsay in #423
  • Clean the syncv3_snapshots table periodically by @kegsay in #422
  • Check for unstable prefix in MSC4115 by @kegsay in #424

Full Changelog: v0.99.15...v0.99.16

v0.99.15

22 Jan 12:32

Bug fixes

  • Fixed an issue which would cause metadata about a space child room to be erroneously sent to active connections, even if they weren't in the room. This did not leak any information that the client was not already authorised to see via the /hierarchy endpoint. This would appear in Element X as a new room in the room list with a room name and avatar, but no timeline history. Restarting Element X would remove the room.
  • Fixed an issue where the avatar field would be set to the other user even if the room was not a DM room.

What's Changed

  • bugfix: ensure metadata about space children doesn't leak to active connections by @kegsay in #392
  • Only emit avatars when DM room is set by @kegsay in #394

Full Changelog: v0.99.14...v0.99.15

v0.99.14

08 Jan 10:43

Bug fixes

  • Fixed a bug which prevented fallback key consumption from being correctly relayed to clients. For some clients (Element X) this would also cause a fallback key to not be uploaded at all, resulting in "unable to decrypt" errors if all one-time keys were consumed, e.g. by being offline for a long time. (#390)
  • Fixed a bug which prevented Unix sockets from working correctly. Contributed by @nisbet-hubbard in #385.

Full Changelog: v0.99.13...v0.99.14

v0.99.13

04 Dec 11:27
e3b110e

New features

  • Add the ability to connect to the upstream homeserver and listen for sliding sync connections over Unix sockets. (By @cyberb, #378)
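
Listening on a Unix socket in Go differs from TCP only in the network name passed to net.Listen. The sketch below is not the proxy's implementation: the helper names are hypothetical, and treating an address that starts with / as a socket path is an assumed convention for this example.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"os"
	"strings"
)

// networkFor guesses the listener network from the address form.
// Assumed convention for this sketch: absolute paths mean a Unix socket.
func networkFor(addr string) string {
	if strings.HasPrefix(addr, "/") {
		return "unix"
	}
	return "tcp"
}

// serve listens on addr (TCP host:port or Unix socket path) and serves h.
func serve(addr string, h http.Handler) error {
	network := networkFor(addr)
	if network == "unix" {
		// Remove a stale socket file left behind by a previous run.
		if err := os.Remove(addr); err != nil && !os.IsNotExist(err) {
			return err
		}
	}
	ln, err := net.Listen(network, addr)
	if err != nil {
		return err
	}
	defer ln.Close()
	return http.Serve(ln, h)
}

func main() {
	// Only the address classification is exercised here; serve blocks until failure.
	fmt.Println(networkFor("/run/sliding-sync.sock"), networkFor("0.0.0.0:8008"))
}
```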

Bugfixes

  • Fix a bug introduced in v0.99.3 (#101) where sliding sync connections could be unintentionally deleted after another connection expires. (#382)
  • Fix a performance issue introduced in v0.99.2 (#71) where handling a large gappy state block could block polling for several minutes. (#363, #373, #377, #380)

Full Changelog: v0.99.12...v0.99.13

v0.99.12

09 Nov 17:13

Docker images available at: https://github.com/matrix-org/sliding-sync/pkgs/container/sliding-sync

Bug fixes

  • If a user had specific membership transitions (e.g. invite -> join or join -> leave -> join) within a single timeline response (as determined by timeline_limit), the server could return an incorrect timeline, omitting events they had permission to see and returning a prev_batch token which would not recover those events.
  • An issue where rooms were not processed correctly by the proxy has been fixed and a migration has been applied in this version to recover those rooms. Affected rooms would not be visible to clients, or may be present as "stuck" invites which cannot be accepted/rejected.
  • Fixed a bug which would expose pre-emptive bans of a user in a room to that user.
  • Fixed a concurrent map r/w panic.
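
For context on the last item: Go maps are not safe for concurrent use, and the runtime aborts the program when it detects simultaneous reads and writes. The usual remedy is to guard the map with a mutex, as in this generic sketch. The type and method names are hypothetical and do not come from the proxy.

```go
package main

import (
	"fmt"
	"sync"
)

// roomCounts is a hypothetical map wrapper that is safe for concurrent use.
type roomCounts struct {
	mu     sync.RWMutex
	counts map[string]int
}

func newRoomCounts() *roomCounts {
	return &roomCounts{counts: make(map[string]int)}
}

// Inc takes the write lock because it mutates the map.
func (r *roomCounts) Inc(roomID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.counts[roomID]++
}

// Get takes only the read lock, so many readers can proceed in parallel.
func (r *roomCounts) Get(roomID string) int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.counts[roomID]
}

func main() {
	rc := newRoomCounts()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			rc.Inc("!room:example.org")
		}()
	}
	wg.Wait()
	fmt.Println(rc.Get("!room:example.org"))
}
```

Running tests with the -race flag is the standard way to catch this class of bug before it panics in production.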

Improvements

  • When a connection is destroyed, e.g. because the access_token used is no longer valid, immediately time out the sync request rather than waiting for the ?timeout= to elapse.
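
The behaviour described above can be sketched as a long-poll wait that also watches a "destroyed" signal. This is an illustration under assumptions, not the proxy's code: the function and channel names are hypothetical, and a real implementation would more likely carry the signal in a context.Context.

```go
package main

import (
	"fmt"
	"time"
)

// waitForUpdate blocks until an update arrives, the connection is destroyed,
// or the client-requested timeout elapses, whichever happens first.
func waitForUpdate(updates <-chan string, destroyed <-chan struct{}, timeout time.Duration) string {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	select {
	case u := <-updates:
		return "update: " + u
	case <-destroyed:
		// Return at once instead of holding the request open until the timer fires.
		return "destroyed"
	case <-timer.C:
		return "timeout"
	}
}

func main() {
	destroyed := make(chan struct{})
	close(destroyed) // simulate the access token being invalidated
	fmt.Println(waitForUpdate(nil, destroyed, 30*time.Second))
}
```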

Full Changelog: v0.99.11...v0.99.12

v0.99.11

11 Oct 15:35

Docker images available at: https://github.com/matrix-org/sliding-sync/pkgs/container/sliding-sync

Features

  • Add optional SYNCV3_DB_IDLE_TIMEOUT_SECS. On shared infrastructure where database connections are at a premium, a low idle timeout can be used to reap connections rapidly. On dedicated infrastructure where the database is just used for sliding sync, a high idle timeout can slightly improve performance when communicating with the database.
  • Room heroes can now be requested by specifying include_heroes: true. See MSC3575 for precise notation.
  • Add optional SYNCV3_HTTP_TIMEOUT_SECS and associated SYNCV3_HTTP_INITIAL_TIMEOUT_SECS, which configure the timeouts for sync v2 pollers. A high timeout is essential for large accounts, where initial sync v2 requests may take tens of minutes. A lower timeout can be beneficial when network connectivity is interrupted: with a large timeout (e.g. 30 minutes) it can take up to 30 minutes for the proxy to realise that a sync v2 request has failed, during which time the device receives no data. The proxy defaults to 30 minutes for the initial request and 5 minutes otherwise. If network connectivity between the proxy and upstream homeserver is particularly poor, you may want to lower SYNCV3_HTTP_TIMEOUT_SECS.
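
As a rough illustration of how settings like these are typically consumed, the sketch below reads a value in seconds from the environment and falls back to the defaults stated above (30 minutes initial, 5 minutes otherwise). The helper name and the handling of invalid values are assumptions; the proxy's own parsing may behave differently.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// secsOrDefault parses raw as a positive number of seconds.
// Empty, non-numeric or non-positive input falls back to def (an assumed policy).
func secsOrDefault(raw string, def time.Duration) time.Duration {
	secs, err := strconv.Atoi(raw)
	if err != nil || secs <= 0 {
		return def
	}
	return time.Duration(secs) * time.Second
}

func main() {
	// Defaults are the ones given in the release notes.
	initial := secsOrDefault(os.Getenv("SYNCV3_HTTP_INITIAL_TIMEOUT_SECS"), 30*time.Minute)
	regular := secsOrDefault(os.Getenv("SYNCV3_HTTP_TIMEOUT_SECS"), 5*time.Minute)
	fmt.Println("initial:", initial, "regular:", regular)
}
```

For example, setting SYNCV3_HTTP_TIMEOUT_SECS=120 on a flaky link would bound the time to notice a dead sync v2 request to about two minutes.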

Bug fixes

  • An issue where leave events could be received multiple times has been fixed.
  • Redactions now invalidate calculated data, e.g. redacting a room name automatically resends room.name with the newly calculated name.
  • Fixed an issue when used with OIDC servers which could cause short-lived tokens to not start a poller, due to caching failures.
  • Fixed an issue whereby a bad state block could permanently stop a device poller from making forward progress.
  • When the proxy receives a limited sync, the timelines returned to sliding sync clients are now bounded so that they do not span the resulting gap. This ensures that the client can fetch the missed events and doesn't "skip over" them, which presents as a chunk of missing messages.
  • Stop pollers if the access token keeps returning errors for at least 50 minutes. Retrying forever just adds noise and load to the pollers.
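
The give-up rule in the last item can be sketched as a small tracker that remembers when the current run of failures began. The type and function names are hypothetical; only the 50-minute threshold comes from the notes above, and whether the boundary is inclusive is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// giveUpAfter is the threshold quoted in the release notes.
const giveUpAfter = 50 * time.Minute

// failureTracker records when the current streak of failures started.
type failureTracker struct {
	firstFailure time.Time // zero value means the last request succeeded
}

// RecordSuccess ends the current failure streak.
func (f *failureTracker) RecordSuccess() { f.firstFailure = time.Time{} }

// RecordFailure starts a streak if one is not already in progress.
func (f *failureTracker) RecordFailure(now time.Time) {
	if f.firstFailure.IsZero() {
		f.firstFailure = now
	}
}

// ShouldGiveUp reports whether failures have persisted for giveUpAfter or longer.
func (f *failureTracker) ShouldGiveUp(now time.Time) bool {
	return !f.firstFailure.IsZero() && now.Sub(f.firstFailure) >= giveUpAfter
}

// simulate fails once a minute, from minute 0 to the given minute inclusive,
// and reports whether the poller would give up at the end.
func simulate(minutes int) bool {
	start := time.Date(2023, 10, 11, 0, 0, 0, 0, time.UTC)
	var tracker failureTracker
	now := start
	for i := 0; i <= minutes; i++ {
		now = start.Add(time.Duration(i) * time.Minute)
		tracker.RecordFailure(now)
	}
	return tracker.ShouldGiveUp(now)
}

func main() {
	fmt.Println(simulate(49), simulate(50))
}
```

Measuring elapsed time since the first failure, rather than counting retries, keeps the rule independent of whatever backoff schedule the poller uses.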

Improvements

  • Sync v2 requests no longer request changes to presence. This should reduce load on the upstream homeserver.
  • If SYNCV3_SERVER cannot be reached at startup, log a warning. This should help to reduce configuration errors.
  • Removed temporary state block metrics. As a high cardinality metric, these could adversely affect /metrics collectors under heavy load: /metrics responses could become extremely large, enough for the collector to refuse to process them, causing gaps in metrics collection.

What's Changed

  • Filter out presence events from /sync requests by @S7evinK in #292
  • Fix duplicate leave events by @S7evinK in #297
  • Better propagate request context by @sandhose in #301
  • Invalidate the global cache after a redaction by @DMRobertson in #296
  • bugfix: ensure that we always call EnsurePolling for expired tokens by @kegsay in #302
  • Add SYNCV3_DB_IDLE_TIMEOUT_SECS by @kegsay in #303
  • Remove per-device token migration added in #89 by @DMRobertson in #309
  • Add posibility to request room heroes by @S7evinK in #304
  • Make HTTP timeout configurable by @S7evinK in #308
  • bugfix: don't wedge pollers when they get bad state blocks by @kegsay in #299
  • Don't load events when there's a gap between known events by @DMRobertson in #300
  • Reduce mutex contention on callbacks by @kegsay in #311
  • Log a warning if the homeserver is unreachable at startup by @DMRobertson in #314
  • Update the poller metrics after a poller was started by @S7evinK in #312
  • Return a dummy /versions response in tests by @DMRobertson in #316
  • Disable constantly failing cypress tests by @DMRobertson in #317
  • Remove temporary metric by @kegsay in #322
  • Remove RequestedLatestEvents field from UserRoomData by @DMRobertson in #324
  • bugfix: give up polling if the /sync response keeps erroring for >50min by @kegsay in #326
  • Fix race condition in test by @kegsay in #337
  • bugfix: actually use the provided mock value in tests by @kegsay in #338
  • Build and attach {amd,arm}64 binaries to releases by @DMRobertson in #339

Full Changelog: v0.99.10...v0.99.11

v0.99.10

07 Sep 13:17

Bug fixes:

  • bugfix: set unsigned.redacted_because field on redaction by @kegsay in #291

Improvements:

  • set_presence=offline to reduce load on upstream homeservers by @S7evinK in #290
  • Add more tracing and Prometheus metrics by @S7evinK in #289

Full Changelog: v0.99.9...v0.99.10