
PUT requests timeout in network mode - gateway never sends PutResponse #2035

@sanity


PUT Request Timeout in Network Mode - Room Creation Fails

Summary

River room creation fails in freenet network mode because PUT requests sent to the gateway never receive responses, causing rooms to remain in "Subscribing" status indefinitely. This blocks all subsequent updates and prevents message synchronization between users.

Environment

  • Freenet version: 0.1.36
  • River UI build: 2025-10-29T22:52:21Z
  • Mode: freenet network
  • Node type: Non-gateway peer (single connection to gateway)
  • Gateway: v6MWKgqHiBMNcGtG (@ 0.923911415153386)

Reproduction Steps

  1. Start freenet in network mode: freenet network
  2. Open River UI in browser: http://127.0.0.1:7509/v1/contract/web/raAqMhMG7KUpXBU2SxgCQ3Vh4PYjttxdSWd9ftV7RLv/
  3. Create a new room (e.g., "TestRoom")
  4. Observe that room appears locally but status remains "Subscribing"
  5. Send a message - observe it doesn't sync to other users
  6. Open River in second browser window - observe the room is not available

Expected Behavior

  1. User creates room
  2. River sends PUT request to freenet
  3. Freenet stores contract and returns PutResponse
  4. River receives response, marks room as "Subscribed"
  5. Updates and messages can now be sent via UPDATE requests
  6. Other users can GET the room and subscribe
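The expected flow above amounts to a small state machine on the room's sync status. A minimal sketch follows, assuming hypothetical names (`SyncStatus`, `SyncEvent`, `next_status`, `can_send_update`); only the "Subscribing" and "Subscribed" states come from this report, and River's actual types may differ.

```rust
/// Hypothetical sync states for a room. Only `Subscribing` and `Subscribed`
/// are named in the report; `Disconnected` is an illustrative starting state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncStatus {
    Disconnected,
    Subscribing,
    Subscribed,
}

/// Hypothetical events that drive the status forward.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SyncEvent {
    PutSent,
    PutResponseReceived,
}

/// Pure transition function: the room only becomes `Subscribed`
/// once a PutResponse arrives while it is `Subscribing`.
pub fn next_status(current: SyncStatus, event: SyncEvent) -> SyncStatus {
    match (current, event) {
        (SyncStatus::Disconnected, SyncEvent::PutSent) => SyncStatus::Subscribing,
        (SyncStatus::Subscribing, SyncEvent::PutResponseReceived) => SyncStatus::Subscribed,
        // Any other combination leaves the status unchanged.
        (other, _) => other,
    }
}

/// Updates are only allowed once the room is subscribed.
pub fn can_send_update(status: SyncStatus) -> bool {
    status == SyncStatus::Subscribed
}
```

Under this model, a PutResponse that never arrives leaves the room in `Subscribing` with no other exit, which matches the stuck status reported here.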

Actual Behavior

  1. ✅ User creates room successfully (local UI works)
  2. ✅ River sends PUT request to freenet
  3. ❌ Gateway never sends PutResponse (the request is handed to the gateway connection, but no reply ever arrives)
  4. ❌ Room stuck in "Subscribing" status indefinitely
  5. ❌ All updates blocked: "Room doesn't need update - not subscribed (status: Subscribing)"
  6. ❌ Other users cannot see the room (contract not in network)

Evidence from Logs

River UI Console Logs

USER1: Sending PutRequest for room MemberId(6QN6RFMM) with contract ID: 6NTyUbWKxtNPDYzM3t5cmJkPUa8KK24Ds94fpnTHdbcM
USER1: Sent PutRequest for room MemberId(6QN6RFMM)
USER1: Room MemberId(6QN6RFMM) - sync status: Subscribing, has last synced: true, states match: false
USER1: Room MemberId(6QN6RFMM) doesn't need update - not subscribed (status: Subscribing)

The room never transitions from "Subscribing" to "Subscribed".

Freenet Node Logs

PUT Request Sent Successfully:

2025-10-29T22:52:21.838659Z INFO freenet::node::request_router: Created new operation - starting network request, transaction: 01K8S2R5YEEPPDV9JCJE60ANG1, resource: Put { key: ContractKey { instance: ContractInstanceId("6NTyUbWKxtNPDYzM3t5cmJkPUa8KK24Ds94fpnTHdbcM"), ...
2025-10-29T22:52:21.884304Z INFO freenet::node::network_bridge::p2p_protoc: Sending outbound message to peer, tx: 01K8S2R5YEEPPDV9JCJE60ANG1, msg_type: Message {RequestPut(id: 01K8S2R5YEEPPDV9JCJE60ANG1)}, target_peer: v6MWKgqHiBMNcGtG (@ 0.923911415153386)
2025-10-29T22:52:21.884335Z INFO freenet::node::network_bridge::p2p_protoc: Message successfully sent to peer connection, tx: 01K8S2R5YEEPPDV9JCJE60ANG1, target_peer: v6MWKgqHiBMNcGtG (@ 0.923911415153386)

No Response Ever Received:

$ grep -i "ReturnPut\|PutResponse\|01K8S2R5YEEPPDV9JCJE60ANG1" /tmp/freenet.log | grep -v "RequestPut"
# ... only shows RequestPut being sent, NO ReturnPut or PutResponse
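The same check can be run across every transaction in a log rather than one ID at a time. The sketch below pairs `RequestPut(id: ...)` entries with `ReturnPut(id: ...)` entries and lists the IDs that never got a reply. The `RequestPut(id: ...)` format is taken from the log lines above; the `ReturnPut(id: ...)` format is an assumption, since no such line appears in this report, and `unanswered_puts` is a hypothetical helper name.

```rust
use std::collections::BTreeSet;

/// Returns the text between `marker` and the next ')' on the line, if any.
fn extract_id<'a>(line: &'a str, marker: &str) -> Option<&'a str> {
    let start = line.find(marker)? + marker.len();
    let rest = &line[start..];
    let end = rest.find(')')?;
    Some(rest[..end].trim())
}

/// Lists transaction IDs that have a RequestPut entry but no ReturnPut entry.
/// ASSUMPTION: replies are logged as `ReturnPut(id: <tx>)`, mirroring the
/// `RequestPut(id: <tx>)` format seen in the node log.
pub fn unanswered_puts(log: &str) -> Vec<String> {
    let mut requested = BTreeSet::new();
    let mut answered = BTreeSet::new();
    for line in log.lines() {
        if let Some(id) = extract_id(line, "RequestPut(id:") {
            requested.insert(id.to_string());
        }
        if let Some(id) = extract_id(line, "ReturnPut(id:") {
            answered.insert(id.to_string());
        }
    }
    // BTreeSet keeps the result sorted, so the output order is stable.
    requested.difference(&answered).cloned().collect()
}
```

Feeding it the contents of /tmp/freenet.log (for example via `std::fs::read_to_string`) would show whether transaction 01K8S2R5YEEPPDV9JCJE60ANG1 is the only unanswered PUT or one of many.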

Contract Missing Later:

2025-10-29T22:52:34.549641Z ERROR freenet::client_events: update query failed: missing contract: 6NTyUbWKxtNPDYzM3t5cmJkPUa8KK24Ds94fpnTHdbcM, key: 6NTyUbWKxtNPDYzM3t5cmJkPUa8KK24Ds94fpnTHdbcM

Network Topology

current_connections: 1, is_gateway: false, pending_adds: 24

The node has:

  • Only 1 active connection (to gateway v6MWKgqHiBMNcGtG)
  • 24 pending connection attempts that never complete
  • Essentially isolated on the network

Related Issues and Recent Fixes

Similar Issues This Week

This appears related to other PUT/GET timeout issues addressed this week:

  1. PR #2011: "fix: cache contract state locally before forwarding client-initiated PUT" (commit 5734a33f)

    • Status: Merged into v0.1.36
    • Purpose: Cache state locally BEFORE forwarding PUT to ensure publishing node has immediate access
    • Observation: Despite this fix, PutResponse still doesn't arrive from gateway
  2. PR #2020: "fix: make get operation resilient to local caching failures" (commit 7929dfe6)

    • Status: Merged into v0.1.36
    • Purpose: GET operations shouldn't fail if local caching fails
    • Observation: GET operations work better, but PUT still times out
  3. PR #1985: "fix: waker registration and cross-node PUT response routing" (commit a0b067de)

    • Status: Merged (earlier)
    • Purpose: Fix cross-node PUT response routing
    • Observation: May be relevant if response routing from gateway is broken
  4. Timeout Increase: Client response timeouts increased from 30s to 60s in freenet-ping (commit 4da57bd3)

    • Status: Only affects freenet-ping app, NOT River
    • Observation: River has no client-side timeout configured

Technical Analysis

Root Cause

The node log shows the PUT request being handed to the gateway connection ("Message successfully sent to peer connection"), but no ReturnPut message carrying a PutResponse ever comes back. The underlying cause is not yet confirmed. Candidates are a gateway-side timeout, a gateway-side failure while processing PUT requests in network mode, or broken routing of the response.

Why Local Mode Works

In freenet local mode, the node IS the gateway - all operations are local. PUT requests are immediately cached and return responses. No network traversal needed.

Why Network Mode Fails

In freenet network mode:

  1. Node must forward PUT to remote gateway
  2. Gateway must cache the contract and return response
  3. Gateway fails to respond (timeout? processing error? routing issue?)
  4. Client never receives confirmation
  5. Room stuck in "Subscribing" status forever

Impact on River

This completely breaks multi-user functionality:

  • ❌ Cannot create rooms that other users can join
  • ❌ Cannot send messages that sync across users
  • ❌ Cannot accept invitations (GET requests also fail for missing contracts)
  • ✅ Single-user local testing works fine

Documented as Known Issue

From /home/ian/code/freenet/river/main/CLAUDE.md:

Known Issues

  1. Invitation Bug (2025-01-18): Room invitations hang at "Subscribing to room..."
    • Root cause: Contract PUT/GET operations timeout on live network
    • Works in integration tests but fails in production

This indicates the issue has been known since January 18, 2025.

Questions for Investigation

  1. Why is the gateway not responding to PUT requests?

    • Is there a gateway-side timeout?
    • Is there a processing error we're not seeing?
    • Is the routing of the response broken?
  2. Why are integration tests passing if this fails in production?

    • Do tests use local mode?
    • Do tests mock responses?
    • Is there a timing difference?
  3. Why is the node unable to establish more peer connections?

    • pending_adds: 24 suggests connection attempts failing
    • Is this related to the PUT timeout issue?
    • Does network isolation exacerbate the problem?
  4. Is there a client-side timeout mechanism missing?

    • freenet-ping has 60s timeout
    • River appears to have NO timeout
    • Should River implement timeout + retry logic?
  5. Do we need explicit PutResponse handling in River?

    • River has put_response.rs handler
    • Handler logs "Received PutResponse for contract ID" - never seen in logs
    • Is the handler correctly registered with WebSocket event loop?

Testing Notes

Playwright Test False Positive

Our automated Playwright test reported success despite this bug because it only checked:

  • ✅ Local UI state (room appears in room list)
  • ✅ Local message state (messages visible in same browser context)

It did NOT verify:

  • ❌ Actual Freenet network synchronization
  • ❌ Cross-browser/cross-user message visibility
  • ❌ PutResponse/GetResponse arrival

Manual testing revealed users cannot see each other's messages despite Playwright reporting success.

Proposed Solutions

Short-term Workarounds

  1. Document limitation: Update River docs to note that multi-user rooms currently fail in network mode and that only single-user local testing works
  2. Increase gateway timeouts: If gateway has configurable timeout, increase it
  3. Add retry logic: River could retry PUT after timeout (requires implementing timeout)

Long-term Fixes

  1. Debug gateway PUT handling: Instrument gateway to see why PutResponse isn't sent
  2. Implement client-side timeout: Add timeout mechanism like freenet-ping (60s)
  3. Improve network topology: Investigate why pending connections fail
  4. Add telemetry: Better visibility into PUT request lifecycle
  5. Response routing audit: Verify cross-node PUT response routing works
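For the telemetry item, one option is to record the last stage each PUT transaction reached and flag those that stall before a response. The sketch below is only an illustration: `PutTracker`, `PutStage`, and the stage names are hypothetical, loosely mirroring the "Created new operation" and "Message successfully sent to peer connection" log entries, and timestamps are passed in explicitly so the logic stays testable.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical lifecycle stages for a PUT transaction.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PutStage {
    Created,
    SentToPeer,
    ResponseReceived,
}

/// Tracks the most recent stage and timestamp per transaction ID.
pub struct PutTracker {
    entries: HashMap<String, (PutStage, Instant)>,
}

impl PutTracker {
    pub fn new() -> Self {
        PutTracker { entries: HashMap::new() }
    }

    /// Records that transaction `tx` reached `stage` at time `at`.
    pub fn record(&mut self, tx: &str, stage: PutStage, at: Instant) {
        self.entries.insert(tx.to_string(), (stage, at));
    }

    /// Returns transactions that have not received a response and whose
    /// last recorded stage is at least `limit` old, sorted by ID.
    pub fn stalled(&self, now: Instant, limit: Duration) -> Vec<(String, PutStage)> {
        let mut out: Vec<(String, PutStage)> = self
            .entries
            .iter()
            .filter(|(_, (stage, at))| {
                *stage != PutStage::ResponseReceived
                    && now.saturating_duration_since(*at) >= limit
            })
            .map(|(tx, (stage, _))| (tx.clone(), *stage))
            .collect();
        out.sort_by(|a, b| a.0.cmp(&b.0));
        out
    }
}
```

A periodic call such as `tracker.stalled(Instant::now(), Duration::from_secs(60))` would surface transactions like 01K8S2R5YEEPPDV9JCJE60ANG1 along with the last stage they reached, instead of requiring a manual grep.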

Files to Examine

River UI

  • /home/ian/code/freenet/river/main/ui/src/components/app/freenet_api/room_synchronizer.rs:225 - PutRequest send
  • /home/ian/code/freenet/river/main/ui/src/components/app/freenet_api/response_handler/put_response.rs:16 - PutResponse handler (never called)
  • /home/ian/code/freenet/river/main/ui/src/components/app/sync_info.rs:235 - Status check that blocks updates

Freenet Core

Test Reproduction

Automated test script: /tmp/test_invitation_with_logs.py

cd /tmp && source playwright_test/bin/activate
python /tmp/test_invitation_with_logs.py

This captures console logs from both users and demonstrates:

  1. User 1 creates room and sends message
  2. PUT request sent but no response
  3. User 2 accepts invitation
  4. GET request sent for same contract
  5. Neither user can see the other's messages

Environment Details

$ freenet --version
Freenet version: 0.1.36

$ ps aux | grep freenet
ian  5712  0.3  0.1 104032760 133628 ?  Sl  17:46  0:04 freenet network

$ grep "current_connections\|is_gateway" /tmp/freenet.log | tail -1
current_connections: 1, is_gateway: false, pending_adds: 24

Priority

HIGH - This completely breaks River's core functionality (multi-user chat) in production network mode. The only workaround is local mode, which does not support actual peer-to-peer synchronization.


Note: This issue was discovered while debugging an infinite loop bug in River's sync logic. The infinite loop has been fixed (via "Sync Transaction" pattern with NEEDS_SYNC signal), but that fix revealed this underlying PUT timeout issue which was previously masked.

Metadata


Assignees

No one assigned

    Labels

    A-contracts (Area: Contract runtime, SDK, and execution), A-networking (Area: Networking, ring protocol, peer discovery), P-high (High priority), T-bug (Type: Something is broken)

    Type

    No type

    Projects

    Status

    Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
