atproto-hub startup failure, subscribe never got an ndb context #1687

snarfed · 2025-01-14T04:30:51Z

Odd one just now. The atproto-hub instance died at 18:43:17 PT and restarted, but after that subscribe kept dying with an ndb context error every time it tried to load the Cursor from the datastore, so it never got as far as connecting to the relay. Ugh. This lasted until ~19:33 PT, when I noticed and restarted it.

Somewhat related to #1315.

Also our firehose processing delay metric was absent, so its alert didn't fire until after the restart. Double ugh. Need to do something about that.
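One possible direction, as a rough sketch only: treat "no data point recently" as a failure in its own right, so that a missing metric trips an alert instead of hiding one. Everything here (`Heartbeat`, `max_age_s`, the injected clock) is a hypothetical name for illustration, not anything that exists in the codebase or in our monitoring setup.

```python
import time


class Heartbeat:
    """Tracks when a metric was last reported (hypothetical helper)."""

    def __init__(self, max_age_s, clock=time.monotonic):
        self.max_age_s = max_age_s
        self.clock = clock  # injectable so staleness can be tested without sleeping
        self.last = None    # None until the first data point is reported

    def beat(self):
        """Record that a data point was just emitted."""
        self.last = self.clock()

    def is_stale(self):
        """True if no data point was ever seen, or the last one is too old."""
        return self.last is None or self.clock() - self.last > self.max_age_s


# Usage with a fake clock: never reported counts as stale, same as too old.
now = [0.0]
hb = Heartbeat(max_age_s=60, clock=lambda: now[0])
stale_before_first_beat = hb.is_stale()
hb.beat()
fresh_after_beat = not hb.is_stale()
now[0] = 120.0
stale_after_gap = hb.is_stale()
```

The point of the `last is None` case is exactly this incident: a process that never gets far enough to report anything should look unhealthy, not silent.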

Excerpted logs:

2025-01-13 18:43:17.000 [2025-01-14 02:43:17 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:49631) 
2025-01-13 18:43:17.000 Exception ignored in: <module 'threading' from '/layers/google.python.runtime/python/lib/python3.12/threading.py'>
2025-01-13 18:43:17.000 Traceback (most recent call last):
  File "/layers/google.python.runtime/python/lib/python3.12/threading.py", line 1594, in _shutdown
    atexit_call()
  File "/layers/google.python.runtime/python/lib/python3.12/concurrent/futures/thread.py", line 31, in _python_exit
    t.join()
  File "/layers/google.python.runtime/python/lib/python3.12/threading.py", line 1149, in join
    self._wait_for_tstate_lock()
  File "/layers/google.python.runtime/python/lib/python3.12/threading.py", line 1169, in _wait_for_tstate_lock
    if lock.acquire(block, timeout):
  File "/layers/google.python.pip/pip/lib/python3.12/site-packages/gunicorn/workers/base.py", line 204, in handle_abort
    sys.exit(1)
2025-01-13 18:43:17.000 SystemExit: 1
2025-01-13 18:43:18.000 [1] [ERROR] Worker (pid:49631) was sent SIGKILL! Perhaps out of memory?
...
Traceback (most recent call last):
  File "/workspace/atproto_firehose.py", line 123, in subscriber
    subscribe()
  File "/workspace/atproto_firehose.py", line 140, in subscribe
    cursor = Cursor.get_or_insert(
  File "google/cloud/ndb/_options.py", line 102, in wrapper
    return wrapped(*pass_args, **kwargs)
  File "google/cloud/ndb/utils.py", line 150, in positional_wrapper
    return wrapped(*args, **kwds)
  File "google/cloud/ndb/model.py", line 5995, in _get_or_insert
    return _cls._get_or_insert_async(_name, *args, **kwargs).result()
  File "google/cloud/ndb/tasklets.py", line 210, in result
    self.check_success()
  File "google/cloud/ndb/tasklets.py", line 157, in check_success
    raise self._exception
  File "google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "google/cloud/ndb/model.py", line 6098, in get_or_insert
    entity = yield key.get_async(_options=options)
  File "google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "google/cloud/ndb/key.py", line 943, in get
    entity_pb = yield _datastore_api.lookup(self._key, _options)
  File "google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "google/cloud/ndb/_datastore_api.py", line 165, in lookup
    entity_pb = yield batch.add(key)
  File "google/cloud/ndb/tasklets.py", line 319, in _advance_tasklet
    yielded = self.generator.throw(type(error), error, traceback)
  File "google/cloud/ndb/_retry.py", line 97, in retry_wrapper
    raise error
  File "google/cloud/ndb/_retry.py", line 82, in retry_wrapper
    result = yield result
  File "google/cloud/ndb/tasklets.py", line 323, in _advance_tasklet
    yielded = self.generator.send(send_value)
  File "google/cloud/ndb/_datastore_api.py", line 89, in rpc_call
    context = context_module.get_toplevel_context()
  File "google/cloud/ndb/context.py", line 151, in get_toplevel_context
    raise exceptions.ContextError()
google.cloud.ndb.exceptions.ContextError: No current context. NDB calls must be made in context established by google.cloud.ndb.Client.context.
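For reference, a minimal sketch of what the error message is asking for: the thread that runs subscribe has to enter a context itself before making any ndb call. I haven't confirmed that this is what went wrong here or what `subscriber` actually does. `fake_context` and `load_cursor` below are hypothetical stand-ins so the sketch runs without the datastore; in the real code the context factory would presumably be the ndb client's `context` method and the call would be `Cursor.get_or_insert(...)`.

```python
import contextlib
import threading

# Stand-in for ndb's per-thread context state (assumption for illustration).
_state = threading.local()


@contextlib.contextmanager
def fake_context():
    """Hypothetical stand-in for ndb_client.context()."""
    _state.active = True
    try:
        yield
    finally:
        _state.active = False


def load_cursor():
    """Stand-in for Cursor.get_or_insert(...); fails outside a context."""
    if not getattr(_state, "active", False):
        raise RuntimeError("No current context")
    return "cursor"


def subscriber(context_factory, subscribe, results):
    """Thread target: enter a context in this thread, then run subscribe."""
    with context_factory():
        results.append(subscribe())


results = []
thread = threading.Thread(target=subscriber, args=(fake_context, load_cursor, results))
thread.start()
thread.join()
```

Calling `load_cursor()` directly from a thread that never entered the context raises, which mirrors the `ContextError` in the traceback above.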

snarfed added the infra label Jan 14, 2025
