fix ystore start flow #40

Merged May 2, 2024 (1 commit)

jzhang20133
Collaborator

@jzhang20133 jzhang20133 commented May 1, 2024

The line self.db_initialized = Event() in the start() method replaces self.db_initialized with a new Event(). If another call to ystore.read() or ystore.write() arrives before the db is initialized, that call may be waiting on the old self.db_initialized Event (created in the initialize method), which is never set. In the UI, file access is blocked and the loading spinner spins forever.

Addresses this open issue:
#41
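
The hang described above can be reproduced in miniature. This is only a sketch under stated assumptions: ToyStore and demo are hypothetical names, and asyncio.Event stands in for the Event the real store uses so that the example needs nothing beyond the standard library. It shows that a reader which started waiting on the first Event is not woken when start() rebinds the attribute and sets the new one.

```python
import asyncio


class ToyStore:
    """Hypothetical stand-in for the store; not the real pycrdt-websocket class."""

    def __init__(self) -> None:
        # First Event, created at construction time.
        self.db_initialized = asyncio.Event()

    async def start(self) -> None:
        # Rebinding the attribute: earlier waiters still hold the old Event.
        self.db_initialized = asyncio.Event()
        self.db_initialized.set()

    async def read(self) -> str:
        # The attribute is looked up once, here, before start() rebinds it.
        await self.db_initialized.wait()
        return "data"


async def demo() -> str:
    store = ToyStore()
    reader = asyncio.create_task(store.read())
    await asyncio.sleep(0)  # let the reader begin waiting on the old Event
    await store.start()     # sets only the new Event
    try:
        return await asyncio.wait_for(reader, timeout=0.1)
    except asyncio.TimeoutError:
        return "reader never woke up"


result = asyncio.run(demo())
print(result)
```

Without the timeout the reader would wait indefinitely, which matches the spinner that never goes away in the UI.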

@jzhang20133 jzhang20133 requested a review from Zsailer May 1, 2024 22:36
@Zsailer
Member

Zsailer commented May 1, 2024

Nice find @jzhang20133!

I think this removes the wrong line though. I think we should remove the initialization that happens in the __init__ method instead.

@jzhang20133 jzhang20133 added the bug Something isn't working label May 2, 2024
@jzhang20133 jzhang20133 requested a review from davidbrochart May 2, 2024 00:38
@jzhang20133
Collaborator Author

In the test_version method, we want to be able to restart the ystore after its db is closed. Hence we create a new Event() at the bottom of the stop method, so that when the ystore is restarted the db can be initialized again.

Collaborator

@davidbrochart davidbrochart left a comment

Since the YStore should be started before any read/write operation is done, this is not an issue.
I would do the opposite though: remove the event creation in __init__ and keep it in start. This way, any attempt at reading/writing before starting the YStore will fail.

pycrdt_websocket/ystore.py (outdated, resolved)
@Zsailer
Member

Zsailer commented May 2, 2024

@davidbrochart This PR is in response to another issue we are seeing where documents never open (i.e. spinning wheel forever).

We haven't been able to reproduce this with vanilla JupyterLab + jupyter-collaboration + pycrdt-websocket yet, though admittedly we haven't had much time to build an example for the vanilla case.

The only thing that is different on our end is that we do some authentication that could take 1-2 seconds before connecting to a room. I think we're seeing this latency cause some issues with the db initialization.

@davidbrochart
Collaborator

Would you mind opening an issue that this PR references? Otherwise it is difficult to understand what this PR is supposed to fix.

@jzhang20133
Collaborator Author

jzhang20133 commented May 2, 2024

@davidbrochart I have updated the PR to follow your suggestion and remove self.db_initialized from the __init__ method. Would you like to review it one more time? cc @Zsailer

@jzhang20133
Collaborator Author

jzhang20133 commented May 2, 2024

With the current implementation, the room initialization method in YDocWebSocketHandler, which loads contents from the ystore, fails immediately instead of waiting for the ystore to start. Because of that failure, the websocket is torn down and the ystore is stopped, so there is no time to initialize the db and the file can't be opened.

[I 2024-05-02 10:31:55.199 ServerApp] Request for Y document 'TestRTC2/Untitled59.ipynb' with room ID: f851cc33-069d-4d70-9b56-6e369d48d066
[I 2024-05-02 10:31:55.581 YDocExtension] Creating FileLoader for: TestRTC2/Untitled59.ipynb
[I 2024-05-02 10:31:55.585 YDocExtension] Watching file: TestRTC2/Untitled59.ipynb
[I 2024-05-02 10:31:55.588 ServerApp] Initializing room json:notebook:f851cc33-069d-4d70-9b56-6e369d48d066
[E 2024-05-02 10:31:55.594 ServerApp] Error initializing: TestRTC2/Untitled59.ipynb
    RuntimeError('ystore is not started')
    Traceback (most recent call last):
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/jupyter_collaboration/handlers.py", line 233, in open
        await self.room.initialize()
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/jupyter_collaboration/rooms.py", line 104, in initialize
        await self.ystore.apply_updates(self.ydoc)
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/pycrdt_websocket/ystore.py", line 137, in apply_updates
        async for update, *rest in self.read():
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/pycrdt_websocket/ystore.py", line 417, in read
        raise RuntimeError("ystore is not started")
    RuntimeError: ystore is not started
[E 2024-05-02 10:31:55.599 ServerApp] Failed to write message
    Traceback (most recent call last):
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/jupyter_collaboration/handlers.py", line 266, in send
        self.write_message(message, binary=True)
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/tornado/websocket.py", line 332, in write_message
        raise WebSocketClosedError()
    tornado.websocket.WebSocketClosedError
[I 2024-05-02 10:31:55.600 ServerApp] Deleting Y document from memory: json:notebook:f851cc33-069d-4d70-9b56-6e369d48d066
[I 2024-05-02 10:31:55.600 ServerApp] Room json:notebook:f851cc33-069d-4d70-9b56-6e369d48d066 deleted
[I 2024-05-02 10:31:55.601 ServerApp] Deleting file TestRTC2/Untitled59.ipynb
[E 2024-05-02 10:31:55.674 ServerApp] Uncaught exception GET /api/collaboration/room/json:notebook:f851cc33-069d-4d70-9b56-6e369d48d066?sessionId=007deb49-2495-4202-9360-6eea43b39253 (127.0.0.1)
    HTTPServerRequest(protocol='http', host='localhost:8888', method='GET', uri='/api/collaboration/room/json:notebook:f851cc33-069d-4d70-9b56-6e369d48d066?sessionId=007deb49-2495-4202-9360-6eea43b39253', version='HTTP/1.1', remote_ip='127.0.0.1')
    Traceback (most recent call last):
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/tornado/web.py", line 1790, in _execute
        result = await result
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/jupyter_collaboration/handlers.py", line 209, in get
        return await super().get(*args, **kwargs)
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/tornado/websocket.py", line 273, in get
        await self.ws_connection.accept_connection(self)
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/tornado/websocket.py", line 863, in accept_connection
        await self._accept_connection(handler)
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/tornado/websocket.py", line 946, in _accept_connection
        await self._receive_frame_loop()
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/tornado/websocket.py", line 1105, in _receive_frame_loop
        self.handler.on_ws_connection_close(self.close_code, self.close_reason)
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/tornado/websocket.py", line 571, in on_ws_connection_close
        self.on_connection_close()
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/tornado/websocket.py", line 563, in on_connection_close
        self.on_close()
      File "/Users/jialinzhang/miniconda3/envs/jlab4-fresh/lib/python3.10/site-packages/jupyter_collaboration/handlers.py", line 333, in on_close
        if isinstance(self.room, DocumentRoom) and self.room.clients == [self]:
    AttributeError: 'YDocWebSocketHandler' object has no attribute 'room'

We would need to start the ystore and then await db_initialized in the prepare method to get out of this race condition.

@Zsailer
Member

Zsailer commented May 2, 2024

> We would need to start the ystore and then await db_initialized in the prepare method to get out of this race condition.

Have you verified that this works?

@jzhang20133
Collaborator Author

jzhang20133 commented May 2, 2024

After adding the line await ystore.start() to the YDocWebSocketHandler prepare method, file access works. @Zsailer The PR is raised here: jupyterlab/jupyter-collaboration#299

@jzhang20133
Collaborator Author

I also had to change ystore to await the _init_db method within the start method to get it working. cc @Zsailer @davidbrochart

         self.db_initialized = Event()
         if from_context_manager:
             assert self._task_group is not None
-            self._task_group.start_soon(self._init_db)
+            await self._init_db()
Member

@jzhang20133 is there a specific reason you removed the start_soon here? Was this causing a race condition?

I ask because I think keeping this as part of the top level task_group is the "proper" way to structure concurrency here in an AnyIO world. Scheduling one-off tasks like this outside of the task_group might cause some issues, though I'm not experienced enough in AnyIO to know for sure.

Collaborator Author

@jzhang20133 jzhang20133 May 2, 2024

self._task_group.start_soon(self._init_db) only schedules the self._init_db task to run later in the task group. It does not wait for _init_db to run: once the task is scheduled, execution moves to the next line and the method finishes. So when a caller calls ystore.start(), all that is guaranteed once start() finishes is that self._init_db has been scheduled, not that it has run. Hence I changed it to await here, so that if the caller also awaits ystore.start(), self._init_db is guaranteed to have run and finished. After that, other ystore methods no longer throw a RuntimeError because the ystore is not running. In DocumentRoom, when loading contents from the file or the ystore, it calls ystore.apply_updates without knowing whether the ystore's db is initialized or not.
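
The difference between scheduling and awaiting can be sketched as follows. This is an illustration under assumptions, not the store's code: asyncio.create_task is used as a rough standard-library analogue of the task group's start_soon, and init_db, start_scheduled and start_awaited are hypothetical names.

```python
import asyncio


async def init_db(log: list[str]) -> None:
    await asyncio.sleep(0.01)  # pretend to open the database
    log.append("db initialized")


async def start_scheduled(log: list[str]) -> asyncio.Task:
    # Schedules init_db and returns immediately (roughly what start_soon does).
    task = asyncio.create_task(init_db(log))
    log.append("start returned")
    return task


async def start_awaited(log: list[str]) -> None:
    # Waits for init_db to finish before returning.
    await init_db(log)
    log.append("start returned")


async def demo() -> tuple[list[str], list[str]]:
    scheduled: list[str] = []
    task = await start_scheduled(scheduled)
    await task  # only now is the init guaranteed to be done
    awaited: list[str] = []
    await start_awaited(awaited)
    return scheduled, awaited


scheduled_log, awaited_log = asyncio.run(demo())
print(scheduled_log)
print(awaited_log)
```

In the scheduled variant the caller regains control before the database is ready, so anything that needs the database must wait on some signal of its own.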

Member

But then wouldn't self.db_initialized not be set, so any read/write calls would wait?

Collaborator Author

I just did a test and it looks like there is no need to await self._init_db here. As long as we await the ystore.start call right after the ystore initialization in jupyter-collaboration, and make sure self.db_initialized = Event() is called before any other ystore methods, we are good.
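
A minimal sketch of why awaiting start() is enough, assuming read() waits on the Event as the earlier comments describe. ToyStore is hypothetical and asyncio replaces the task group purely to keep the example standard-library only: start() creates the Event and schedules the init in the background, and read() blocks on that same Event until the init has finished.

```python
import asyncio


class ToyStore:
    """Hypothetical store; start() must be awaited before read()."""

    async def start(self) -> None:
        # Create the Event first, then schedule the (slow) init in the background.
        self.db_initialized = asyncio.Event()
        self._init_task = asyncio.create_task(self._init_db())

    async def _init_db(self) -> None:
        await asyncio.sleep(0.01)  # pretend to open the database
        self.db_initialized.set()

    async def read(self) -> str:
        await self.db_initialized.wait()  # blocks until _init_db has finished
        return "updates"


async def demo() -> str:
    store = ToyStore()
    await store.start()        # returns before _init_db has run
    return await store.read()  # still safe: read() waits on the same Event


result = asyncio.run(demo())
print(result)
```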

Collaborator Author

For the ystore class, after initialization we also need to call start before calling any other methods; otherwise those methods will fail.

Collaborator Author

I have updated the PR accordingly.

Member

Thanks, @jzhang20133

@jzhang20133
Collaborator Author

Added an open issue here: #41. cc @davidbrochart @Zsailer

@Zsailer
Member

Zsailer commented May 2, 2024

Looks good to me! Thanks @jzhang20133 for the great work!

@Zsailer Zsailer merged commit 8e85b26 into jupyter-server:main May 2, 2024
7 checks passed
Collaborator

@davidbrochart davidbrochart left a comment

@jzhang20133 Could you send a follow-up PR with my suggested changes?

@@ -361,7 +359,7 @@ async def start(

     async def stop(self) -> None:
         """Stop the store."""
-        if self.db_initialized.is_set():
+        if hasattr(self, "db_initialized") and self.db_initialized.is_set():
Collaborator

I don't like that. db_initialized should be set to None in __init__, and you should check for None here instead of looking up an attribute on the class instance:

if self.db_initialized is not None and self.db_initialized.is_set():

@@ -415,6 +413,8 @@ async def read(self) -> AsyncIterator[tuple[bytes, bytes, float]]:
         Returns:
             A tuple of (update, metadata, timestamp) for each update.
         """
+        if not hasattr(self, "db_initialized"):
Collaborator

if self.db_initialized is None:

@@ -438,6 +438,8 @@ async def write(self, data: bytes) -> None:
         Arguments:
             data: The update to store.
         """
+        if not hasattr(self, "db_initialized"):
Collaborator

if self.db_initialized is None:
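
Taken together, the three suggestions amount to a None-sentinel pattern, which could look roughly like this. It is a sketch only: ToyStore, its closed flag and the asyncio.Event are assumptions made for illustration, and the real follow-up change may differ.

```python
import asyncio
from typing import Optional


class ToyStore:
    """Hypothetical sketch of the None-sentinel pattern; not the real YStore."""

    def __init__(self) -> None:
        # No Event yet: None means "start() has not been called".
        self.db_initialized: Optional[asyncio.Event] = None
        self.closed = False  # placeholder for real database state

    async def start(self) -> None:
        self.db_initialized = asyncio.Event()
        self.db_initialized.set()  # a real store would set this after opening the db

    async def stop(self) -> None:
        # Safe before start(): the None check short-circuits.
        if self.db_initialized is not None and self.db_initialized.is_set():
            self.closed = True  # placeholder for closing the database

    async def read(self) -> str:
        if self.db_initialized is None:
            raise RuntimeError("ystore is not started")
        await self.db_initialized.wait()
        return "updates"


async def demo() -> list[str]:
    store = ToyStore()
    outcomes = []
    await store.stop()  # no AttributeError even though start() never ran
    try:
        await store.read()
    except RuntimeError as exc:
        outcomes.append(str(exc))
    await store.start()
    outcomes.append(await store.read())
    return outcomes


outcomes = asyncio.run(demo())
print(outcomes)
```

Declaring the attribute in __init__ keeps it visible to readers and type checkers, whereas the hasattr check hides whether the attribute is expected to exist at all.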

@davidbrochart
Collaborator

> @jzhang20133 Could you send a follow-up PR with my suggested changes?

I opened #42.
