Simple approach for concurrent opening of Notebooks #11
Comments
@JarCz I propose we discuss this in the next meeting and maybe we could start working on some of the required components after you finish the authentication stuff and we check that the basic file access functionality is working. |
Hi, I would like to summarize the topic, add some questions and ask for further comments, so that we can decide on the implementation and divide the topic into subtasks. As I understand it, the possible flow would be as follows:
Would the users take turns locking the file, with the user holding the lock having the right to merge changes from all the copies marked as ready, or would the right to merge be limited to the owner only? As for locking and unlocking, I guess it could be implemented on opening/closing the file. To ensure uniqueness, the naming convention for copies would be something like notebook_name-username-idp-opaque_id. As for conflict resolution, nbdime was suggested as a solution for diffing and merging. I am not sure how we imagine using it on our side. If I am not mistaken, nbdime works as a command line tool, and I saw that there is an nbdime Jupyter extension as well, but my concerns are that:
@diocas please comment in a spare moment. |
Hi, let me try to clarify the misunderstandings. Let me know if I succeeded :) As soon as you share a notebook/folder in RW, the other users have as much power as you, so there is no notion of "owners" here. The notebook gets locked by whoever opens it first, and the merge can be done by anyone. The lock is automatic: as soon as you open the notebook, the system locks the file to prevent others from changing it (implemented, as you said, on opening/closing the file). So, if you share a notebook with me and I open it first, you will be the one who gets read mode on the original plus write mode on a clone. For the merge, I'm open to suggestions. I imagine the following use cases:
Ofc, in all cases, in the review mode, you should be able to merge changes (which deletes the clone), delete without changing anything, or postpone the merge (for example, you changed something and I'm the one who is asked to merge, but I don't know what should be merged, so I ignore it, which allows you to merge at a later stage). For the merge, I suggested nbdime because I saw there was a UI, which is what we want to show our users. I didn't look at the details, nor did I try the extension. But what we're looking for is a tool that checks the differences between 2 notebooks and suggests a merged version, and that also allows users to revert changes or pick the changes from one of the notebooks. Basically, like any graphical diff tool that you can use for git. AFAIU, you don't need to have git underneath. I.e,
What do you mean?
If we can do it for other files (the merge), fine. But for now I would just lock them and prevent the editing and focus the diff in the notebooks. And ofc we should do this in steps. The first thing is locking and unlocking. Then, allowing/detecting the clones. And finally, the merge. |
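The "lock on open, clone for everyone else" behaviour described in this comment can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the in-memory `LockTable`, the `OpenResult` type and the placeholder clone name are hypothetical and are not part of the extension or of the CS3 APIs.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class OpenResult:
    path: str       # file the user will actually write to
    is_clone: bool  # True when the user was redirected to a personal clone


@dataclass
class LockTable:
    # Hypothetical in-memory state: notebook path -> user holding the lock.
    holders: Dict[str, str] = field(default_factory=dict)

    def open(self, path: str, user: str) -> OpenResult:
        """Lock the notebook for the first opener; give others a clone."""
        holder = self.holders.setdefault(path, user)
        if holder == user:
            return OpenResult(path, is_clone=False)
        # Someone else holds the lock: the original stays read-only for
        # this user, who writes to a clone instead (placeholder name).
        return OpenResult(f"{path}.{user}.clone", is_clone=True)

    def close(self, path: str, user: str) -> None:
        """Release the lock, but only if this user actually holds it."""
        if self.holders.get(path) == user:
            del self.holders[path]
```

There is no owner check anywhere in the sketch: whoever opens first gets the lock, which matches the point that all RW users have equal rights.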
As a sidenote, I'm not sure how to do the unlocking. JupyterLab doesn't give the information that a file has been closed, right? It would be nice if it did. I think we can do the following:
(* we need to understand how that would work if the user has more than one window open. Does Lab allow this?) |
@diocas I analyzed the merge way and have some comments:
Before saving, JupyterLab sends a GET request to check the current timestamp of the file. In a conflict situation, when the file on disk is newer, the user sees the message:
In this situation, I think we can add a new button with a "Merge" option, and either merge manually or send an event to the backend to merge the two notebooks.
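The timestamp check described above can be sketched as follows. This is only an illustration of the idea, not JupyterLab's actual code: the function names, the tolerance value and the `"offer_merge"` action are assumptions.

```python
from datetime import datetime, timedelta, timezone

# Assumed tolerance to absorb timestamp rounding on the storage side.
TOLERANCE = timedelta(milliseconds=500)


def file_changed_on_disk(known: datetime, on_disk: datetime,
                         tolerance: timedelta = TOLERANCE) -> bool:
    """True if the file on disk is newer than the version the client loaded."""
    return on_disk - known > tolerance


def save_action(known: datetime, on_disk: datetime) -> str:
    """Decide what to do on save: plain save, or offer the proposed merge."""
    return "offer_merge" if file_changed_on_disk(known, on_disk) else "save"
```

For example, with `known = datetime(2021, 1, 1, 12, 0, tzinfo=timezone.utc)` and a disk timestamp one minute later, `save_action` returns `"offer_merge"`; with identical timestamps it returns `"save"`.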
I looked at the code of the plugin:
I propose to choose a way to write in:
In my opinion, for the first version we should choose the first way, because it is a good foundation for tackling a harder topic such as real-time editing. |
We could add it here as well. But I would expect that if the users are using our extension, this pop-up would never appear, because they are working on 2 different versions (and, as you said, in Lab you won't be able to open the same notebook twice). About your options, I'm more inclined towards the second one. The first one looks like you would have to implement your own algorithm, which I don't think we should do (fewer things to maintain and fewer things to break), especially if there's an official package that could do that for us. Please let me know if my assumptions are wrong. |
True, if the user uses the cs3api4lab plugin with the locking and copy mechanism, they won't see this message.
It's better to use the official package, but in my experience integration is not easy and sometimes causes problems. We will probably need to extend and wrap the nbdime code to make it work correctly in our code.
I agree with this approach. |
Yes, but it's better to wrap than doing everything from scratch. 🤞 |
TLDR: the nbdime code, both backend and frontend, is reusable, but the front end may not be good enough. I understand that our goal is to do as little work as possible. So far, nbdime is the best candidate. When it comes to merging, the nbdime code is reusable, although we will not avoid some adjustments. The question is how much of the code we want to incorporate. For clarity, the nbdime package consists of several sub-apps; we are especially interested in nbdiff, nbdiff-web, nbmerge and nbmerge-web. Nbmerge performs an automatic merge, nbmerge-web provides a GUI. Nbmerge is fine for the simplest cases, like merging two identical notebooks where one of them has additional data appended at the end. In other cases nbmerge results in unresolved conflicts. I assume that in many cases we would need a GUI for conflict resolution. I think the nbmerge-web front end would also be reusable to a great extent, but the question is whether nbmerge-web meets our requirements. To me, it is unintuitive and uninteractive (and unaesthetic). I suggest taking a look at nbmerge-web so we can decide whether we want to adjust it to our needs or whether it's better to build something new. |
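To make the nbmerge discussion concrete, here is a minimal sketch of how the backend could call the nbmerge command line tool for the automatic case. The wrapper functions are hypothetical. The three positional arguments (base, local, remote) and the `--out` option follow the nbdime documentation; treating a non-zero exit code as "unresolved conflicts, fall back to a GUI" is an assumption that should be verified against nbdime before relying on it.

```python
import subprocess
from typing import List


def nbmerge_command(base: str, local: str, remote: str, out: str) -> List[str]:
    """Build the argument list for a three-way nbmerge run."""
    return ["nbmerge", base, local, remote, "--out", out]


def try_auto_merge(base: str, local: str, remote: str, out: str) -> bool:
    """Run nbmerge; True means the automatic merge finished cleanly.

    Assumption: a non-zero exit code is treated as a failed or conflicted
    merge. Raises FileNotFoundError if nbdime is not installed.
    """
    result = subprocess.run(
        nbmerge_command(base, local, remote, out),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

In this workflow the base would presumably be the notebook as it was when the copy was created, local the locked original and remote the user's copy. Which file plays which role is still an open design question.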
Better implementation should use locking mechanism provided by the cs3 api: cs3org/cs3apis#6 |
Hello @diocas, I'm currently working on the backend part of this task. So far we managed to do the following:
Since we can't differentiate between autosaving and normal saving/closing, the locking mechanism is based on the autosave feature. We are working on a solution that would allow us to know when the file is being closed. Assuming that's possible, the next step would be:
Please let me know what you think about the current and future steps in development. Sidenote: in the future, thanks to the local copy, we could consider letting the user continue editing the file even if they exit Jupyter without properly closing/saving. |
Hi, terribly sorry it took so long to reply... Question: what do you mean by "ArbitraryMetadata"? Honestly, I don't know why we care whether it's an autosave or not. If the autosave is not constant (as mentioned above), then I would have a frontend extension pinging the backend to let it know which files are still open. Or we could subscribe to a notebook closed event and notify the backend (to avoid having to wait for the lock timeout, and also to avoid a recurrent ping to the backend). But maybe the first option is more secure. Of course you would have to keep state, but I think this is ok given that these servers only run for a specific user. If you want to have a meeting to brainstorm/discuss something you/I did not understand, please let me know! |
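The "frontend pings the backend" option could keep its state along the lines of the sketch below. The class and method names are hypothetical, and the timeout is left as a parameter because no value is fixed in this comment. The clock is injectable only so that the behaviour can be tested without sleeping.

```python
import time
from typing import Callable, Dict, List


class OpenFileRegistry:
    """Backend-side record of which files the frontend reports as open."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def ping(self, path: str) -> None:
        """Called for each file the frontend says is still open."""
        self._last_seen[path] = self._clock()

    def closed(self, path: str) -> None:
        """Called if a 'notebook closed' event is available after all."""
        self._last_seen.pop(path, None)

    def expire(self) -> List[str]:
        """Return (and forget) files not pinged within the timeout.

        The caller would release the locks of the returned paths.
        """
        now = self._clock()
        stale = [p for p, seen in self._last_seen.items()
                 if now - seen > self._timeout]
        for path in stale:
            del self._last_seen[path]
        return stale
```

Both options from the comment fit the same state: the recurrent ping refreshes an entry, while an explicit close event removes it immediately instead of waiting for the timeout.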
ArbitraryMetadata is a part of the file's metadata in Reva where you can store any additional information about the file. You can get it back by stat-ing the file.
Yes, I think this is how it works, but in my opinion it fits our intentions, e.g. when someone opens a file, makes some changes and leaves it open, the lock should be discarded after some time (changes would be saved by an autosave). The default autosave interval is 120s, and the time after which locks are ignored is set to 150s. Both are configurable.
I think .conflict files are a good idea, so I've added this functionality and removed the error message. The only problem now is that the user doesn't know that the file has been saved in another location. On the other hand, if we keep the message, it would be shown on every autosave. Also, I think creating only 1 file with the .conflict extension should suffice, otherwise we would keep creating a new file with every autosave. Let me know what you think. Also, should we try to open the .conflict file instead of the original when we open any shared file?
About the rest: we've found it hard to know reliably when the notebooks are closed, and I think what we have right now is good enough, so I suggest we test and release the locking functionality in its current state, skip the step with local files and move on to working on merges. We have come up with a proposal for how merges should work, which we would like to present at a meeting. I will post a short summary in a separate comment. In my opinion we shouldn't spend too much time on locks, since they are going to work quite differently once we implement the merging feature. |
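A minimal sketch of the timing logic described in this comment, assuming the lock is stored as string values in the file's arbitrary metadata. The key names `lock_user` and `lock_updated` and the exact placement of the `.conflict` extension are hypothetical; only the 120s and 150s values come from the comment.

```python
AUTOSAVE_INTERVAL = 120  # seconds, default autosave interval (configurable)
LOCK_EXPIRY = 150        # seconds after which a lock is ignored (configurable)


def lock_is_active(metadata: dict, now: float) -> bool:
    """A lock counts only if it was refreshed less than LOCK_EXPIRY ago."""
    updated = metadata.get("lock_updated")  # hypothetical key, string value
    return updated is not None and now - float(updated) < LOCK_EXPIRY


def save_target(path: str, metadata: dict, user: str, now: float) -> str:
    """Return where a save by `user` should be written.

    If another user holds an active lock, the save goes to a single
    '.conflict' file (overwritten on every autosave), not to the original.
    """
    held_by_other = metadata.get("lock_user") != user
    if lock_is_active(metadata, now) and held_by_other:
        return path + ".conflict"  # hypothetical naming
    return path
```

Because the expiry is longer than the autosave interval, an open notebook refreshes its lock before it lapses, while a notebook that was closed or abandoned loses the lock at most 150s after its last save.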
Our proposal for the merge functionality:
|
Additionally:
There might be a problem when 2 or more people are merging the file at the same time. We need to test the plugin to see what the best solution would be in that situation. |
This has been discussed in a meeting and we will proceed with a discussion to evaluate some possible options. @dagl @piotrWichlinskiSoftwaremind if you can circulate the slides beforehand, it would be helpful. As I said, I'm on holidays next week but will try to be present in the meeting. |
Discussing with @diocas, please have a look at cs3org/cs3apis#160 as that would hopefully provide native methods for this issue. |
I decided to move this issue from swan-cern/jupyter-extensions/issues/26 to this repo, since I believe it should be added to this extension. Some of the initial discussion can be checked in that issue.
While real-time collaborative editing of notebooks is a feature we would like to have, it would be useful to have an intermediate step, easier to accomplish, to allow users to open notebooks concurrently without causing conflicts.
This feature is of interest to JRC and CERN, and it would be useful for the project CS3MESH4EOSC. The current status of collaborative editing of notebooks is still uncertain, with considerable effort still needed.
This ticket describes a possible approach, combining some of my ideas with the ones presented by JRC. Please let me know if you disagree with any of this.
Approach
I suggest taking a similar approach to the editing of office files. This means we would lock a notebook while someone is editing it, which would only allow another person to open it in read-only mode. This would prevent the current Jupyter behavior of complaining that a file has changed on disk, and the possible loss of data.
But we can still allow others to contribute. We can ask the user whether they just want to open the notebook (ro) or create a copy and merge it afterwards. This copy should have a specific name, following some convention, that would allow the system to detect it and suggest a merge at a later stage.
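As an illustration of such a convention, the sketch below uses the notebook_name-username-idp-opaque_id pattern proposed in the comments. The helper names, the hyphen separator and keeping the original file extension are assumptions.

```python
from typing import Tuple


def _split(notebook: str) -> Tuple[str, str]:
    """Split 'name.ipynb' into ('name', '.ipynb'); tolerate no extension."""
    stem, dot, ext = notebook.rpartition(".")
    if not dot:
        return notebook, ""
    return stem, dot + ext


def copy_name(notebook: str, username: str, idp: str, opaque_id: str) -> str:
    """Build the file name of a user's working copy of a shared notebook."""
    stem, ext = _split(notebook)
    return f"{stem}-{username}-{idp}-{opaque_id}{ext}"


def is_copy_of(candidate: str, notebook: str) -> bool:
    """Detect whether `candidate` looks like a working copy of `notebook`."""
    stem, ext = _split(notebook)
    prefix = f"{stem}-"
    return (candidate.startswith(prefix)
            and candidate.endswith(ext)
            and len(candidate) > len(prefix) + len(ext))
```

Detection here only matches the prefix and extension. Recovering the individual fields from the name would be ambiguous if usernames or IdP names can contain hyphens, so a real implementation might prefer to keep that information in metadata.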
Implementation
To make this approach possible, we need to modify the upstream Contents interface. Opening a file should provide the following information:
We also need API endpoints to:
This means that we need to create abstract methods for locking, unlocking, getLockStatus, getConcurrentCopies, setReadyToMerge, etc.
We also need to change the UI to understand this new concept. When a notebook is locked, the UI needs to ask the user whether to open it read-only or create a copy. It also needs to suggest merging the changes. This can be offered to the user currently holding the lock when another copy has been marked as "ready to merge", or to anyone else opening a file for which unmerged copies still exist.
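The abstract methods listed above could take roughly the following shape. This is a sketch under assumptions: the class names, the snake_case spelling, the signatures and the in-memory implementation are all hypothetical and only mirror the method names from the text.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Set


class ConcurrentContents(ABC):
    """Hypothetical extension of a Contents-like interface."""

    @abstractmethod
    def lock(self, path: str, user: str) -> bool: ...

    @abstractmethod
    def unlock(self, path: str, user: str) -> None: ...

    @abstractmethod
    def get_lock_status(self, path: str) -> Optional[str]: ...

    @abstractmethod
    def get_concurrent_copies(self, path: str) -> List[str]: ...

    @abstractmethod
    def set_ready_to_merge(self, copy_path: str) -> None: ...


class InMemoryContents(ConcurrentContents):
    """Toy implementation, useful only to exercise the interface."""

    def __init__(self) -> None:
        self._locks: Dict[str, str] = {}
        self._copies: Dict[str, List[str]] = {}
        self.ready: Set[str] = set()

    def lock(self, path: str, user: str) -> bool:
        # True if `user` holds the lock after the call.
        return self._locks.setdefault(path, user) == user

    def unlock(self, path: str, user: str) -> None:
        if self._locks.get(path) == user:
            del self._locks[path]

    def get_lock_status(self, path: str) -> Optional[str]:
        return self._locks.get(path)  # holder's name, or None if unlocked

    def add_copy(self, path: str, copy_path: str) -> None:
        self._copies.setdefault(path, []).append(copy_path)

    def get_concurrent_copies(self, path: str) -> List[str]:
        return list(self._copies.get(path, []))

    def set_ready_to_merge(self, copy_path: str) -> None:
        self.ready.add(copy_path)
```

With this shape, the UI could suggest a merge when `get_concurrent_copies(path)` returns a copy that is also in the ready set.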
For the task of comparing notebooks, there's already an official tool called nbdime: https://nbdime.readthedocs.io/en/latest/