Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure that auto-knitting works with db$track() in RStudio #110

Open
falexwolf opened this issue Nov 25, 2024 · 19 comments
Open

Ensure that auto-knitting works with db$track() in RStudio #110

falexwolf opened this issue Nov 25, 2024 · 19 comments
Assignees
Labels
enhancement New feature or request
Milestone

Comments

@falexwolf
Copy link
Member

falexwolf commented Nov 25, 2024

If auto-knitting produces an .html "on-the-fly" and at the point at which db$finish() is called there is a file in the working directory called whatever.html, then we can bypass lamin save and simply push that file during db$finish().

@rcannood
Copy link
Collaborator

rcannood commented Nov 25, 2024

FYI: When using an R notebook called foo.Rmd, a file called foo.nb.html will be created while you're working on the notebook.

Screenshot from 2024-11-25 10-50-58

If we want to support this, I think it's something that needs to be handled at the python client side.

@falexwolf
Copy link
Member Author

This should be fixed on this branch of lamindb.

@lazappi, can you please try it out by checking out that branch for a trivial example notebook on laminlabs/lamindata?

@falexwolf
Copy link
Member Author

A friendly reminder, here. I'd love to let them know that this is supported now.

@falexwolf
Copy link
Member Author

Apologies, @lazappi & @rcannood, for being so insistent on asking to try this out.

I hope it'd just take 10 min and then it could go in a release and we could let them know that auto-knitting now works.

If there is no time today, maybe tomorrow morning?

@lazappi
Copy link
Collaborator

lazappi commented Nov 26, 2024

I'll try to have a look in the morning. I'm not really familiar with the auto notebook stuff so it might take a bit of messing around.

@falexwolf
Copy link
Member Author

Thanks, Luke! 🙏

@rcannood
Copy link
Collaborator

I'm also happy to take a look during lunch -- my morning is booked unfortunately 😊

@lazappi
Copy link
Collaborator

lazappi commented Nov 27, 2024

Ok, I have tested and this works if you use the autoknit branch https://lamin.ai/laminlabs/lamindata/transform/CaRzLv8UFvFX0001.

@falexwolf
Copy link
Member Author

Great!

And just to confirm: you did not call lamin save on the command line?

@lazappi
Copy link
Collaborator

lazappi commented Nov 27, 2024

Nope. Just db$finish().

@falexwolf
Copy link
Member Author

Great!

This is the easier workflow then and we should highlight it on get-started. It saves the user going back to the CLI and there is then also no danger with the hash changing during "knitting" and "calling db$finish()".

But there seems to be a hiccup if I look closely. Do you understand why this shows the error message and not a clean db$track()?

image

And why do you have two versions of this notebook? It looks like the first one was created accidentally?

image

Do you have an idea for what in the user flow causes these hiccups?

Note: A minor thing is this logging message which says "scripty.py", which I'll fix right away on the branch:
image

@lazappi
Copy link
Collaborator

lazappi commented Nov 27, 2024

But there seems to be a hiccup if I look closely. Do you understand why this shows the error message and not a clean db$track()?

Not sure exactly but I think there might be some tricky things about when the source .Rmd file is saved compared to when the .html is updated. I'm not sure exactly how that all works though.

And why do you have two versions of this notebook? It looks like the first one was created accidentally?

In the first one the .html wasn't being updated because I was doing something wrong that caused a error and I didn't notice. That's my fault and (probably) shouldn't happen to users.

@falexwolf
Copy link
Member Author

falexwolf commented Nov 27, 2024

Ok, I see.

The source code that got saved is correct:

% python
Python 3.10.15 | packaged by conda-forge | (main, Oct 16 2024, 01:24:20) [Clang 17.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import lamindb as ln
→ connected lamindb: laminlabs/lamindata
>>> transform = ln.Transform.get("CaRzLv8UFvFX0001")
>>> print(transform.source_code)
---
title: "Test notebook"
output: html_notebook
editor_options: 
  chunk_output_type: inline
---

A test notebook used to check if tracking an interactive notebook works.

```{r}
reticulate::use_virtualenv("ret-lamin-dev")
library(laminr)
db <- connect()
db$track("CaRzLv8UFvFX0001", path = "test-notebook.Rmd")
db$Artifact$get("mePviem4DGM4SFzvLXf3")
db$finish()

But for the report it logged "report is already saved" and it seems it didn't update the report 🤔 -- hence the report that's uploaded is outdated and has the old uid in it.
image

@falexwolf
Copy link
Member Author

Is it possible that the autoknitted html is a bit behind the .Rmd?

Because lamindb makes sure that the hash is up-to-date. Meaning, the logging message "report is already saved" only gets printed if the .nb.html file and the source code in lamindb have the same hash (are the same):

https://github.com/laminlabs/lamindb/blob/0732cecf8ad04460b13b0035684cefdc79baee09/lamindb/_finish.py#L215-L226

I fear these nuances are tricky for you to comment on and likely sound like gobbledegook.

Can we have brief Zoom call to go through the UX and make sure that report and source code are up to date? One thing we could do is check for the timestamp of the .nb.html file and make sure it's just been saved by RStudio. But that would throw an ugly error and I wouldn't know what to put in the error message.

@falexwolf
Copy link
Member Author

Made both fixes:

But we should try this out and ideally also try to get rid of the manually passed path argument so that things become really smooth and enjoyable.

@falexwolf
Copy link
Member Author

Let me know if one of you has time today or tomorrow to polish these things!

Given it's Thanksgiving tomorrow and we'll only release tonight at the earliest there is no longer so much urgency. Let's focus on getting the UX right and also account for all edge cases.

https://calendly.com/falexwolf/45flex

@lazappi
Copy link
Collaborator

lazappi commented Nov 27, 2024

I booked a time for tomorrow. Trying to make sure the notebook is saved before running db$finish() will probably help but there might be other things going on, I'm not sure.

@falexwolf
Copy link
Member Author

Thanks! Let's test it out.

Trying to make sure the notebook is saved before running db$finish()

That's already done on the branch. I added a check for the time stamp.

@lazappi
Copy link
Collaborator

lazappi commented Dec 4, 2024

Should be fixed by #120

@lazappi lazappi added the enhancement New feature or request label Jan 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants