-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update Record print method #3
Conversation
Thanks for solving issues with NULL values! Regarding making the output prettier: I'm all for this. However, the current print message I implemented to align the way that the Python lamin package currently prints information: import lamindb as ln
artifact = ln.Artifact.get("KBW89Mf7")
artifact
# Artifact(uid='KBW89Mf7IGcekja2hADu', version='2024-07-01', is_latest=True, description='Myeloid compartment', key='cell-census/2024-07-01/h5ads/fe52003e-1460-4a65-a213-2bb1a508332f.h5ad', suffix='.h5ad', type='dataset', size=691757462, hash='SZ5tB0T4YKfiUuUkAL09ZA', n_observations=51552, _hash_type='md5-n', _accessor='AnnData', visibility=1, _key_is_virtual=False, created_by_id=1, storage_id=2, transform_id=22, run_id=27, updated_at='2024-07-12 12:40:48 UTC') Imo this print statement could be improved upon a lot, and what you implemented is probably a lot better. Could you show me an example of what it looks like? Should we propose these changes for the upstream lamin python package as well? |
I agree that the lamindb print output for any given record is quite verbose and evidently the The problem is if one omits certain fields, the user might not know how to get them. In some cases, we're now using reduced references that look similar
instead of
Can you show an example for what you did and paste it here? I think it's fair to assume that the typical R user will be a bit less technical but purely wants simple and intuitive UX without needing to know all private fields etc. Hence, I think simplifying what Python does might be a good idea. Or we simplify in Python, too, and only print all info upon WDYT @Zethson @sunnyosun? |
Sorry, I should have posted what it looks like. Here is a screenshot. The I agree it makes sense to have the Python/R output roughly aligned, I should have checked that first. It's also easy enough to add a |
Thanks for sharing! 😄 And yes, uid and hash always exist. I like your suggestion of having a
Can there be a deterministic order of these fields similar to what we have in Python? It's defined here: https://docs.lamin.ai/lamindb.artifact Here is how the implicitly called >>> artifact
Artifact(uid='nL3Bslwl4kIZittG0001', version='1.0', is_latest=True, description='my RNA-seq', suffix='.parquet', type='dataset', size=4091, hash='GDTjY7XSC6E2k9d_ZpLCnw', _hash_type='md5', _accessor='DataFrame', visibility=1, _key_is_virtual=True, created_at=2024-10-01 18:24:56 UTC) And this is how it looks when you call >>> artifact.describe(print_types=True)
Artifact(uid='nL3Bslwl4kIZittG0001', version='1.0', is_latest=True, description='my RNA-seq', suffix='.parquet', type='dataset', size=4091, hash='GDTjY7XSC6E2k9d_ZpLCnw', _hash_type='md5', _accessor='DataFrame', visibility=1, _key_is_virtual=True, created_at=2024-10-01 18:24:56 UTC)
Provenance
.storage: Storage = '/home/runner/work/lamin-docs/lamin-docs/docs/lamin-intro'
.transform: Transform = 'Introduction'
.run: Run = 2024-10-01 18:24:55 UTC
.created_by: User = 'anonymous'
Labels
.cell_lines: bionty.CellLine = 'HEK293'
.ulabels: ULabel = 'Candidate marker study'
Features
'study': cat[ULabel] = 'Candidate marker study'
'temperature': float = 21.6 What I don't like about it is that within In R a good solution seems possible with the What I'd like to draw your attention to is the wealth of information you'd get when you hit But all of this can be iterated on. |
I had a go at updating things with @falexwolf's suggestions. The short output now looks like this: I tried to keep only the things I thought would be useful for users but maybe I missed something important. This is what the detailed output looks like: The provenance information is stored in the object so we could show that for artifacts with a bit more work. The labels and features etc. would require more querying/linking of things which I don't think we have set up yet. Let me know if you have any more comments. |
From my side, this is great for now, @lazappi! Thank you! |
@Zethson @sunnyosun - guys, do you think we should have the simple fields vertically displayed or all in one line as in Python? |
If we can reduce the clutter to ensure that the simple fields in a single line are not that wide, we can go for the Python version. Your recent suggestion of using shorter UIDs helps. Else, I think vertical is better because everyone hates horizontal scrolling. But as usual, consistency is key. Let's keep Python and R a consistent UX (at least as close as possible) and then go from there? |
I really like this PR, but I think it would be best to create a github issue in the Python repository and come back to this when the print function for Python has been improved. Would that make sense? Shall we close this PR when the separate github issue has been created? |
For now we have implemented something more similar to Python in #22. |
Alright, let's close this for now. @lazappi Could you create issue at laminlabs/lamindb with your proposal for what the print would look like? |
There was a big in the print method for the
Record
class where things were mixed up if there was aNULL
value. This fixes that and uses {cli} to make the print output prettier.