Collection Creator Usability: Warn on mixed datatype collections #19277

Open
jmchilton opened this issue Dec 6, 2024 · 3 comments

Comments

@jmchilton
Member

Like empty collections, there are times you might want a mixed-datatype collection, but 99+% of the time I think collections should be uniformly datatyped. It would be nice if the collection creators issued a little warning when the datatypes don't match, along the lines of the excellent work of @ahmedhamidawan in #19250.

There may be a few more use cases for mixed-dbkey collections, but it would be great if there were some sort of lighter-weight indicator for that too. Maybe the thing for both dbkey and extension is that they should be highlighted in the UI if they are mixed across the collection. Right now we show the extension for every element in a list; maybe only do that if they differ, with a little warning next to them, and then match that for dbkey?
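
To make the "only show it if it differs" idea concrete, here is a rough sketch. It is not Galaxy code: the `Element` class, the `mixed_values` helper, and the sample values are all hypothetical, and the real check would live in the client-side collection creator rather than in Python.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for one dataset about to go into a collection."""

    name: str
    extension: str
    dbkey: str


def mixed_values(elements, attribute):
    """Return the sorted distinct values of `attribute`, or [] if they all match."""
    values = sorted({getattr(element, attribute) for element in elements})
    return values if len(values) > 1 else []


# Made-up sample data: one element has a different extension, dbkeys all agree.
elements = [
    Element("sample1", "fastqsanger", "hg38"),
    Element("sample2", "fastqsanger", "hg38"),
    Element("sample3", "bam", "hg38"),
]

# Only attributes that differ across the collection would be shown and flagged.
for attribute in ("extension", "dbkey"):
    differing = mixed_values(elements, attribute)
    if differing:
        print(f"warning: mixed {attribute}: {', '.join(differing)}")
```

With this sample data only the extension would be flagged; the dbkey stays hidden because every element shares it.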

@ahmedhamidawan
Member

ahmedhamidawan commented Jan 13, 2025

Is there a particular message that would work better than the following?

[Image: firefox_jiz6oIxxFM, screenshot of the proposed warning message]

@jmchilton ?

@jmchilton
Member Author

I would use different language, yeah: the history panel and dataset info call this "format" instead of "extension" or "datatype". Something like: "Datasets have differing formats, generally {collectionType}s should contain datasets of all the same type."
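
As a small illustration of how the `{collectionType}` placeholder could be filled in, here is a sketch. The `format_warning` helper is hypothetical and written in Python only for illustration; the actual client would use its own templating and localization.

```python
# The message text is the wording proposed above; the helper name is an assumption.
WARNING_TEMPLATE = (
    "Datasets have differing formats, generally {collectionType}s "
    "should contain datasets of all the same type."
)


def format_warning(collection_type: str) -> str:
    """Substitute the collection type label into the warning template."""
    return WARNING_TEMPLATE.format(collectionType=collection_type)


print(format_warning("list"))
```

Appending a plain "s" only works for labels that pluralize that way, so a real implementation would probably want a display label per collection type.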

Maybe as a second level, two hover help texts could be added. One on "format" that describes it as having different names in Galaxy (datatype, format, extension) and explains the difference between file extensions and dataset datatypes, and then reuse this help text in the history panel format label and the dataset info format label. And a link at the end of your message, saying maybe "Why?", that on hover explains why dataset collections are not folders. I asked ChatGPT for some language and sharpened it to something like:

Dataset collections are designed to streamline the analysis of large numbers of datasets by grouping them together into a single, manageable entity. Unlike generic folders on your computer, which can hold any mix of file types, dataset collections are specifically intended to be homogeneous. This homogeneity is crucial for consistency in processing. Homogeneous datasets ensure that each dataset in the collection can be processed uniformly with the same tools and workflows. This eliminates the need for individual adjustments, which can be time-consuming and prone to error. Most tools and workflows in Galaxy are designed to operate on collections of similar data types. Homogeneous collections allow these tools to operate uniformly over the collection.

@ahmedhamidawan
Member

Cool, very helpful, thanks! I will try some of this stuff out in the linked PR.
