Collection Creator Usability: Warn on mixed datatype collections #19277

jmchilton · 2024-12-06T15:21:45Z

Like empty collections, there are times you might want to have a mixed datatype collection, but 99+% of the time I think collections should be uniformly datatyped. It would be nice if the collection creators issued a little warning if the datatypes don't match, along the lines of the excellent work of @ahmedhamidawan in #19250.

There may be a few more use cases for mixed dbkey collections, but it would be great if there was some sort of lighter-weight indicator about that too. Maybe the thing for both dbkey and extension is that they should be highlighted in the UI if they are mixed across the collection. Right now we show the extension for every element in a list; maybe only do that if they differ, with a little warning next to them, and then match that for dbkey?
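As a rough illustration of the check described above, here is a minimal TypeScript sketch. The `CollectionElement` shape, the `checkMixed` function, and the field names are assumptions made for this example, not Galaxy's actual client model or API.

```typescript
// Hypothetical element shape; the field names are assumptions for illustration only.
interface CollectionElement {
    name: string;
    extension: string;
    dbkey: string;
}

interface MixedReport {
    extensions: string[]; // distinct extensions found, sorted
    dbkeys: string[]; // distinct dbkeys found, sorted
    mixedExtensions: boolean; // true when more than one extension is present
    mixedDbkeys: boolean; // true when more than one dbkey is present
}

// Return the distinct values of a list in sorted order.
function distinctValues(values: string[]): string[] {
    return Array.from(new Set(values)).sort();
}

// Summarize whether the elements disagree on extension and/or dbkey.
function checkMixed(elements: CollectionElement[]): MixedReport {
    const extensions = distinctValues(elements.map((e) => e.extension));
    const dbkeys = distinctValues(elements.map((e) => e.dbkey));
    return {
        extensions,
        dbkeys,
        mixedExtensions: extensions.length > 1,
        mixedDbkeys: dbkeys.length > 1,
    };
}

const exampleReport = checkMixed([
    { name: "sample1", extension: "fastqsanger", dbkey: "hg38" },
    { name: "sample2", extension: "bam", dbkey: "hg38" },
]);
console.log(exampleReport.mixedExtensions, exampleReport.mixedDbkeys);
```

With a report like this, the UI could show per-element extension or dbkey labels (and a warning) only when the corresponding `mixed...` flag is true.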

ahmedhamidawan · 2025-01-13T18:59:32Z

Is there a particular message that would work better than the following?:

@jmchilton ?

jmchilton · 2025-01-16T17:48:02Z

I would use different language, yeah. The history panel and dataset info call these "format" instead of extension or datatype. "Datasets have differing formats, generally {collectionType}s should contain datasets of all the same type."

Maybe as a second level, two help-text hovers could be added. One on "format" that describes it as having different names in Galaxy (datatype, format, extension) and explains the difference between file extensions and dataset datatypes; this help text could then be reused in the history panel format label and the dataset info format label. And a link at the end of your message, saying maybe "Why?", that on hover explains why dataset collections are not folders. I asked ChatGPT for some language and sharpened it to something like:

Dataset collections are designed to streamline the analysis of large numbers of datasets by grouping them together into a single, manageable entity. Unlike generic folders on your computer, which can hold any mix of file types, dataset collections are specifically intended to be homogeneous. This homogeneity is crucial for consistency in processing. Homogeneous datasets ensure that each dataset in the collection can be processed uniformly with the same tools and workflows. This eliminates the need for individual adjustments, which can be time-consuming and prone to error. Most tools and workflows in Galaxy are designed to operate on collections of similar data types. Homogeneous collections allow these tools to operate uniformly over the collection.
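For illustration, the suggested warning wording could be produced by something like the sketch below. The `mixedFormatWarning` function and its parameters are hypothetical names for this example and are not taken from the Galaxy codebase; only the message text comes from the suggestion above.

```typescript
// Hypothetical helper: returns the suggested warning when formats differ,
// or null when the collection is uniform (or empty).
function mixedFormatWarning(formats: string[], collectionType: string): string | null {
    const distinct = new Set(formats);
    if (distinct.size <= 1) {
        return null; // nothing to warn about
    }
    // Wording taken from the suggestion in this thread; {collectionType} is filled in.
    return `Datasets have differing formats, generally ${collectionType}s should contain datasets of all the same type.`;
}

console.log(mixedFormatWarning(["fastqsanger", "bam"], "list"));
console.log(mixedFormatWarning(["bam", "bam"], "list"));
```

Returning `null` for the uniform case lets the caller skip rendering the warning entirely, and the "Why?" hover text can be attached separately.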

ahmedhamidawan · 2025-01-16T17:55:45Z

Cool, very helpful, thanks! I will try some of this stuff out in the linked PR.

jmchilton added kind/enhancement area/UI-UX area/dataset-collections labels Dec 6, 2024

ahmedhamidawan self-assigned this Dec 10, 2024

ahmedhamidawan mentioned this issue Jan 13, 2025

[24.2] Show message for mixed extensions in collection creator #19404 (Draft, 4 tasks)
