Multiome: validation error messages are "lost in all the output" #1277

Open
brianraymor opened this issue Mar 4, 2025 · 0 comments
Labels
curation software tech Tech issues that do not require product prioritization. Tech debt, tooling, ops, etc.

Comments

@brianraymor
Contributor

Context

Reported by @brian-mott on sci-data-eng

  • Stdout, logging, and error output is very busy. I see that the dask cluster has a parameter to report only at the error level, but I still get info messages as well. The actual validation errors get lost in all the output.

  • For the final error messages, it would be very helpful to have them printed at the end of all the logging/message output, one at a time. Currently, there is an ERROR log message followed by a second ERROR message containing just the list of errors (mostly None entries), and more log output follows them:

WARNING:distributed.shuffle._scheduler_plugin:Shuffle 11e5ccfaf3e21e6743b6ff97807ed253 initialized by task ('shuffle-transfer-11e5ccfaf3e21e6743b6ff97807ed253', 20) executed on worker tcp://127.0.0.1:46761
WARNING:distributed.shuffle._scheduler_plugin:Shuffle 11e5ccfaf3e21e6743b6ff97807ed253 deactivated due to stimulus 'task-finished-1741116843.6624162'
ERROR:cellxgene_schema.atac_seq:Errors found in Fragment and/or Anndata file
ERROR:cellxgene_schema.atac_seq:[None, None, None, None, None, None, 'Anndata.obs.is_primary_data must all be True.', 'Fragment file has duplicate rows.']
INFO:distributed.scheduler:Remove client Client-9d954f19-f92f-11ef-b124-060821df02b7
INFO:distributed.core:Received 'close-stream' from tcp://127.0.0.1:44572; closing.
INFO:distributed.scheduler:Remove client Client-9d954f19-f92f-11ef-b124-060821df02b7
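
One possible direction for both requests, as a minimal sketch rather than the actual cellxgene_schema implementation: raise the level on the parent `distributed` logger so its INFO and WARNING records are dropped, and print the non-empty validation errors one per line as the very last output. The helper names `quiet_loggers` and `format_error_summary` are hypothetical and do not exist in the validator today. Only standard-library `logging` calls are used, and the level change applies only to loggers in the current process.

```python
import logging


def quiet_loggers(names=("distributed",), level=logging.ERROR):
    """Raise the threshold on noisy loggers (hypothetical helper).

    Child loggers such as distributed.scheduler inherit the level
    unless they have been given an explicit level of their own.
    """
    for name in names:
        logging.getLogger(name).setLevel(level)


def format_error_summary(errors):
    """Drop empty entries (e.g. None) and return one numbered line per error."""
    real = [e for e in errors if e]
    if not real:
        return "Validation passed: no errors found."
    lines = [f"Validation failed with {len(real)} error(s):"]
    lines += [f"  {i}. {msg}" for i, msg in enumerate(real, start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    quiet_loggers()
    # Error list copied from the log output in this issue.
    errors = [
        None, None, None, None, None, None,
        "Anndata.obs.is_primary_data must all be True.",
        "Fragment file has duplicate rows.",
    ]
    # Emit the summary last, after the dask client has been closed,
    # so nothing else is printed after it.
    print(format_error_summary(errors))
```

Whether the summary should go to stdout, stderr, or a dedicated logger is a design choice left open here; the point is only that empty entries are filtered out and the real errors come last.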
@brianraymor added the labels curation software tech (Tech issues that do not require product prioritization. Tech debt, tooling, ops, etc.) on Mar 5, 2025