Enhancements to feedstocks' verification and validation workflows #8

Open
3 tasks
jaimergp opened this issue Dec 6, 2022 · 3 comments

Comments

jaimergp commented Dec 6, 2022

Primarily focused on performance and reliability.

📌 Summary

Improve the verification and validation workflows that run as artifacts are moved from cf-staging to conda-forge.

📝 Background

conda-forge feedstocks are repositories equipped with the build machinery provided by conda-smithy.

When a PR is merged to a feedstock branch, the resulting conda packages are not uploaded directly to conda-forge.
They are first placed in a staging channel named cf-staging.
The artifact validation bot, hosted on a Heroku instance, downloads the artifacts and runs some analysis. If the analysis succeeds, the artifacts are copied to the actual conda-forge channel.
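As a rough illustration of the gating flow described above, here is a minimal sketch. All names (`Artifact`, `validate_and_promote`, `copy_to_main`) are hypothetical and not taken from the actual bot; the point is only that an artifact is copied out of staging when every check comes back clean.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Artifact:
    """Hypothetical stand-in for a package sitting in the staging channel."""
    name: str
    feedstock: str
    files: List[str] = field(default_factory=list)


# A check returns a list of problem messages; an empty list means "pass".
Check = Callable[[Artifact], List[str]]


def validate_and_promote(
    artifact: Artifact,
    checks: List[Check],
    copy_to_main: Callable[[Artifact], None],
) -> List[str]:
    """Run every check; copy the artifact to the main channel only if all pass."""
    problems = [msg for check in checks for msg in check(artifact)]
    if not problems:
        copy_to_main(artifact)
    return problems
```

Keeping the checks as plain callables makes it easy to add or drop individual validations without touching the promotion logic.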

The analysis includes checks like:

  • File clobbering: is the package producing files that belong to another package? If so, this is reported as an issue.
  • Package name squatting: is the feedstock producing packages that belong to other feedstocks? If so, the upload is blocked.
    • This check is needed to work around the lack of permission granularity in anaconda.org.
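The two checks listed above could look roughly like the sketch below. It assumes two hypothetical lookup tables that are not part of the original description: `owners` maps a file path to the package that already ships it (one owner per path, for simplicity), and `registry` maps a package name to the feedstock allowed to produce it.

```python
from typing import Dict, List


def find_clobbered_files(
    package: str, files: List[str], owners: Dict[str, str]
) -> Dict[str, str]:
    """Return {path: other_package} for files already owned by a different package."""
    return {
        path: owners[path]
        for path in files
        if path in owners and owners[path] != package
    }


def find_squatted_names(
    feedstock: str, outputs: List[str], registry: Dict[str, str]
) -> List[str]:
    """Return output names that are registered to a different feedstock."""
    return [
        name
        for name in outputs
        if name in registry and registry[name] != feedstock
    ]
```

In this sketch, a non-empty result from `find_clobbered_files` would be reported as an issue, while a non-empty result from `find_squatted_names` would block the upload.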

Depending on the size of the package, this analysis puts additional strain on the already overworked Heroku dyno.

🚀 Tasks / Deliverables

  • Profile load and usage, and decide if the current implementation needs performance improvements
  • Consider hardware alternatives beyond a single machine: a multi-worker approach with auto-scaling
  • Perform a risk analysis and decide whether other validation checks are needed once server load stops being a bottleneck

📅 Estimated completion

This task should be finished in the first 18 months.

ℹ️ References

jaimergp commented Feb 22, 2024

Update: conda-forge/artifact-validation is now archived. We need to revisit how validation is set up now.

Update 2: Artifact validation is currently disabled. We need to gather more context on why, and on whether bringing it back is realistic.

@jaimergp

> File clobbering: is the package producing files that belong to another package? This is reported as an issue

This will probably need #54 to provide easy file-ownership checks. Note that some packages do need to clobber others (e.g. different variants of the same interface, such as BLAS implementations), but there is usually metadata we can use to infer that (run_constrained, a common mutex, etc.).
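A minimal sketch of the "common mutex" idea, offered as a heuristic only: two packages are treated as intentionally exclusive when they pin the same mutex package to different build strings. It assumes `depends` entries use the space-separated `name version build` form and that the set of mutex names is supplied by the caller. The function names are hypothetical, and `run_constrained` is not handled here, since evaluating it properly would need real version matching.

```python
from typing import Dict, Iterable, List, Set


def mutex_pins(depends: Iterable[str], mutex_names: Set[str]) -> Dict[str, str]:
    """Map each mutex name to the build string it is pinned to, if any."""
    pins = {}
    for spec in depends:
        parts = spec.split()
        # Only "name version build" specs carry a build pin.
        if len(parts) == 3 and parts[0] in mutex_names:
            pins[parts[0]] = parts[2]
    return pins


def likely_mutually_exclusive(
    meta_a: Dict[str, List[str]],
    meta_b: Dict[str, List[str]],
    mutex_names: Set[str],
) -> bool:
    """True if both packages pin a shared mutex to different builds."""
    pins_a = mutex_pins(meta_a.get("depends", []), mutex_names)
    pins_b = mutex_pins(meta_b.get("depends", []), mutex_names)
    return any(
        name in pins_b and pins_b[name] != build
        for name, build in pins_a.items()
    )
```

Packages that come out as exclusive can never be installed together, so a file overlap between them would not need to be flagged.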

jaimergp moved this from 📋 Backlog to ✋ On hold in czi-conda-forge 📦 Feb 27, 2024
jaimergp modified the milestones: 18 months, 24 months Apr 30, 2024
@jaimergp

With the SQLite dumps in https://github.com/Quansight-Labs/conda-forge-paths, we could now think of a setup where the validation action downloads the dump, queries it locally to detect clobbering, and warns if needed.
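To make the idea concrete, here is a sketch of the local query step. The schema is an assumption: I have not checked the actual layout of the conda-forge-paths dump, so the `paths(path, package)` table and the `find_clobbers` helper are hypothetical and would need adapting to the real database.

```python
import sqlite3
from typing import Dict, List


def find_clobbers(
    conn: sqlite3.Connection, package: str, paths: List[str]
) -> Dict[str, List[str]]:
    """Return {path: [other packages shipping it]} for the given paths.

    Assumes a hypothetical table `paths(path TEXT, package TEXT)`.
    """
    if not paths:
        return {}
    # One "?" per path. Very long lists would need batching to stay under
    # SQLite's limit on bound parameters.
    placeholders = ",".join("?" for _ in paths)
    rows = conn.execute(
        f"SELECT path, package FROM paths "
        f"WHERE path IN ({placeholders}) AND package != ? "
        f"ORDER BY path, package",
        [*paths, package],
    ).fetchall()
    clobbers: Dict[str, List[str]] = {}
    for path, owner in rows:
        clobbers.setdefault(path, []).append(owner)
    return clobbers
```

The validation action would open the downloaded dump with `sqlite3.connect(...)`, pass in the file list of the new artifact, and emit a warning for every path that comes back.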
