CI Tweaks, Fix #38, and Single Cache #37
Qualify job names so they are easier to read in PRs.
I have a crazy idea. Cache keys are timestamped after the prefix we give them, and when fetching a cache, the newest one is selected among the prefix-matching candidates. Why don't we just use a very plain prefix (just OS-toolname, i.e. linux-minion, macos-chuffed, etc.)? We could extend it to OS-compilername-toolname etc. if needed. Basically, only mention something in the cache key if there is a specific entry for it in the build matrix, plus the name of the thing we are building. AFAICS the only thing we lose is this: if we have version A of something in build 1, then B in build 2, then A again in build 3, the 3rd build won't be able to find the 1st cache, since keys are not content-prefixed, just timestamped. But this is quite unlikely in practice anyway?
To be very explicit, yes you read it right: I am suggesting we remove all |
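To make the proposal concrete, here is a minimal Python sketch of the lookup behaviour described above: among saved caches whose key starts with a plain prefix, the newest one wins. The `CacheEntry` type, the `restore_by_prefix` helper, the key suffixes, and the timestamps are all hypothetical, chosen only for illustration; this is a simplified model, not GitHub's implementation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CacheEntry:
    key: str         # full key the cache was saved under (hypothetical format)
    created_at: int  # creation time; larger means newer (hypothetical unit)


def restore_by_prefix(entries: Iterable[CacheEntry], prefix: str) -> Optional[CacheEntry]:
    """Return the newest entry whose key starts with `prefix`, or None on a miss."""
    candidates = [e for e in entries if e.key.startswith(prefix)]
    return max(candidates, key=lambda e: e.created_at, default=None)


# Hypothetical saved caches using plain OS-toolname prefixes.
entries = [
    CacheEntry("linux-minion-build1", created_at=100),
    CacheEntry("linux-minion-build2", created_at=200),
    CacheEntry("macos-chuffed-build1", created_at=150),
]

newest_minion = restore_by_prefix(entries, "linux-minion")
missing = restore_by_prefix(entries, "windows-minion")
```

Under this model, the A, B, A scenario plays out as described: build 3 receives the cache from build 2 (the newest), even though build 1's cache matches its contents better.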
I agree - the newest cache on a branch would always be used by GitHub, and we don't need cache invalidation. Given the caching rules:
My guess is that if (e.g.) I am working on a PR based on an old Cargo.lock file, it would find an older cache from main with that Cargo.lock file and not use the latest one. However, if I had a new Cargo.lock file in my PR, then it would use the latest one that matches nightly-${{ runner.os }}. If we did not have the first key, every PR using different versions would change the cache, another PR would change it back, and so on. Whether this matters at this scale is another question (https://stackoverflow.com/a/77194233).
Hashes are good in general because they allow you to fetch by content as opposed to just the most recent. I am just thinking it's very unlikely that we get an exact match. Also, if the second hashFiles matches in your example but the earlier submodule_sha's do not, we have a cache miss anyway, don't we? Maybe we can just try without the hashes and see how it behaves.
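The trade-off discussed here (exact content-hashed key first, then a plain prefix as fallback) can be sketched roughly as follows. This is a simplified model under stated assumptions: `resolve_cache` is a hypothetical helper, the keys `nightly-Linux-<hash>` are made-up stand-ins for a Cargo.lock hash, and branch scoping is ignored.

```python
from typing import Dict, List, Optional, Tuple


def resolve_cache(
    saved: Dict[str, int], key: str, restore_keys: List[str]
) -> Tuple[Optional[str], bool]:
    """Simplified lookup: exact key first, then each fallback prefix in order.

    `saved` maps a cache key to its creation time (larger means newer).
    Returns (matched_key, exact_hit); matched_key is None on a full miss.
    """
    if key in saved:
        return key, True
    for prefix in restore_keys:
        matches = [k for k in saved if k.startswith(prefix)]
        if matches:
            # Among prefix matches, pick the most recently created cache.
            return max(matches, key=lambda k: saved[k]), False
    return None, False


# Hypothetical caches saved from main; the suffix stands in for a lock-file hash.
saved = {"nightly-Linux-aaa": 1, "nightly-Linux-bbb": 2}

old_lock = resolve_cache(saved, "nightly-Linux-aaa", ["nightly-Linux-"])
new_lock = resolve_cache(saved, "nightly-Linux-ccc", ["nightly-Linux-"])
other_os = resolve_cache(saved, "nightly-macOS-aaa", ["nightly-macOS-"])
```

In this model, a PR on the old lock file gets the older but exactly matching cache, while a PR with a new lock file falls back to the newest cache under the plain prefix.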
A quick question: do we need to vendor genindex.py, or can we set it up in a different way? I couldn't see a link to the source of this file either (there is a name and a license at the top, but I haven't been able to find the source from this info yet).
Oh, it's this one of course: https://github.com/glowinthedark/index-html-generator/ (you linked to it from #36). There are no releases, packages, etc., so short of fetching the git repo every time I don't see any other option. We should add the source link clearly somewhere, though, so we can check for newer versions in the future.
I'll try this. My assumption was that we could have one cache that all jobs update, giving us a really complete blob of the entire project for future CI, but it seems like caches are read-only and can't be updated.
I imagine that, with that limitation, the way to do a full cache properly is to have a full-build CI job for the entire project, which keeps a full cache of the whole project at every commit on main; PRs can then use that cache.
I don't think I follow this, but you have been putting a lot of effort into this, so maybe you should set it up in the best way according to your understanding and we can review after that? OK, a short version: I thought we could have action A fetch from the cache and push to the cache once finished, and this could then (as long as the keys match) be fetched by action B later, etc. Of course, if they run at the same time they won't benefit from each other's cache updates, but that is inevitable. Is this not the case?
This is what I was hoping for. However, action A could build only minion and not chuffed, and then B wouldn't be able to update the cache with chuffed, as the cache is read-only, and so on. Centralising cache building makes this a bit clearer in my mind: a cache is built for the entire project on main, then consumed by actions. This centralised cache forces a build of everything in the workspace, so it should be complete. Timestamped caches could also solve this somewhat.
Tbh I am not quite sure what's going on or what's best anymore, but I think it's better than it was before, so it could be merged regardless and we can revisit this in the future. I've got 3099 deadlines soon, so it might be best to get this in now in a good but not perfect state and then revisit it, as opposed to the PR sitting for two weeks...
In particular, all actions just fetch the full build of the project from the last commit on main and do stuff based on that.
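The read-only limitation and the centralised alternative can be illustrated with a small Python model. Everything here is an assumption made for illustration: `ImmutableCacheStore`, the key names, and the idea that a save under an existing key is silently skipped are a simplified stand-in for the behaviour described in this thread, not an actual API.

```python
from typing import Dict, FrozenSet, Iterable, Optional


class ImmutableCacheStore:
    """Toy model of a cache where an existing key can never be overwritten."""

    def __init__(self) -> None:
        self._entries: Dict[str, FrozenSet[str]] = {}

    def save(self, key: str, contents: Iterable[str]) -> bool:
        """Store contents under key; return False if the key already exists."""
        if key in self._entries:
            return False  # read-only once written: the update is skipped
        self._entries[key] = frozenset(contents)
        return True

    def restore(self, key: str) -> Optional[FrozenSet[str]]:
        """Return the stored contents, or None on a cache miss."""
        return self._entries.get(key)


store = ImmutableCacheStore()

# Per-job caching: action A builds only minion and saves first.
saved_by_a = store.save("linux-shared", {"minion"})
# Action B restores, adds chuffed, but cannot update the same key.
saved_by_b = store.save("linux-shared", {"minion", "chuffed"})

# Centralised caching: one full build on main saves everything under a
# fresh key (here a hypothetical per-commit suffix), which PR jobs only read.
store.save("linux-main-commit1", {"minion", "chuffed"})
full_cache = store.restore("linux-main-commit1")
```

In this sketch the shared key stays stuck at whatever action A built, while the cache written once by the full build on main contains every tool.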
looks good!
Thanks!
As per #36.
This PR adds / will add: