Get closest commit #389
Conversation
Thank you for this, I'll have to test it carefully. But quickly: the issue with riverplot is that it was removed from CRAN, as was liger (the package pulling it in), so this is something that cannot be resolved, also outside of Nix. To fix this, riverplot should be listed as a remote so it's pulled from GitHub instead of CRAN.
Here is the DESCRIPTION file of liger at that revision, where riverplot is in the Imports. You either need to use a more recent version that doesn't rely on riverplot anymore, or use an older date for your environment that still includes riverplot.
Using the date 2022-01-16 will likely solve this part of the puzzle.
This is just a toy example and definitely more complex than my other PR, so please review the other one first. Ah, this makes sense. So liger at that time required installation from CRAN, which worked back then, but since it has been removed from CRAN this is now problematic. I thought about it again: wasn't the mistake that I specified it explicitly? Then it makes more sense to use this, right?
Then riverplot at that time (so version 0.10?, still available in the CRAN archive: https://cran.r-project.org/src/contrib/Archive/riverplot/) should be used? However, this then fails with:
This probably has to do with lines 83 ff., which is harmony, similar to #388. However, if I remove harmony and put it into propagatedBuildInputs, I get the following issue with Rhdf5lib, which I cannot solve:
Note: It was really painful to debug using |
There are several things that need to be fixed. I realize that by backporting fixes to rhdf5lib, I actually broke it instead 😄 so now I'm trying to fix that. Then there seems to be an issue with one of the dependencies listing R itself:
As you can see, R is listed where it shouldn't be. There is code in rix to handle this, so I'm surprised it doesn't seem to work in this case, or is not being called at all. I'll need to check this.
This is fixed now, so the remaining issue would be to understand why R is listed as a dependency when it should be removed.
Just out of curiosity: could you link how you fixed rhdf5, and explain why harmony fails?
Check the commit history of today: https://github.com/rstats-on-nix/nixpkgs/commits/2022-01-16/ It's a bit messy, I did it from my phone on the train this morning. For harmony, no idea, I'll have to dig deeper.
Yeah, agree, the Rhdf5 error is gone. So running the following code with this PR:

after removing the double entry of seurat-data and putting harmony into propagatedBuildInputs now fails with:

So probably the same issue as in #390 and not directly related to this PR, because it also occurs with rix before this PR.
Would it be possible to not use gh, but invite users to configure the token instead? Using calls to curl should be ok, right? Adding dependencies to rix is not something we really like to do.
Do the tests pass locally?
I reviewed the git diffs and saw that I accidentally removed remote-pkgs_names_and_refs. Tried to fix it.
So checks look better now. Still, there is some issue with download_all_commits: it always falls back to HEAD, I need to debug this further. But in general: I also created a check for github_path so that it uses the GitHub token if set (validating the token with a regex stolen from here: https://gist.github.com/magnetikonline/073afe7909ffdd6f10ef06a00bc3bc88), and otherwise uses unauthorized API requests. I am still not 100% sure about the limit. Now it's a maximum of 300 commits (I think that's 3 requests). If you don't have your GitHub token set, this already uses 50% of your available requests just for seurat-wrappers. If you set the limit even higher (e.g. to 1000 commits), this could take some time and would cause problems with unauthorized requests.
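The token check described here can be sketched as follows; this is an illustrative Python version (rix itself is R), and the exact patterns are an assumption based on the linked gist: classic personal access tokens look like `ghp_` followed by 36 alphanumeric characters, fine-grained ones like `github_pat_` followed by 82 characters.

```python
import re

# Assumed token formats, per the gist referenced above:
# classic PATs: "ghp_" + 36 alphanumerics
# fine-grained PATs: "github_pat_" + 82 word characters
GITHUB_TOKEN_RE = re.compile(
    r"^(ghp_[A-Za-z0-9]{36}|github_pat_[A-Za-z0-9_]{82})$"
)

def is_valid_github_token(token: str) -> bool:
    """Return True if the string looks like a GitHub personal access token."""
    return bool(GITHUB_TOKEN_RE.fullmatch(token))
```

If the check fails, the code would simply proceed with unauthenticated requests rather than erroring out.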
Wow, curl really caused me a lot of headaches. There was some issue with curl_download leading to this error:

I now use curl_fetch_memory, which seems to work fine in my manual tests. I also realized that I could only see warnings in debugging mode. So when using
Thank you for your efforts! It would be nice to understand why warnings aren’t being shown though. I’ll try to take a look soon. |
I started adding tests. This works fine locally and the tests pass, but if you look at the logs on GitHub you can see that the tests fail. On macos-latest:

Probably this is because Sys.getenv("GITHUB_PAT") does not work with GitHub Actions? BTW, on ubuntu-latest there are several other tests that fail, unrelated to this PR (excerpt):
@b-rodrigues regarding the GitHub token with GitHub Actions: I used
And now the tests also seem to pass.
Now, two failed tests, not caused by this PR:
I also added tests for the download_all_commits function that seem to work. Thereby I detected a bug in download_all_commits (it was fetching 100, not 300, commits, so yeah, testing is useful :)) and tried to increase performance by pre-allocation (with the help of Claude 3.5), although most of the time is spent on the API request, I think.
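The pre-allocation mentioned here follows a common pattern: in R, growing an object inside a loop copies it on every iteration, so allocating the result up front and filling it in place is faster. A minimal illustration of the pattern, in Python for consistency with the other sketches (the function and its inputs are made up for illustration):

```python
def collect_commits(pages, max_commits=300):
    """Fill a pre-allocated result list instead of growing it.

    `pages` is a list of pages, each a list of commit dicts (a stand-in
    for successive API responses); collection stops at max_commits.
    """
    result = [None] * max_commits  # pre-allocate to the known upper bound
    n = 0
    for page in pages:
        for commit in page:
            if n >= max_commits:
                break
            result[n] = commit
            n += 1
    return result[:n]  # trim the unused tail
```

In R the equivalent would be allocating a vector or list of the maximum length once and assigning into it by index inside the loop.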
These tests are failing as expected: the packages were pointing to HEAD, but now point to the latest commit. Also, we need a failsafe for the case that was happening in CI: if no PAT is configured and the API can't be accessed, then the user should be informed.
Actually no, they're now pointing to HEAD whereas they weren't before. So I guess some failsafe is kicking in where the commit is not being used and gets overwritten with HEAD.
Also, if tests are green on main but not in a PR, then it's for sure related to the PR, even if it doesn't look like it 😉 side effects are a bitch!
Well, if no PAT is configured, it will still work, but as an unauthorized request with 50 requests per hour. And there's this message:
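A minimal sketch of the failsafe being discussed, in Python for illustration (the real code is R, and the function name and message wording here are my own): use the token from the environment when present, otherwise fall back to unauthenticated requests and tell the user.

```python
import os

def github_request_headers(env=None):
    """Build GitHub API request headers.

    Uses the GITHUB_PAT token when it is configured; otherwise warns
    that unauthenticated requests have a much lower rate limit.
    """
    env = os.environ if env is None else env
    token = env.get("GITHUB_PAT", "")
    if token:
        return {"Authorization": f"token {token}"}
    print("No GITHUB_PAT set: falling back to unauthenticated API "
          "requests, which are rate-limited much more aggressively.")
    return {}
```

The key point is that a missing token degrades to the slower unauthenticated path with a visible message, instead of failing silently or erroring out.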
Well, but all tests (except lintr) are green. But if you open them, you can still see tests failing.
But okay, you are right, there are no failing tests on main.
Also added a test for resolve_package_commit; now everything should be tested.
Ahh, I now get why they are failing.
Now all tests pass 💯
What I did: download commits until the target date is reached or until 1000 commits are downloaded.
Force-pushed from f24383d to 2669d80.
@b-rodrigues rebasing was also a hassle, I hope I didn't mess anything up. And no tests fail :)
Many thanks, this is really a great addition to the package! |
This is great news. You are welcome. |
I'm in the process of drafting something |
Yes, pretty good, some suggestions here: #408
@b-rodrigues as you know, I really like the automatic handling of remote dependencies. However, I wasn't aware of the caveats at the time we were working on that. In your docs you now clearly explain:
This makes sense. However, I think this is not a good default, as it's quite dangerous.
With this PR, I tried to get the commit that is closest to the date of the given commit hash (only earlier in time, not later).
This actually turned out to be much more complex than I anticipated, but with some help from ChatGPT and Claude Sonnet I have a working solution. However, this involves quite a few changes, so please check carefully. This also needs some formal testing, I think.
All of this naturally relies on the GitHub API. In the beginning I used your `try_download` function, but then ChatGPT told me that it is more efficient to just use `fromJSON`, as it does not require creating files, downloading, deleting etc. Since this requires quite a few API calls, especially if there are many remote dependencies (as in `seurat-wrappers`, which I used for testing) and many commits (especially step 2), I hit the API limit when testing. I thought about changing to `gh`, which automatically uses your GitHub token if I understood correctly (then you have 5000 instead of 60 requests per hour). But that would have involved a new package, so for now I stuck with `fromJSON`. I limited API requests by only downloading 300 commits max. I used the following example for testing:
Algorithm:

Step 1: get the date of the given commit hash. This is done by the new function `get_commit_date`. Pretty simple. In the example I used, it correctly identified the date of commit `5b63c23b211002611b0ccc33da009237d0dbafb6` as `2022-05-25T21:58:55Z`. I used `tryCatch`, and this falls back to the current date.

Step 2: download all commits for each remote dependency if no ref is given. This is the most complex step and can require a lot of API calls. It is done with the new `download_all_commits` function, which is used in the new `resolve_package_commit` function; that in turn is called for each remote dependency in the `lapply` in `get_imports`, yet only if no ref is given. By default only 30 commits are shown per page, so this requires looping through multiple pages, but I limited it to 300 commits because of API limits. This falls back to `HEAD`.

Step 3: get the closest commit for each remote dependency. Get the closest commit (earlier, not later) of each package. This is done in the new function `get_closest_commit`.
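The three steps can be sketched end to end as follows. This is an illustrative Python version (the actual implementation is R, in the new `get_commit_date`, `download_all_commits`, and `resolve_package_commit`/`get_closest_commit` functions); commit data is passed in as plain dicts so no API call is made, and the pagination arithmetic mirrors the 300-commit cap discussed above.

```python
from datetime import datetime

def parse_date(iso: str) -> datetime:
    """Parse the ISO-8601 timestamps returned by the GitHub API."""
    return datetime.fromisoformat(iso.replace("Z", "+00:00"))

def pages_needed(max_commits: int, per_page: int = 100) -> int:
    """Step 2 arithmetic: number of list-commits requests required."""
    return -(-max_commits // per_page)  # ceiling division

def get_closest_commit(commits, target_date):
    """Step 3: among commits at or before target_date, pick the latest.

    `commits` is a list of {"sha", "date"} dicts as collected in step 2;
    returns None when no commit is early enough (the caller would then
    fall back to HEAD).
    """
    candidates = [c for c in commits if parse_date(c["date"]) <= target_date]
    if not candidates:
        return None
    return max(candidates, key=lambda c: parse_date(c["date"]))
```

With the step-1 date `2022-05-25T21:58:55Z` as the target, the commit with the latest date not after it is selected, and anything later is excluded; a 300-commit cap at 100 commits per page works out to 3 requests per dependency.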
Nothing fancy here. So e.g. for the remote dependency monocle I get:
Here's the entire default.nix that I get with the above rix call: default.nix.log

This still fails due to `seurat-disk` appearing twice. This should be fixed with my other PR (#388), although I did not test it yet because I didn't want to mix PRs (maybe you can first review the other PR, then I can also test this). Then only `seurat-data` with commit hash `d6a8ce61ccb21a3b204f194d07009772c822791d` should be present. However, even if I remove the other `seurat-data` entry manually, `nix-build` fails with:

This is then Nix, where I don't really know how to debug this.