-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Support for Local and Remote Repositories #179
Conversation
I’m planning to update the tracing method based on these settings next, just to be sure it aligns with the developers' preferences, and also, to avoid complicating the review process. Looking forward to your feedback. relate issue: #153 |
When I use the Local Repo in this code, It doesn't work as I expected. It seems that the repo = LeanGitRepo("lean4-example", "7b6ecb9ad4829e4e73600a3329baeb3b5df8d23f")
theorem = Theorem(repo, "Lean4Example.lean", "hello_world")
with Dojo(theorem) as (dojo, init_state):
print(init_state) When using |
These are not implemented yet.
I have only modified the |
The basic idea is to replace the GitHub object with a git object. Additionally, we need to change the tracing process, which currently retrieves information via the GitHub API, to a direct way. There are intricate details to adjust, and it may take some time to finalize everything. For now, I've marked it as a draft PR. |
I am updating the tracing and interaction code by python debugging and accompanying tests. For the sake of a smoother review process, it may be prudent to handle the integration in smaller, manageable segments. If it isn't too cumbersome, we can also review everything after all updates are made. I will periodically write tests to ensure compatibility and prevent the introduction of bugs. |
Hi @Iheadx @darabos , please take a look at this. ;) An example of the test: repo = LeanGitRepo(url=LOCAL_REPO_PATH, commit="main")
assert repo.repo_type == 'local'
theorem = Theorem(repo, "Lean4Example.lean", "hello_world")
# initial state
dojo, state_0 = Dojo(theorem).__enter__()
assert state_0.pp == 'a b c : Nat\n⊢ a + b + c = a + c + b'
# state after applying a tactic
state_1 = dojo.run_tac(state_0, "rw [add_assoc]")
assert state_1.pp == 'a b c : Nat\n⊢ a + (b + c) = a + c + b'
# state after applying the 'sorry' tactic
assert dojo.run_tac(state_1, "sorry") == ProofGivenUp()
# concluding the proof
final_state = dojo.run_tac(state_1, "rw [add_comm b, ←add_assoc]")
assert isinstance(final_state, ProofFinished) All test cases have passed on my machine, ubuntu22. ![]() Network error for one case: ![]() |
Thank you. This is really helpful! I'll be able to take a look later this week or in the weekend! |
hi, @RexWzh Thanks for your contribution! |
Yes @LuoKaiGSW , you can use a local Git repository by passing the path in the See test examples here. import os
# Avoid using remote cache
os.environ['DISABLE_REMOTE_CACHE'] = 'true'
lean4_example_url = 'https://github.com/yangky11/lean4-example'
local_repo_path = "lean4-example"
clean_clone_and_checkout(lean4_example_url, local_repo_path, 'main')
local_repo = LeanGitRepo(url=local_repo_path, commit="main")
from_path_repo = LeanGitRepo.from_path(local_repo_path) In general, you don't need to connect to GitHub at all when working with repositories of type However, if the lean version cannot be obtained from the config = self.get_config("lean-toolchain")
toolchain = config["content"]
m = _LEAN4_VERSION_REGEX.fullmatch(toolchain.strip())
if m is not None:
lean_version = m["version"]
else:
lean_version = get_lean4_version_from_config(toolchain) A note on repository types: The
|
hi, @RexWzh Thanks for your reply!
error
system info |
I've tested on both mac and Ubuntu and couldn't reproduce the issue you mentioned. Please ensure that your cloned repository is up-to-date. Mac: ❯ sw_vers
ProductName: macOS
ProductVersion: 14.5
BuildVersion: 23F79
❯ ls
LICENSE docs lean4-example pyproject.toml src
README.md images mypy.ini scripts tests
❯ python
Python 3.10.14 (main, May 6 2024, 14:47:20) [Clang 14.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lean_dojo import LeanGitRepo
>>> LeanGitRepo.from_path('lean4-example')
LeanGitRepo(url='/Users/wangzhihong/workspace/LEAN/LeanDojo/lean4-example', commit='7b6ecb9ad4829e4e73600a3329baeb3b5df8d23f')
>>> LeanGitRepo('lean4-example', 'main')
LeanGitRepo(url='/Users/wangzhihong/workspace/LEAN/LeanDojo/lean4-example', commit='7b6ecb9ad4829e4e73600a3329baeb3b5df8d23f')
>>> Ubuntu: ❯ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.4 LTS
Release: 22.04
Codename: jammy
❯ python
Python 3.10.14 (main, May 6 2024, 19:42:50) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from lean_dojo import LeanGitRepo
>>> LeanGitRepo('lean4-example', 'main')
LeanGitRepo(url='/home/cuber/zhihong/LeanPorjects/lean4-example', commit='7b6ecb9ad4829e4e73600a3329baeb3b5df8d23f')
>>> LeanGitRepo.from_path('lean4-example')
LeanGitRepo(url='/home/cuber/zhihong/LeanPorjects/lean4-example', commit='7b6ecb9ad4829e4e73600a3329baeb3b5df8d23f') |
I encounter the following error when testing this PR. Starting from an empty cache, I ran: from lean_dojo import *
repo = LeanGitRepo("https://github.com/leanprover-community/mathlib4", "29dcec074de168ac2bf835a77ef68bbe069194c5")
trace(repo) It successfully traced the repo but saved the traced repo to Then, I exited Python, started a new session, and ran the above code again. Instead of re-using the traced repo in the cache, it started to trace the repo again. |
I tried tracing https://github.com/yangky11/lean4-example and had a similar issue. ![]() |
Thanks for pointing that out. I noticed the issue, and it is fixed in the commit use tempdir for local type & resolve cache problem. # Remove cache directory
import os
os.environ['NUM_PROCS'] = '128'
os.environ['DISABLE_REMOTE_CACHE'] = 'true'
from lean_dojo import LeanGitRepo, trace, Dojo, Theorem
repo = LeanGitRepo('/home/cuber/zhihong/LeanPorjects/leandojo/lean4-example', 'main')
trace_repo = trace(repo) The cache file is located at: ❯ ls ~/.cache/lean_dojo
leandojo-lean4-example-7b6ecb9ad4829e4e73600a3329baeb3b5df8d23f However, I did not run tests for this commit and later ones. I believe it would be more effective to split these changes into smaller PRs. |
I would like to know which branch can be used to load the local git repository. My current branch situation is as follows, and I am still encountering the same error, Thank you!
|
elif os.path.exists(parsed_url.path): | ||
return "local" | ||
else: | ||
logger.warning(f"{url} is not a valid URL") | ||
|
||
@cache | ||
def get_repo_info(path: Path) -> Tuple[str, str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with working_directory(path)
is still needed in this function? There is no guarantee the CWD will be the same as path
when calling it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got the same error as @LuoKaiGSW and think this is the reason.
Thanks a lot @RexWzh. It works and I can trace the local repo! I've decided to temporarily use this version of the code. |
@LuoKaiGSW Thanks for assisting with testing the commit. Please note that this branch is currently under development and might be unstable until all the code has been thoroughly reviewed. I'll fix that in later commits. |
Hi @RexWzh, is this PR ready for further review and testing? |
Yes, tests are underway and most of the repositories pass successfully, except for mathlib4, which fails at times due to network issues. |
Got this when running one of the tests:
|
I cannot reproduce the above error reliably (probably a network issue). Running other tests now... |
All other tests were successful. I merged it into |
Merged into |
Hi,
Thanks for the awesome project!
This PR enhances the
LeanGitRepo
class to support local and remote repositories, introducing arepo_type
property (github, local, remote), simplifying the_to_commit_hash
method, transitioninglean_version
to a string for clarity, and adding tests for the new repository configurations.A Quick Review
We transition from
pygithub
objects togitpython
for managing repositories.For compatibility, a new property
repo_type
is introduced in theLeanGitRepo
class, supporting three values:github
,local
, andremote
. I've adapted the APIs to accommodate these types:LeanGitRepo.repo
LeanGitRepo.url
It's important to note that while the GitHub object connects to an online repo, the Git object interacts directly with a local git directory.
Please let me know if any further adjustments are needed.