Optimize Repository Forking Process #104

Open
gounthar opened this issue Jul 17, 2024 · 12 comments
Labels
enhancement: For changelog: Minor enhancement. Use `major-rfe` for changes to be highlighted
good first issue: Good for newcomers

Comments

@gounthar
Collaborator

gounthar commented Jul 17, 2024

What feature do you want to see added?

Current Situation

When forking a repository on GitHub, our current process retrieves all branches from the original repository.

Issue

This approach may be inefficient and unnecessary for our use case.

Desired Outcome

We aim to fork only the default branch of the repository, as this is typically sufficient for our work.

GitHub GUI Option

The GitHub web interface provides an option to fork only the default branch (usually the main branch).

JGit Consideration

It's unclear whether JGit, the Java library we use for Git operations, offers a similar option to retrieve only the default branch.

Proposed Solution

  1. Investigate if JGit provides functionality to fork only the default branch.
  2. If available, implement this feature in our forking process.
  3. If not available through JGit, explore alternative methods to achieve this goal.
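As one possible alternative method, the Git CLI can restrict a clone to a single branch with `git clone --single-branch --branch <name>`. The sketch below only builds that command line from Java and does not run it. The class name, method name, URL, and directory are hypothetical and not part of this project. Note that this limits what is cloned locally; it does not change which branches a fork on GitHub contains.

```java
import java.util.List;

/** Hypothetical helper: builds (but does not run) a single-branch git clone command. */
final class SingleBranchCloneSketch {

    /** Returns the argument list for: git clone --single-branch --branch <branch> <url> <dir>. */
    static List<String> cloneCommand(String repoUrl, String branch, String targetDir) {
        return List.of(
                "git", "clone",
                "--single-branch",   // fetch only the history of one branch
                "--branch", branch,  // the branch to fetch and check out
                repoUrl, targetDir);
    }

    public static void main(String[] args) {
        // Placeholder URL and directory, used for illustration only.
        List<String> cmd = cloneCommand(
                "https://github.com/example-org/example-plugin.git", "main", "example-plugin");
        // To actually run it: new ProcessBuilder(cmd).inheritIO().start().waitFor();
        System.out.println(String.join(" ", cmd));
    }
}
```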

Benefits

  1. Efficiency: Reduces unnecessary data transfer and storage.
  2. Simplicity: Maintains a cleaner repository structure in our forks.
  3. Focus: Aligns with our primary need of working with the main branch.

Next Steps

  1. Research JGit documentation and community resources for potential solutions.
  2. If no direct JGit solution exists, consider implementing a custom approach to achieve the desired outcome.
  3. Test and validate any new implementation to ensure it doesn't introduce unforeseen issues.

Conclusion

Optimizing our forking process to retrieve only the main branch will streamline our workflow and improve overall efficiency. This change aligns with best practices for managing forks when only the main branch is needed for development or analysis purposes.

Upstream changes

No response

Are you interested in contributing this feature?

No response

@gounthar gounthar added the enhancement For changelog: Minor enhancement. use `major-rfe` for changes to be highlighted label Jul 17, 2024
@lemeurherve
Member

lemeurherve commented Jul 17, 2024 via email

@gounthar
Collaborator Author

Yes indeed.

@jonesbusy jonesbusy added the good first issue Good for newcomers label Sep 7, 2024
@SohamJuneja
Contributor

@gounthar I looked through the JGit docs and community forum, and this can be done with JGit. It offers an option like this:
CloneCommand.setCloneAllBranches(false)

@gounthar
Collaborator Author

gounthar commented Dec 3, 2024

Thanks @SohamJuneja. 👍
I think @jonesbusy implemented it since then. 🤔

@SohamJuneja
Contributor

@gounthar Oh, I didn't check that 🥹. Why is the issue still open then? Is there anything left to do here?

@gounthar
Collaborator Author

gounthar commented Dec 3, 2024

My bad, I just launched the tool on pipeline-graph-view, and it forked with the 12 branches. 🤷
https://github.com/gounthar/pipeline-graph-view-plugin

@SohamJuneja
Contributor

I guess there's a simple fix: just add one line to GHServiceTest.java, alongside the other clone command stubs:
doReturn(cloneCommand).when(cloneCommand).setCloneAllBranches(false);

@jonesbusy
Collaborator

This is not implemented.

For example, we still fork the repo even if there are no changes to push.

So we should fork only right before pushing changes (and change the remote before pushing).
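The ordering described here could be sketched as follows. This is a minimal illustration of the control flow only: `RepoOps`, `publish`, and `RecordingOps` are hypothetical names introduced for the example, not types from this project, JGit, or hub4j/github-api.

```java
import java.util.ArrayList;
import java.util.List;

final class LazyForkSketch {

    /** Hypothetical abstraction over the GitHub and Git operations the tool performs. */
    interface RepoOps {
        boolean hasChangesToPush();
        void fork();
        void pointRemoteAtFork();
        void push();
    }

    /** Forks only when there is something to push; returns true if a push happened. */
    static boolean publish(RepoOps ops) {
        if (!ops.hasChangesToPush()) {
            return false; // nothing to push, so no fork is created
        }
        ops.fork();              // create the fork as late as possible
        ops.pointRemoteAtFork(); // switch the remote from upstream to the fork
        ops.push();
        return true;
    }

    /** Test double that records which operations were invoked, in order. */
    static final class RecordingOps implements RepoOps {
        final List<String> calls = new ArrayList<>();
        private final boolean changes;

        RecordingOps(boolean changes) { this.changes = changes; }

        public boolean hasChangesToPush() { return changes; }
        public void fork() { calls.add("fork"); }
        public void pointRemoteAtFork() { calls.add("pointRemoteAtFork"); }
        public void push() { calls.add("push"); }
    }
}
```

With this shape, a run that produces no changes never touches the fork API at all.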

@jonesbusy
Collaborator

> My bad, I just launched the tool on pipeline-graph-view, and it forked with the 12 branches. 🤷

This is another optimization, I guess. When forking, we probably need to sync only the default branch. I'm not sure whether that is implemented in https://github.com/hub4j/github-api

The GitHub REST API allows it through the default_branch_only parameter: https://docs.github.com/en/rest/repos/forks?apiVersion=2022-11-28#create-a-fork

So if it's not available in hub4j/github-api, this must be fixed upstream.
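For reference, a direct call to that REST endpoint could look roughly like the sketch below, using only the JDK's `java.net.http` types. It builds the request without sending it. The class and method names are hypothetical, and the owner, repository, and token values are placeholders; whether hub4j/github-api exposes an equivalent option is not confirmed here.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical sketch: builds a "create a fork" request limited to the default branch. */
final class ForkRequestSketch {

    /** JSON body asking GitHub to include only the default branch in the fork. */
    static final String FORK_BODY = "{\"default_branch_only\":true}";

    static HttpRequest buildForkRequest(String owner, String repo, String token) {
        // POST /repos/{owner}/{repo}/forks
        URI uri = URI.create("https://api.github.com/repos/" + owner + "/" + repo + "/forks");
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/vnd.github+json")
                .header("Authorization", "Bearer " + token)
                .header("X-GitHub-Api-Version", "2022-11-28")
                .POST(HttpRequest.BodyPublishers.ofString(FORK_BODY))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder values, used for illustration only.
        HttpRequest request = buildForkRequest("example-org", "example-plugin", "<token>");
        // To send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(request.method() + " " + request.uri());
    }
}
```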

@SohamJuneja
Contributor

@jonesbusy So, the JGit solution I mentioned won't work here?

@jonesbusy
Collaborator

> @jonesbusy So, the JGit solution I mentioned won't work here?

What do you mean? I would suggest studying the current solution and understanding everything that can be optimized, such as branches, forking, etc.

This issue is just a placeholder for all optimizations related to repository forking, cloning, etc.

@SohamJuneja
Contributor

Thank you for pointing that out. I'll take a deeper look at both hub4j/github-api and the GitHub REST API for optimizations, including the default_branch_only feature. My initial question was whether the JGit solution (setCloneAllBranches(false)) could be applied or adapted for use with hub4j/github-api, or whether the REST API is a better direction.

4 participants