-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Same-named local subrepos with different origins cannot be converted due to ambiguity in the mapping #188
Comments
When extending the mapping file format, be sure to also handle the case when a subrepo is changed without a change in the current branch. I'm thinking of a case when, for example, you have been using an internally generated conversion of a Subversion repo, but when upstream switched to Mercurial, you started using that repo directly. |
I'm making my way through this. It's doable without too much surgery so far as I can tell. I have @frej Do you have an example repo for the scenario you describe? |
@jdanielp: Sorry, no I don't have an example. Another thing, make sure that you stay backwards compatible so that existing incremental setups are not broken. |
I was able to get a solution in place this morning. That solution works for the example repository attached on this issue. Then I realized that I had to get my validation script working so I could validate bigger repos automatically rather than manually spot checking a few commits. I have that script working now (hopefully), but it's late, so I will verify my changes work Wednesday. The basic idea of my changes is that I allow the user to specify a json file for With json, the user can define a per-branch mapping in a file format that lends itself to nested dicts. For example, here is the mapping file for the srt_ambiguous example repo:
I did something special in that file. You'll notice that the first entry is an empty string for the branch name. That's intentional. A user might have only one branch that uses a different url. Therefore, I defined "" to be the fallback. If a branch name does not exist in the keys for the sub-dict Also, you don't have to have a mapping per branch: A made-up example:
Any consumers of Edit: I misspoke on Unless I did something completely boneheaded, this should all be backwards compatible with the old mapping format and behavior. If the user uses anything but a .json, it will use the old I'm not sure what you mean by 'existing incremental setups'. I'm asking to make sure I don't ignore or misunderstand. Do you mean scripts that used to work before should still work now? Then yes, my changes will definitely accommodate. One last thing I remembered - I'll obviously need to update the README-SUBMODULES.md once we've agreed on the strategy. I'll hold off on that until we get the code in a good place. I know this is alot of info. I should probably have made you a PR to look at (if only just for the preliminaries). I'm a newbie with PRs or I'd just do one now. I'm assuming I would just keep updating the branch for the PR as we go back and forth. Like I said - total noob when it comes to PRs. I need to go slow so I don't do something stupid this late at night. :) Off to bed. |
@MokhamedDakhraui and @johannesc I would appreciate your input on this issue as I expect you to have a better grasp of this part of the code than I have.
Exactly, I don't want to break someones setup that has been running for years just because fast-export is updated, having aquired a new feature. [The following depends on my assumption that what's stored in the super Mercurial repo when a sub-repo is updated, is the hash of the sub-repo. If that's not true, just tell me I'm wrong] During conversion we write the |
I think using json as an additional format should work and be backwards compatible. Note that you should always use relative paths to the subrepos, otherwise you will end up with the absolute path/url in the git history. So your example should probably be:
Another option is to use yaml which I find a little easier to hand write. Your example could then look like:
Parsing yaml in python is as easy as parsing json. |
@johannesc Thanks for the feedback. I will have a look at yaml instead. Actually, your comment addresses another burning question I had about the mappings. I realize we're going on a tangent,but I'll try to be quick. I assume relative paths mean I always have to keep a copy of the submodules next to my repo, e.g. in ~/repos/ I have srt_ambiguous, another_subrepo, srt_suba, and srt_subb. Currently we work directly in srt_ambiguous alone without the need for subrepos alongside. Is keeping a copy adjacent to the top level repo a common practice with git submodules or is this a requirement for this type of conversion? The paths couldn't be urls to bitbucket repos since fast-export looks in those actual folders for the hg2git-* files, right? @frej I know I originally thought I'd have to mess with the format of the hg2git-* files, but I did not in my current implementation. I know I thought about it, but I don't remember why I did not. I think the assumption was that the the sha1 would not collide between different branches and so it does not matter so long as the Unfortunately, I have to put this one on pause for another day or so. I am investigating another issue in my converted work repo that has me scratching my head. Looks like we have some commits that change hgsubstate that refer to hashes I don't see in the mercurial repo. Not sure what is happening there. Not sure if related to #187 or not, but I doubt it. I'll keep pressing forward on this one as soon as I figure out what's going on with the head-scratcher. |
@jdanielp When working with fast-export you need a local copy of the subrepos. Once you migrate to git they will be "hidden" inside the .git/modules/ when you clone a repository (if you only checkout a version where the submodule is removed the submodule will not be downloaded). Using relative url means that you can move your repository from bitbucket to github to sourceforge without any trouble as long as you move all the repositories. I hope I made things more clear to you! |
@johannesc You made things much clearer. And you taught me about the existence of .git/modules; I did not know about that. Thank you. @frej I will not be able to work on this again until Monday unfortunately. I need to take Fri-Sun for some personal time. I will be back in earnest on this (#188) issue on Monday. The good news is that I seem to have worked out all the kinks in my validation script and it's currently running (slowly) on the big work repo. That means I should have a way to validate my changes for this issue. I will ask my boss about the possibility of "donating" that backup script to this project. That way we at least have a way to run regression tests on new changes. Regards to all. Sorry for the long posts and not much work. :) |
@frej Sorry to leave you hanging on this. Some things have come up at work that have pulled me off this issue. I've confirmed my fix will convert the repo successfully, but I don't know how future-proof the fix is. At least the good news is that my validation script has verified the conversion with my fix is successful. I hope to be back on this in a week or two. Unfortunately, I cannot be sure of the exact timing due to the other work issues. If you want me to put a patch here for someone else to look at, let me know and I will do that. |
I hope to be back on this in a week or two. Unfortunately, I cannot
be sure of the exact timing due to the other work issues. If you want
me to put a patch here for someone else to look at, let me know and I
will do that.
No worries, let it take the time it takes.
|
Please see attached an example repo and mapping file that cannot be converted with the given submodule mapping technique.
Top level repo is srt_ambiguous. In v1 and v2 branches there is a srt_subx subrepo. But, they have different origins per branch. in branch v1 srt_subx corresponds to srt_suba repo. In v2 branch srt_subx refers to srt_subb repo.
I suspect the mapping file convention will need to be extended to define per-branch mappings. As with issue #187, I am not familiar-enough with the inner workings of hg2git and hg-fast-export(.sh|.py) to hack this myself.
ambiguous.zip
The text was updated successfully, but these errors were encountered: