This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

Sparse checkout for repository subdirectories #2

Open
GoogleCodeExporter opened this issue May 10, 2015 · 9 comments

@GoogleCodeExporter

Sometimes a very large HG or Git repository contains only one directory that 
one wants to open source, such as a sub-project or a particular library.

As far as I can tell, the only way to do this currently is to write a bunch of 
editors that delete every folder that isn't being open sourced. This is far 
from optimal: fetching the other folders can be an expensive operation, and 
writing scrubbers for very complicated repositories (dozens of folders) for 
every open source project is error-prone. It would be easier to limit a 
repository to a certain path.


Enhancement: Sparse checkout

Include an optional "path" field in the repository JSON object in the config 
file:
"internal": {
  "type": "hg",
  "url":  "file:///home/mbethencourt/work/test_repos/hg_0",
  "project_space": "internal",
  "path": "subproject/libraryname"
},

* Without a "path", MOE falls back to the old behavior (operating on the entire repo).
* With a "path", MOE tries to use the client's sparse checkout feature. A "renamer" 
translator will probably still be needed to make the layout match the external layout.
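
To make the proposed fallback concrete, here is a minimal sketch in Python of how a config reader might decide between the two behaviors. The function name `checkout_scope` and the returned tuples are hypothetical and only illustrate the proposal; they are not part of MOE.

```python
import json


def checkout_scope(repo_config):
    """Decide how much of a repository to fetch (hypothetical helper).

    Returns ("full", None) when no "path" is configured, which is the
    old behavior, or ("sparse", path) when a "path" is present.
    """
    # Treat a missing, empty, or "/" path as "no path configured".
    path = (repo_config.get("path") or "").strip("/")
    if not path:
        return ("full", None)
    return ("sparse", path)


# The repository entry from this issue, wrapped in an object so it parses.
config = json.loads("""
{
  "internal": {
    "type": "hg",
    "url": "file:///home/mbethencourt/work/test_repos/hg_0",
    "project_space": "internal",
    "path": "subproject/libraryname"
  }
}
""")

print(checkout_scope(config["internal"]))
```

Stripping the slashes means "subproject/libraryname/" and "/subproject/libraryname" are treated as the same path, which seemed the least surprising choice for a sketch.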

Sparse checkouts would be implemented a little differently for each client:
* Git would use its sparse checkout feature (http://blog.quilitz.de/2010/03/checkout-sub-directories-in-git-sparse-checkouts/)
* SVN would check out with either "--non-recursive" or "--depth empty", and then 
"update" the specified path, using "--set-depth infinity" to fetch it in full
* HG, unfortunately, does not seem to have a sparse checkout feature. It could 
still be imitated, however, by cobbling together a few commands.
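
The per-client differences above could be sketched roughly as follows. This Python snippet only builds a plan of steps and does not run anything. The function name, the step format, the "origin" remote, and the "master" branch are all assumptions made for illustration. As far as I know, Git's sparse checkout narrows the working tree only and still fetches the full history, so it would not remove the cost of the clone itself.

```python
import posixpath


def sparse_checkout_plan(vcs, url, path, dest):
    """Return a list of steps for fetching only `path` into `dest`.

    Each step is ("run", argv) or ("write", filename, content).
    Hypothetical helper; nothing here is MOE's actual implementation.
    """
    path = path.strip("/")
    if not path:
        raise ValueError("a non-empty path is required")

    if vcs == "git":
        return [
            ("run", ["git", "init", dest]),
            ("run", ["git", "-C", dest, "remote", "add", "origin", url]),
            ("run", ["git", "-C", dest, "config", "core.sparseCheckout", "true"]),
            # The pattern file lists the directories to keep in the work tree.
            ("write", posixpath.join(dest, ".git/info/sparse-checkout"), path + "/\n"),
            # Assumes the branch is called "master".
            ("run", ["git", "-C", dest, "pull", "origin", "master"]),
        ]

    if vcs == "svn":
        plan = [("run", ["svn", "checkout", "--depth", "empty", url, dest])]
        parts = path.split("/")
        # Pull in each parent directory without its contents.
        for i in range(1, len(parts)):
            parent = posixpath.join(dest, *parts[:i])
            plan.append(("run", ["svn", "update", "--set-depth", "empty", parent]))
        # Then fetch the requested directory in full.
        target = posixpath.join(dest, path)
        plan.append(("run", ["svn", "update", "--set-depth", "infinity", target]))
        return plan

    # HG is left out on purpose: per this issue it has no sparse checkout.
    raise ValueError("no sparse checkout plan for %r" % vcs)


for step in sparse_checkout_plan(
        "svn", "http://example.com/repo", "subproject/libraryname", "wc"):
    print(step)
```

Returning a plan instead of running the commands keeps the sketch easy to inspect and leaves execution and error handling to the caller.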

Original issue reported on code.google.com by [email protected] on 23 Aug 2011 at 9:15

@GoogleCodeExporter

If you want a codebase to be everything in one particular directory, you can 
use the special 'file' codebase expression. For example, if you ran:

Moe create_codebase --config_file config.txt --codebase 'file(path="/home/usr/project/")'

The codebase would contain everything in the directory specified by path. 
Likewise, by running:

Moe change --config_file config.txt --codebase 'file(path="/foo/bar",projectspace="internal")>public' --destination 'destrepo'

you could translate the files in /foo/bar to the public project space, and the 
results would be placed in a writer for destrepo, ready to be committed.

This approach is limited, though, in that you can specify only one path.

Slightly more information on the 'file' codebase expression is on the wiki page 
for Codebase Expressions. 

Original comment by [email protected] on 23 Aug 2011 at 10:17

@GoogleCodeExporter

Sam, I might be misunderstanding the file codebase expression, but the issue 
I'm talking about is that it will still clone the entire repository, even if 
only one folder is being exported. For SVN, this isn't an issue: you can 
always specify a precise path in the repo's URL, and it won't check out more 
than needed. But with very large Git and HG repositories, it will clone a lot 
more than necessary. I don't think the file codebase expression fixes that, 
right?

Original comment by [email protected] on 23 Aug 2011 at 11:27

@GoogleCodeExporter

Ah yes, you're right. Right now, the entire git/hg repo is cloned and there's 
no way to clone only the parts you specify.

In the meantime, you could get a local clone of the large git/hg repo and then 
use a file codebase expression that points to some subdir in the clone and 
treat that as if it were a selectively cloned git/hg repo.
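
A rough sketch of that workaround, in Python, might look like this. It only assembles the two command lines: a full local clone, followed by the create_codebase invocation shown earlier in this thread, pointed at a subdirectory of the clone. The helper name, the example URL, and the paths are hypothetical.

```python
import posixpath
import shlex


def local_clone_workaround(vcs, url, clone_dir, subdir, config_file="config.txt"):
    """Build the commands for the clone-then-file() workaround.

    Hypothetical helper: returns two argv lists, the full clone and the
    Moe call that treats one subdirectory of the clone as the codebase.
    """
    if vcs not in ("git", "hg"):
        raise ValueError("workaround is only described for git and hg")
    clone_cmd = [vcs, "clone", url, clone_dir]
    # Point the 'file' codebase expression at the subdirectory of the clone.
    expression = 'file(path="%s")' % posixpath.join(clone_dir, subdir.strip("/"))
    moe_cmd = ["Moe", "create_codebase",
               "--config_file", config_file,
               "--codebase", expression]
    return [clone_cmd, moe_cmd]


for cmd in local_clone_workaround(
        "git", "https://example.com/big.git", "/tmp/big_repo", "subproject/libraryname"):
    # shlex.quote keeps the quotes inside the expression intact for a shell.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Note that the whole repository is still cloned once; the workaround only limits what MOE sees afterwards.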

Original comment by [email protected] on 24 Aug 2011 at 7:03

@GoogleCodeExporter

When using the file(path="...") approach, is anything from the config file 
honored? Specifically, I would like to be able to exclude some files in the repo 
(via ignore_file_res), but it doesn't seem to be working. Do I need to add some 
sort of file/path entry under "repositories" in my config file?

Original comment by [email protected] on 26 Mar 2015 at 10:48

@GoogleCodeExporter

No, there isn't.

Original comment by [email protected] on 26 Mar 2015 at 10:53

@GoogleCodeExporter

Hmm, actually when I add `projectspace="internal"` to the `file()` codebase 
expression, things seem to improve, but it complains that not everything is 
mapped in my renamer editor, even though the unmapped things are listed in 
"repositories" -> "internal".

Original comment by [email protected] on 26 Mar 2015 at 10:57

@GoogleCodeExporter

what should the hard drive name be under

Original comment by [email protected] on 26 Mar 2015 at 11:02

cgruber referenced this issue in cgruber/MOE Feb 18, 2019
Use proper substitution segments for autofactory and autovalue as well.
cgruber referenced this issue in geekinasuit/moe Jun 17, 2020