-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] PoC for a Node.JS backend using nodegit #259
base: master
Are you sure you want to change the base?
Conversation
node/app/routes.ts
Outdated
if (context.path === undefined) { | ||
commits = allCommits; | ||
} else { | ||
/// CAUTION(this can be pretty long) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A simple way to speed this up is by not fetching the entire diff, but comparing the parents' trees to the commits' trees (without actually looking at the file contents). Looking at .coveragerc
history takes around 30 s for me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://github.com/jonashaag/klaus/blob/purepy-hist/klaus/repo.py#L199 for an implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.coveragerc
in which repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(and 30s vs. how long with this current implem?)
Yes I think you make a good point. Will have to experiment and re-implement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, should've been more specific. I'm talking about huggingface/transformers/commits/master/.coveragerc
. It takes around 30 s, vs time git log .coveragerc
=> 145 ms
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Klaus I implemented it simply by calling git log
using subprocess. I used the pure Python implementation linked above but it was very slow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, will take a stab at this now (unless you want to do it? :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, go for it... I invested 20 min into coming up with a draft but realised it's going to take me much longer than someone proficient in TypeScript
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't know if what I came up with in Klaus is the best way to do it (but it works)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting SO question here: https://stackoverflow.com/questions/21178383/how-is-gits-log-filename-implemented-internally
|
||
|
||
app.get( '/:repo/commits', historyCommits); | ||
app.get('/:namespace/:repo/commits', historyCommits); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This URL scheme doesn't allow for a repository called "commits" in a namespace, ie. somenamespace/commits
. That's why I decided to use the ~
prefix in Klaus. Maybe there are other ambiguous names, I haven't checked thoroughly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we are kind of constrained in the URLs we have to support (as they already exist) so I'll just hardcode a blacklist of forbidden repo names
|
||
/** | ||
* Note: we only support branch names and tag names | ||
* not containing a `/`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see where you're coming from but at least in the software engineering world feature branches (feature/xyz
) are very common.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes – for a v1 for the sake of simplicity I think this will work but we can always add support later
Reviewed most of the code and also did some performance tests on my machine. Implementation and performance look very good. libgit2 performance is 1–2 order of magnitudes better than Dulwich performance, apparently. I added a few minor review comments. I don't think there are any obviously bad implementation choices. I suggest that you performance monitor as much as possible in production: Some of the history-based views are unbounded (eg. counting number of commits) and may turn out to be too slow. It probably also makes sense to add caching here and there if you have an easy way to do cache invalidation (with the current refresh mechanism the page is essentially static so you could just cache everything). |
This reverts commit 2688f86.
No description provided.