[AG-35] Detect github repositories from github.com and github.io homepages #20

Open
felipead opened this issue Jan 10, 2018 · 0 comments

To improve license detection, it is imperative that we find a package's GitHub repository, since there's a very high chance the license is declared there.

We must parse github.com and github.io homepages, inferring the repository URL from the homepage URL:

https://foo.github.com/bar => https://github.com/foo/bar
https://foo.github.io/bar => https://github.com/foo/bar

Here's an example. This Node.js package declares the homepage as:

  "homepage": "http://zaach.github.com/jsonlint/"

We can safely infer the associated GitHub repository, which is http://github.com/zaach/jsonlint. Inside this repository, the license can be inferred by parsing the README.md file, using the method described in #10.

This should be easy to do, as it is just a matter of adding another regular expression inside app/github/repository.py.
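A minimal sketch of what that regular expression could look like. The names `GITHUB_PAGES_PATTERN` and `infer_repository_url` are hypothetical, as I haven't checked them against what is already in app/github/repository.py. It assumes the first path segment of the homepage is the repository name, and it deliberately ignores root pages such as https://foo.github.io/ and the `www` subdomain:

```python
import re
from typing import Optional

# Hypothetical pattern (not taken from the existing module): captures the
# subdomain as the owner and the first path segment as the repository name.
GITHUB_PAGES_PATTERN = re.compile(
    r'^https?://(?P<owner>[A-Za-z0-9-]+)\.github\.(?:com|io)/(?P<repo>[\w.-]+)',
    re.IGNORECASE,
)


def infer_repository_url(homepage: str) -> Optional[str]:
    """Return the inferred GitHub repository URL, or None if the homepage
    does not look like <owner>.github.com/<repo> or <owner>.github.io/<repo>."""
    match = GITHUB_PAGES_PATTERN.match(homepage.strip())
    if match is None:
        return None

    owner = match.group('owner')
    repo = match.group('repo')

    # www.github.com/<something> is the main site, not a project page.
    if owner.lower() == 'www':
        return None

    return 'https://github.com/{}/{}'.format(owner, repo)
```

With the jsonlint example above, `infer_repository_url("http://zaach.github.com/jsonlint/")` gives `https://github.com/zaach/jsonlint`. This is only a heuristic: the inferred repository may not exist, so the result should probably be verified before trying to read a license from it.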
