Github Mining Tools

Important (April 13, 2012): Small changes have been made to the packaging of the code submitted along with the FSE2012 paper. Please see artifacts.md for more information.

Almost as important: The rest of this document is out of date.

Getting Going

This project uses Apache Maven to manage all dependencies and versioning. The simplest way to get going is to run the following command:

    mvn clean compile package assembly:single

This will clean the source tree, compile the code, run the tests, package the code into a jar file, and finally copy all the libraries to a location that makes some modicum of sense. Then, to run the mining scripts, run:

    ./github.sh

Configuration

Project configuration is controlled through the configuration.properties file in src/main/resources. Do not commit that file with your GitHub username and apitoken filled in; that would publish your credentials.

In practice, the miner does not currently use those fields, and it probably won't anytime in the future, so this is less of a worry than it sounds.

Additional important configuration parameters

Individual elements of the miner can be turned on and off through their configuration fields. An element is on when its field is set to true; setting the field to anything other than true turns it off. Those fields are:

net.wagstrom.research.github.miner.issues
net.wagstrom.research.github.miner.gists
net.wagstrom.research.github.miner.repositories
net.wagstrom.research.github.miner.organizations
net.wagstrom.research.github.miner.users
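
The toggle rule above can be sketched in a few lines of Java. This is only an illustration, not the miner's actual code: the class MinerToggles and the helper isEnabled are hypothetical names, and the sketch assumes a case-sensitive comparison against the literal string true and treats a missing field as off. The real miner may handle both cases differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class MinerToggles {
    // Prefix shared by the toggle fields listed above.
    static final String PREFIX = "net.wagstrom.research.github.miner.";

    // Hypothetical helper: an element counts as enabled only when its field
    // is exactly "true". Any other value, or a missing field, counts as off
    // (the missing-field behaviour is an assumption of this sketch).
    static boolean isEnabled(Properties props, String element) {
        return "true".equals(props.getProperty(PREFIX + element));
    }

    public static void main(String[] args) throws IOException {
        // Inline stand-in for src/main/resources/configuration.properties.
        String sample =
            "net.wagstrom.research.github.miner.issues=true\n"
            + "net.wagstrom.research.github.miner.gists=false\n"
            + "net.wagstrom.research.github.miner.users=off\n";

        Properties props = new Properties();
        props.load(new StringReader(sample));

        String[] elements = {"issues", "gists", "repositories", "organizations", "users"};
        for (String element : elements) {
            System.out.println(element + " enabled: " + isEnabled(props, element));
        }
    }
}
```

With the sample values shown, only issues is reported as enabled; gists and users carry values other than true, and repositories and organizations are absent from the sample.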

Explanation of fields

Every Vertex in the database should have the following:

type: one of USER, REPOSITORY
created_at: ISO 8601 formatted date of when the node was created

Every Edge in the database should have the following fields:

label: not really a field, but always present
created_at: ISO 8601 formatted date of when the edge was created
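
As a rough illustration of these minimum fields, the sketch below builds the property maps for a vertex and an edge using only the Java standard library. The names GraphFields, vertexProperties and edgeProperties are hypothetical, and writing the timestamp in UTC is an assumption; the text above only requires ISO 8601. The sketch deliberately leaves out the graph database API, since this document does not describe it.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

public class GraphFields {
    // Allowed values of the vertex "type" field, as listed above.
    enum VertexType { USER, REPOSITORY }

    // Hypothetical helper: the minimum properties every vertex should carry.
    static Map<String, String> vertexProperties(VertexType type, Instant createdAt) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("type", type.name());
        // Instant.toString() produces an ISO 8601 timestamp in UTC.
        props.put("created_at", createdAt.toString());
        return props;
    }

    // Hypothetical helper: the minimum properties every edge should carry.
    // The label is not stored here because it belongs to the edge itself
    // rather than to its property map.
    static Map<String, String> edgeProperties(Instant createdAt) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("created_at", createdAt.toString());
        return props;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(vertexProperties(VertexType.USER, now));
        System.out.println(edgeProperties(now));
    }
}
```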
