
Overview

Most of the content in this Wiki page was extracted from the Graduate Work paper of Flávio Juvenal (@fjsj), Groundhog's initial developer.

This document will help you understand Groundhog: what it does and how it works.

In Groundhog, the integration flow is linear: for every project to be analyzed, the Search module is executed first, then Crawler, then CodeHistory, and finally Parser. In some cases it is also necessary to execute another module, the Extractor. Below is a brief overview of the responsibilities of each module (a sketch of the full flow follows the list):

  • Search – browses the forges' web pages or official APIs and fetches information about the projects, which the Crawler module later downloads.

  • Crawler – uses the project information obtained by the Search module and downloads the projects' source code through the forges' pages or their version control systems (VCSs).

  • CodeHistory – retrieves the most recent version of a project's source code as of a date of interest, so that its metrics can be extracted by the Parser module.

  • Parser – analyzes each file in a project's source code, builds an abstract syntax tree (AST), and traverses it to extract metrics (see the AST example after this list).

  • Extractor – in some cases the downloaded source code files are compressed; this module extracts those archives so the source code analysis becomes possible.
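
To make the linear flow concrete, here is a minimal sketch of how the modules chain together. The interfaces, class names, and method signatures below are illustrative assumptions, not Groundhog's actual API:

```java
import java.nio.file.Path;
import java.time.LocalDate;
import java.util.List;

// Hypothetical module interfaces -- illustrative assumptions, not Groundhog's actual API.
interface Search      { List<String> findProjects(String term); }   // forge pages / official APIs
interface Crawler     { Path download(String projectUrl); }         // fetch the source code
interface Extractor   { Path extract(Path downloaded); }            // unpack archives when needed
interface CodeHistory { Path checkoutAsOf(Path sources, LocalDate dateOfInterest); }
interface Parser      { Metrics parse(Path sourceTree); }           // AST traversal -> metrics

record Metrics(int classes, int methods) {}

public class PipelineSketch {
    // The linear flow described above: Search -> Crawler -> (Extractor) -> CodeHistory -> Parser.
    static Metrics analyze(Search search, Crawler crawler, Extractor extractor,
                           CodeHistory history, Parser parser,
                           String term, LocalDate dateOfInterest) {
        String projectUrl = search.findProjects(term).get(0); // analyze the first match
        Path downloaded = crawler.download(projectUrl);
        Path sources = extractor.extract(downloaded);         // only needed when the download is compressed
        Path snapshot = history.checkoutAsOf(sources, dateOfInterest);
        return parser.parse(snapshot);
    }
}
```

Note that in this sketch the Extractor sits between Crawler and CodeHistory, matching the "in some cases" role described above: it only does work when the download is an archive rather than a plain checkout.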
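The Parser step (build an AST, then traverse it for metrics) can also be illustrated in isolation. The example below uses the open-source JavaParser library as an assumed stand-in; Groundhog's actual parser implementation is not shown on this page, and this snippet requires the com.github.javaparser:javaparser-core dependency:

```java
import com.github.javaparser.StaticJavaParser;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.body.ClassOrInterfaceDeclaration;
import com.github.javaparser.ast.body.MethodDeclaration;

public class AstMetricsSketch {
    public static void main(String[] args) {
        String code = "class Foo { void a() {} void b() {} }";
        // Build the AST for one source file.
        CompilationUnit cu = StaticJavaParser.parse(code);
        // Traverse the tree to extract simple size metrics.
        int classes = cu.findAll(ClassOrInterfaceDeclaration.class).size();
        int methods = cu.findAll(MethodDeclaration.class).size();
        System.out.println(classes + " classes, " + methods + " methods");
    }
}
```

Counting classes and methods is only a placeholder here; the same parse-then-traverse pattern applies to whichever metrics the Parser module actually extracts.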
