Research Questions and their counterparts within the VisualFingerPrinter

Overall questions: What does code look like? What are the basic statistics of code projects across a huge number of repos?

RQ1: What does a code project look like?

  • Total LOC: linecount
  • Number of files: filecount
  • Number of classes: TODO
  • Number of methods: TODO
  • Number of different languages: filecount & linecount
  • Number of GitHub stars: out-of-scope for VFP (API hits are expensive)
  • Number of contributors: TODO
  • Number of commits: TODO
  • Repo age: TODO
  • All of these should be aggregated by language (as designated by the GitHub repo?)
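
As a rough illustration of the filecount and linecount metrics above, the following sketch walks a checked-out repo and tallies files and lines per file extension. It is not the VisualFingerPrinter implementation: the function name `count_by_extension`, the use of the file extension as a stand-in for language, and the choice to skip `.git` are all assumptions made for this example.

```python
import os
from collections import Counter


def count_by_extension(root):
    """Return (files_per_ext, lines_per_ext) for all files under root.

    Assumption: the file extension is used as a proxy for language.
    """
    files = Counter()
    lines = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # Assumption: version-control metadata is not part of the project.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            files[ext] += 1
            # Read as bytes so that files with unknown encodings do not fail.
            with open(os.path.join(dirpath, name), "rb") as fh:
                lines[ext] += sum(1 for _ in fh)
    return files, lines
```

The number of distinct keys in either counter gives a first approximation of "number of different languages"; a real mapping from extension to language would be needed for the per-language aggregation.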

RQ1.5/Maybe: What is the file structure of a repo?

  • Number of directories: TODO
  • Number of different file extensions: filecount
  • Layout of file structure: TODO (needs clarification)
  • Max directory depth: TODO
  • All of these should be aggregated by language (as designated by the GitHub repo?)
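
The directory metrics still marked TODO could be computed along these lines. This is a minimal sketch under stated assumptions: `directory_stats` is a hypothetical name, the repo root itself is not counted as a directory, depth is measured in path components below the root, and `.git` is ignored.

```python
import os


def directory_stats(root):
    """Return (number_of_directories, max_directory_depth) below root."""
    root = os.path.abspath(root)
    n_dirs = 0
    max_depth = 0
    for dirpath, dirnames, _ in os.walk(root):
        # Assumption: skip version-control metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        if dirpath == root:
            continue  # the root itself is depth 0 and is not counted
        n_dirs += 1
        depth = len(os.path.relpath(dirpath, root).split(os.sep))
        max_depth = max(max_depth, depth)
    return n_dirs, max_depth
```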

RQ2: What is the visual shape of code?

  • Length of files: linecount
  • Length of classes: TODO
  • Length of functions: TODO
  • Width of functions: TODO
  • Heatmaps showing the shape?
  • All of these should be aggregated by language (as designated by the GitHub repo?)
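
One possible reading of "length", "width", and "heatmap" is sketched below: length is the number of lines, width is the number of characters per line after tab expansion, and the heatmap is a grid counting how often each (row, column) cell holds a non-whitespace character across many files. The names `line_widths` and `occupancy_grid` and the tab size of 4 are assumptions for this example, not definitions taken from VFP.

```python
def line_widths(text, tabsize=4):
    """Width of each line in characters, ignoring trailing whitespace."""
    return [len(line.expandtabs(tabsize).rstrip()) for line in text.splitlines()]


def occupancy_grid(texts, rows, cols, tabsize=4):
    """Count non-whitespace characters per (row, col) cell across texts.

    Only the first `rows` lines and first `cols` columns of each text are used.
    """
    grid = [[0] * cols for _ in range(rows)]
    for text in texts:
        for r, line in enumerate(text.splitlines()[:rows]):
            for c, ch in enumerate(line.expandtabs(tabsize)[:cols]):
                if not ch.isspace():
                    grid[r][c] += 1
    return grid
```

The length of a file is `len(line_widths(text))`. The same functions apply to a class or function body once its source span has been extracted, which would require a language-aware parser.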

RQ3: What is in a line of code?

  • Comments: TODO
  • Stats on frequency and associations between token types
  • Heatmaps showing different types of tokens?
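
To make "frequency and associations between token types" concrete, the sketch below counts token types and adjacent token-type pairs. It uses Python's standard `tokenize` module, so it only handles Python source; a multi-language tool would need a different lexer. The function name `token_type_stats` and the choice of adjacent pairs as the measure of association are assumptions for this example.

```python
import io
import tokenize
from collections import Counter


def token_type_stats(source):
    """Return (type_frequencies, adjacent_type_pair_frequencies).

    Assumption: `source` is valid Python; other languages need another lexer.
    """
    readline = io.StringIO(source).readline
    names = [tokenize.tok_name[tok.type]
             for tok in tokenize.generate_tokens(readline)]
    freq = Counter(names)
    # "Association" is approximated here by counting neighbouring token types.
    pairs = Counter(zip(names, names[1:]))
    return freq, pairs
```

Comments appear as the `COMMENT` token type, so `freq["COMMENT"]` gives a simple comment count for the same pass.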

RQ4: What is the correlation between all of these results and various project factors?

  • Relationships between RQ1-3 results and each of the following project factors:
  • Number of stars
  • LOC
  • Number of contributors
  • Repo age
  • Number of commits
  • Time since last commit
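
A first pass at RQ4 could compute a correlation coefficient between each RQ1-3 metric and each project factor. The sketch below implements Pearson and Spearman correlation in plain Python. The choice of these two coefficients is an assumption, not something the research plan specifies; Spearman is included because it depends only on ranks and so may be less sensitive to extreme values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))
```

Each input would be one value per repo, for example total LOC in `xs` and number of contributors in `ys`.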