Merge pull request #176 from se-passau/dev
Version 3.6

Merged-by: Thomas Bock <[email protected]>
bockthom authored Feb 21, 2020
2 parents d02d523 + 75ae4a5 commit 91fc448
Showing 12 changed files with 620 additions and 128 deletions.
26 changes: 11 additions & 15 deletions .travis.yml
@@ -11,36 +11,32 @@
## with this program; if not, write to the Free Software Foundation, Inc.,
## 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
##
## Copyright 2017-2018 by Claus Hunsen <[email protected]>
## Copyright 2017-2018,2020 by Claus Hunsen <[email protected]>
## All Rights Reserved.

# TravisCI container
os: linux
dist: xenial
warnings_are_errors: false

# R environment, dependencies and information
language: r
r:
- 3.3
- 3.4
- 3.5

# TravisCI container
sudo: required
dist: trusty
warnings_are_errors: false

# # Branches
# branches:
# only:
# - travis
# - claus-updates

# R dependencies and information
- 3.6
cache: packages
repos:
CRAN: https://cloud.r-project.org

# installation
# Installation
install:
# package dependencies
- sudo apt-get install libudunits2-dev
# package installation
- Rscript install.R

# Tests
script:
- Rscript tests.R
16 changes: 16 additions & 0 deletions NEWS.md
@@ -1,5 +1,21 @@
# coronet – Changelog

## 3.6

### Added
- Add a parameter `editor.definition` to the function `add.vertex.attribute.artifact.editor.count`, which defines whether the author, the committer, or both count as editors when computing the attribute values. (#92, ff1e147ba563b2d71f8228afd49492a315a5ad48)
- Add the possibility to filter out patchstack mails from the mails of the `ProjectData`. The option can be toggled using the newly added configuration option `mails.filter.patchstack.mails`. (1608e28ca36610c58d2a5447d12ee2052c6eb976, a932c8cdaa6fe5149c798bc09d9e421ba679c48d)
- Add a new file `util-plot-evaluation.R` containing functions to plot commit edit types per author and project. (PR #171, d4af515f859ce16ffaa0963d6d3d4086bcbb7377, aa542a215f59bc3ed869cfefbc5a25fa050b1fc9, 0a0a5903e7c609dfe805a3471749eb2241efafe2)

### Changed/Improved

- Add R version 3.6 to test suite (8b2a52d38475a59c55feb17bb54ed12b9252a937, #161)
- Update `.travis.yml` to improve compatibility with Travis CI (41ce589b3b50fd581a10e6af33ac6b1bbea63bb8)

### Fixed

- Ensure sorting of commit-count and LOC-count data.frames to fix tests with R 3.3 (33d63fd50c4b29d45a9ca586c383650f7d29efd5)


## 3.5

10 changes: 8 additions & 2 deletions README.md
@@ -103,7 +103,7 @@ While `proximity` triggers a file/function-based commit analysis in `Codeface`,
When using this network library, the user only needs to give the `artifact` parameter to the [`ProjectConf`](#projectconf) constructor, which automatically ensures that the correct tagging is selected.

The configuration files `{project-name}_{tagging}.conf` are mandatory and contain some basic configuration regarding a performed `Codeface` analysis (e.g., project name, name of the corresponding repository, name of the mailing list, etc.).
For further details on those files, please have a look at some [example files](https://github.com/siemens/codeface/tree/master/conf) files in the `Codeface` repository.
For further details on those files, please have a look at some [example files](https://github.com/siemens/codeface/tree/master/conf) in the `Codeface` repository.

All the `*.list` files listed above are output files of `codeface-extraction` and contain meta data of, e.g., commits or e-mails to the mailing list, etc., in CSV format.
This network library lazily loads and processes these files when needed.
@@ -133,7 +133,7 @@ Alternatively, you can run `Rscript install.R` to install the packages.

Please insert the project into yours by use of [git submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules).
Furthermore, the file `install.R` installs all needed R packages (see [below](#needed-r-packages)) into your R library.
Although, the use of of [packrat](https://rstudio.github.io/packrat/) with your project is recommended.
Although, the use of [packrat](https://rstudio.github.io/packrat/) with your project is recommended.

This library is written in a way to not interfere with the loading order of your project's `R` packages (i.e., `library()` calls), so that the library does not lead to masked definitions.

@@ -415,6 +415,8 @@ Additionally, for more examples, the file `showcase.R` is worth a look.
* Functionality for the identification of network motifs (subgraph patterns)
- `util-plot.R`
* Everything needed for plotting networks
- `util-plot-evaluation.R`
* Plotting functions for data evaluation
- `util-misc.R`
* Helper functions and also legacy functions, both needed in the other files
- `showcase.R`
@@ -521,6 +523,10 @@ There is no way to update the entries, except for the revision-based parameters.
- `commits.filter.untracked.files`
* Remove all information concerning untracked files from the commit data. This effect becomes clear when retrieving commits using `get.commits.filtered`, because the result then does not contain any commits that solely changed untracked files. Networks built on top of this `ProjectData` also do not contain any information about untracked files.
* [*`TRUE`*, `FALSE`]
- `mails.filter.patchstack.mails`
* Filter patchstack mails from the mail data. In a thread, a patchstack spans the first sequence of mails where each mail has been authored by the thread creator and has been sent within a short time window after the preceding mail. The mails spanned by a patchstack are called 'patchstack mails'; for each patchstack, every patchstack mail but the first one is filtered when `mails.filter.patchstack.mails = TRUE`.
* [`TRUE`, *`FALSE`*]
- `issues.only.comments`
* Only use comments from the issue data on disk and no further events such as references and label changes
* [*`TRUE`*, `FALSE`]
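
The patchstack rule described for `mails.filter.patchstack.mails` can be sketched in base R as follows. This is only an illustration under stated assumptions, not the library's implementation: the helper name `filter.patchstack.sketch`, the column names `thread`, `author.name`, `date`, and `message.id` as used here, and the 30-second window are all hypothetical.

```r
## Hypothetical helper: drop all but the first mail of each thread's leading patchstack.
## 'window.secs' is an assumed threshold, not the value used by the library.
filter.patchstack.sketch = function(mails, window.secs = 30) {
    threads = split(mails, mails[["thread"]])
    kept = lapply(threads, function(thread) {
        ## process the mails of a thread in chronological order
        thread = thread[order(thread[["date"]]), ]
        creator = thread[["author.name"]][1]
        ## length of the leading run of mails by the thread creator sent in quick succession
        run = 1
        while (run < nrow(thread) &&
               thread[["author.name"]][run + 1] == creator &&
               as.numeric(difftime(thread[["date"]][run + 1], thread[["date"]][run],
                                   units = "secs")) <= window.secs) {
            run = run + 1
        }
        ## keep the first patchstack mail and every mail after the patchstack
        keep = c(1, seq_len(nrow(thread))[-seq_len(run)])
        return(thread[keep, ])
    })
    result = do.call(rbind, kept)
    rownames(result) = NULL
    return(result)
}

## Made-up example data: thread 't1' starts with three quick mails by its creator
base = as.POSIXct("2020-01-01 12:00:00", tz = "UTC")
mails = data.frame(
    message.id = c("<m1>", "<m2>", "<m3>", "<m4>", "<m5>"),
    thread = c("t1", "t1", "t1", "t1", "t2"),
    author.name = c("Ann", "Ann", "Ann", "Ben", "Ben"),
    date = base + c(0, 5, 10, 600, 0),
    stringsAsFactors = FALSE
)

filtered = filter.patchstack.sketch(mails)
print(filtered[["message.id"]])
```

In this example, `<m2>` and `<m3>` are dropped because they follow `<m1>` from the same author within the assumed window; the later reply `<m4>` and the single-mail thread `t2` are left untouched.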
8 changes: 8 additions & 0 deletions showcase.R
@@ -17,6 +17,7 @@
## Copyright 2017 by Felix Prasse <[email protected]>
## Copyright 2017-2018 by Thomas Bock <[email protected]>
## Copyright 2018 by Jakob Kronawitter <[email protected]>
## Copyright 2019 by Klara Schlueter <[email protected]>
## All Rights Reserved.


@@ -80,6 +81,13 @@ revisions.callgraph = proj.conf$get.value("revisions.callgraph")
x.data = ProjectData$new(project.conf = proj.conf)
x = NetworkBuilder$new(project.data = x.data, network.conf = net.conf)

## * Evaluation plots ------------------------------------------------------

# edit.types = plot.commit.edit.types.in.project(x.data)
# edit.types.scaled = plot.commit.edit.types.in.project(x.data, TRUE)
# editor.types = plot.commit.editor.types.by.author(x.data)
# editor.types.scaled = plot.commit.editor.types.by.author(x.data, TRUE)

## * Data retrieval --------------------------------------------------------

# x.data$get.commits()
66 changes: 61 additions & 5 deletions tests/test-data.R
@@ -12,7 +12,8 @@
## 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
##
## Copyright 2018 by Christian Hechtl <[email protected]>
## Copyright 2018 by Claus Hunsen <[email protected]>
## Copyright 2018-2019 by Claus Hunsen <[email protected]>
## Copyright 2019 by Jakob Kronawitter <[email protected]>
## All Rights Reserved.


@@ -34,6 +35,7 @@ test_that("Compare two ProjectData objects", {

##initialize a ProjectData object with the ProjectConf and clone it into another one
proj.conf = ProjectConf$new(CF.DATA, CF.SELECTION.PROCESS, CASESTUDY, ARTIFACT)
proj.conf$update.value("pasta", TRUE)
proj.data.one = ProjectData$new(project.conf = proj.conf)
proj.data.two = proj.data.one$clone()

@@ -43,19 +45,20 @@ test_that("Compare two ProjectData objects", {
## second object, as well, and test for equality.

##change the second data object
proj.data.one$get.commits()

proj.data.two$get.pasta()

expect_false(proj.data.one$equals(proj.data.two), "Two not identical ProjectData objects.")

proj.data.two$get.commits()
proj.data.one$get.pasta()

expect_true(proj.data.one$equals(proj.data.two), "Two identical ProjectData objects.")

proj.data.two$get.pasta()
proj.data.one$get.commits()

expect_false(proj.data.one$equals(proj.data.two), "Two not identical ProjectData objects.")

proj.data.one$get.pasta()
proj.data.two$get.commits()

expect_true(proj.data.one$equals(proj.data.two), "Two identical ProjectData objects.")

@@ -123,3 +126,56 @@ test_that("Compare two RangeData objects", {
expect_false(proj.data.base$equals(range.data.four))

})

test_that("Filter patchstack mails", {

proj.conf = ProjectConf$new(CF.DATA, CF.SELECTION.PROCESS, CASESTUDY, ARTIFACT)
proj.conf$update.value("mails.filter.patchstack.mails", TRUE)

## create the project data
proj.data = ProjectData$new(proj.conf)

## retrieve the mails while filtering patchstack mails
mails.filtered = proj.data$get.mails()

## create new project with filtering disabled
proj.conf$update.value("mails.filter.patchstack.mails", FALSE)
proj.data = ProjectData$new(proj.conf)

## retrieve the mails without filtering patchstack mails
mails.unfiltered = proj.data$get.mails()

## get message ids
mails.filtered.mids = mails.filtered[["message.id"]]
mails.unfiltered.mids = mails.unfiltered[["message.id"]]

expect_equal(setdiff(mails.unfiltered.mids, mails.filtered.mids), c("<[email protected]>",
"<[email protected]>",
"<[email protected]>",
"<[email protected]>",
"<[email protected]>"))
})

test_that("Filter patchstack mails with PaStA enabled", {
proj.conf = ProjectConf$new(CF.DATA, CF.SELECTION.PROCESS, CASESTUDY, ARTIFACT)
proj.conf$update.value("mails.filter.patchstack.mails", TRUE)
proj.conf$update.value("pasta", TRUE)

proj.data = ProjectData$new(proj.conf)

## retrieve filtered PaStA data by calling 'get.pasta' which calls the filtering functionality internally
filtered.pasta = proj.data$get.pasta()

## ensure that the remaining mails have not been touched
expect_true("<[email protected]>" %in% filtered.pasta[["message.id"]])
expect_true("<[email protected]>" %in% filtered.pasta[["message.id"]])
expect_true("<[email protected]>" %in% filtered.pasta[["message.id"]])
expect_equal(2, sum(filtered.pasta[["message.id"]] == "<[email protected]>"))

## ensure that the three PaStA entries relating to the filtered patchstack mails have been merged to a single new
## PaStA entry which has assigned the message ID of the first patchstack mail
expect_true("<[email protected]>" %in% filtered.pasta[["message.id"]])

## ensure that there are no other entries than the ones that have been verified to exist above
expect_equal(6, nrow(filtered.pasta))
})
50 changes: 44 additions & 6 deletions tests/test-networks-covariates.R
@@ -818,9 +818,7 @@ test_that("Test add.vertex.attribute.artifact.editor.count", {

networks.and.data = get.network.covariates.test.networks("artifact")

expected.attributes = network.covariates.test.build.expected(list(1L), list(1L), list(3L, 1L))

expected.attributes = list(
expected.attributes.author = list(
range = network.covariates.test.build.expected(
c(1L), c(1L), c(3L, 1L)),
cumulative = network.covariates.test.build.expected(
@@ -834,18 +832,58 @@ test_that("Test add.vertex.attribute.artifact.editor.count", {
complete = network.covariates.test.build.expected(
c(2L), c(2L), c(3L, 1L))
)
expected.attributes.committer = list(
range = network.covariates.test.build.expected(
c(1L), c(1L), c(2L, 1L)),
cumulative = network.covariates.test.build.expected(
c(1L), c(1L), c(2L, 1L)),
all.ranges = network.covariates.test.build.expected(
c(1L), c(1L), c(2L, 1L)),
project.cumulative = network.covariates.test.build.expected(
c(1L), c(1L), c(2L, 1L)),
project.all.ranges = network.covariates.test.build.expected(
c(1L), c(1L), c(2L, 1L)),
complete = network.covariates.test.build.expected(
c(1L), c(1L), c(2L, 1L))
)
expected.attributes.both = list(
range = network.covariates.test.build.expected(
c(1L), c(2L), c(3L, 1L)),
cumulative = network.covariates.test.build.expected(
c(1L), c(2L), c(3L, 1L)),
all.ranges = network.covariates.test.build.expected(
c(2L), c(2L), c(3L, 1L)),
project.cumulative = network.covariates.test.build.expected(
c(1L), c(2L), c(3L, 1L)),
project.all.ranges = network.covariates.test.build.expected(
c(2L), c(2L), c(3L, 1L)),
complete = network.covariates.test.build.expected(
c(2L), c(2L), c(3L, 1L))
)

## Test

lapply(AGGREGATION.LEVELS, function(level) {
networks.with.attr = add.vertex.attribute.artifact.editor.count(
networks.with.attr.author = add.vertex.attribute.artifact.editor.count(
networks.and.data[["networks"]], networks.and.data[["project.data"]],
aggregation.level = level
)
networks.with.attr.committer = add.vertex.attribute.artifact.editor.count(
networks.and.data[["networks"]], networks.and.data[["project.data"]],
aggregation.level = level, editor.definition = "committer"
)
networks.with.attr.both = add.vertex.attribute.artifact.editor.count(
networks.and.data[["networks"]], networks.and.data[["project.data"]],
aggregation.level = level, editor.definition = c("author", "committer")
)

actual.attributes = lapply(networks.with.attr, igraph::get.vertex.attribute, name = "editor.count")
actual.attributes.author = lapply(networks.with.attr.author, igraph::get.vertex.attribute, name = "editor.count")
actual.attributes.committer = lapply(networks.with.attr.committer, igraph::get.vertex.attribute, name = "editor.count")
actual.attributes.both = lapply(networks.with.attr.both, igraph::get.vertex.attribute, name = "editor.count")

expect_equal(expected.attributes[[level]], actual.attributes)
expect_equal(expected.attributes.author[[level]], actual.attributes.author)
expect_equal(expected.attributes.committer[[level]], actual.attributes.committer)
expect_equal(expected.attributes.both[[level]], actual.attributes.both)
})
})
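
The three variants exercised by this test (author only, committer only, both) can be illustrated with a simplified base-R sketch. It assumes a plain data frame of commits rather than networks and aggregation levels, and the helper `count.editors.sketch` is hypothetical; the actual `add.vertex.attribute.artifact.editor.count` may work differently.

```r
## Hypothetical helper: count the distinct editors of each artifact, where
## 'editor.definition' selects which commit roles count as editors.
count.editors.sketch = function(commits, editor.definition = "author") {
    stopifnot(all(editor.definition %in% c("author", "committer")))
    ## map the chosen roles to the corresponding name columns
    columns = paste0(editor.definition, ".name")
    sapply(split(commits, commits[["artifact"]]), function(artifact.commits) {
        editors = unlist(artifact.commits[columns], use.names = FALSE)
        length(unique(editors))
    })
}

## Made-up commit data with differing authors and committers
commits = data.frame(
    artifact = c("A", "A", "B"),
    author.name = c("Ann", "Ben", "Ann"),
    committer.name = c("Cat", "Cat", "Ann"),
    stringsAsFactors = FALSE
)

by.author = count.editors.sketch(commits)
by.committer = count.editors.sketch(commits, editor.definition = "committer")
by.both = count.editors.sketch(commits, editor.definition = c("author", "committer"))

print(rbind(by.author, by.committer, by.both))
```

With both roles selected, a person who appears as author and as committer of the same artifact is counted once, because the names are de-duplicated before counting.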

6 changes: 6 additions & 0 deletions util-conf.R
@@ -355,6 +355,12 @@ ProjectConf = R6::R6Class("ProjectConf", inherit = Conf,
allowed = c(TRUE, FALSE),
allowed.number = 1
),
mails.filter.patchstack.mails = list(
default = FALSE,
type = "logical",
allowed = c(TRUE, FALSE),
allowed.number = 1
),
synchronicity = list(
default = FALSE,
type = "logical",
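
Each option entry in this hunk carries a `default`, a `type`, the `allowed` values, and an `allowed.number`. As a rough illustration of how such an entry could be used to check a new value, here is a base-R sketch. The validator `is.valid.value.sketch` is hypothetical and is not the library's `update.value` logic; in particular, reading `allowed.number` as the maximum number of values is an assumption.

```r
## Option definition shaped like the 'mails.filter.patchstack.mails' entry
definition = list(
    default = FALSE,
    type = "logical",
    allowed = c(TRUE, FALSE),
    allowed.number = 1
)

## Hypothetical validator: checks type, number of values, and allowed values.
## Assumes 'allowed.number' is the maximum number of values that may be given.
is.valid.value.sketch = function(value, definition) {
    type.ok = class(value)[1] == definition[["type"]]
    number.ok = length(value) >= 1 && length(value) <= definition[["allowed.number"]]
    allowed.ok = all(value %in% definition[["allowed"]])
    return(type.ok && number.ok && allowed.ok)
}

print(is.valid.value.sketch(TRUE, definition))
print(is.valid.value.sketch(c(TRUE, FALSE), definition))
print(is.valid.value.sketch("yes", definition))
```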
14 changes: 7 additions & 7 deletions util-core-peripheral.R
@@ -14,7 +14,7 @@
## Copyright 2017 by Mitchell Joblin <[email protected]>
## Copyright 2017 by Ferdinand Frank <[email protected]>
## Copyright 2017 by Sofie Kemper <[email protected]>
## Copyright 2017-2019 by Claus Hunsen <[email protected]>
## Copyright 2017-2020 by Claus Hunsen <[email protected]>
## Copyright 2017 by Felix Prasse <[email protected]>
## Copyright 2018-2019 by Christian Hechtl <[email protected]>
## Copyright 2018 by Klara Schlüter <[email protected]>
@@ -637,7 +637,7 @@ get.committer.not.author.commit.count = function(range.data) {
res = sqldf::sqldf("SELECT *, COUNT(*) AS `freq` FROM `commits.df`
WHERE `committer.name` <> `author.name`
GROUP BY `committer.name`, `author.name`
ORDER BY `freq` DESC")
ORDER BY `freq` DESC, `author.name` ASC")

logging::logdebug("get.committer.not.author.commit.count: finished.")
return(res)
@@ -664,7 +664,7 @@ get.committer.and.author.commit.count = function(range.data) {
res = sqldf::sqldf("SELECT *, COUNT(*) AS `freq` FROM `commits.df`
WHERE `committer.name` = `author.name`
GROUP BY `committer.name`, `author.name`
ORDER BY `freq` DESC")
ORDER BY `freq` DESC, `author.name` ASC")

logging::logdebug("get.committer.and.author.commit.count: finished.")
return(res)
@@ -699,7 +699,7 @@ get.committer.or.author.commit.count = function(range.data) {

res = sqldf::sqldf("SELECT *, COUNT(*) AS `freq` FROM `ungrouped`
GROUP BY `name`
ORDER BY `freq` DESC")
ORDER BY `freq` DESC, `name` ASC")

logging::logdebug("get.committer.or.author.commit.count: finished.")
return(res)
@@ -725,7 +725,7 @@ get.committer.commit.count = function(range.data) {

## Execute a query to get the commit count per committer
res = sqldf::sqldf("SELECT *, COUNT(*) AS `freq` FROM `commits.df`
GROUP BY `committer.name` ORDER BY `freq` DESC")
GROUP BY `committer.name` ORDER BY `freq` DESC, `committer.name` ASC")

logging::logdebug("get.committer.commit.count: finished.")
return(res)
@@ -751,7 +751,7 @@ get.author.commit.count = function(proj.data) {

## Execute a query to get the commit count per author
res = sqldf::sqldf("SELECT `author.name`, COUNT(*) AS `freq` FROM `commits.df`
GROUP BY `author.name` ORDER BY `freq` DESC")
GROUP BY `author.name` ORDER BY `freq` DESC, `author.name` ASC")

logging::logdebug("get.author.commit.count: finished.")
return(res)
@@ -813,7 +813,7 @@ get.author.loc.count = function(proj.data) {
## Execute a query to get the changed lines per author
res = sqldf::sqldf("SELECT `author.name`, SUM(`added.lines`) + SUM(`deleted.lines`) AS `loc`
FROM `commits.df`
GROUP BY `author.name` ORDER BY `loc` DESC")
GROUP BY `author.name` ORDER BY `loc` DESC, `author.name` ASC")

logging::logdebug("get.author.loc.count: finished.")
return(res)
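
The `ORDER BY` changes in this file add the name column as a secondary sort key. SQL does not guarantee any particular order among rows with equal `freq` or `loc`, so the extra key makes the result reproducible. A minimal base-R sketch of the same idea, using a made-up data frame (the data and variable names are hypothetical, not taken from the repository):

```r
## Hypothetical commit counts; 'Bob' and 'Alice' tie on 'freq'
counts = data.frame(
    author.name = c("Bob", "Carol", "Alice"),
    freq = c(2L, 5L, 2L),
    stringsAsFactors = FALSE
)

## Sort by 'freq' descending, then by 'author.name' ascending to break ties
## (mirrors "ORDER BY `freq` DESC, `author.name` ASC")
sorted = counts[order(-counts[["freq"]], counts[["author.name"]]), ]
rownames(sorted) = NULL

print(sorted)
```

With the tie-breaker, the tied rows always appear as `Alice` before `Bob`, regardless of their order in the input.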