Depdive currently generates metrics under the dimensions below:
U indicates the higher the better. D indicates the lower the better. B indicates binary; the value is either 0 or 1.
All metrics are positive numbers.
Note that the term better may be debatable in some cases. However, such a binary categorization makes any cumulative representation simpler.
Usage
crates.io downloads U
crates.io dependents U
github stars U
github forks U
github subscribers count U
Activity metrics
Days since last commit D
Days since last time an issue was opened D
Number of commits in the last six months U
Number of contributors in the last six months U
Code analysis
Total lines of code D
Total rust lines of code D
Total rust code pulled in through its own dependencies D
Total rust code pulled in through its exclusive dependencies, i.e., dependencies introduced only by this package in the whole dep graph D
has custom build script? B, D
how many of its deps have custom build scripts (percentage)? D
Unsafe analysis
forbids unsafe? B, U
unsafe expressions? D
unsafe functions? D
unsafe traits? D
unsafe impls? D
unsafe methods? D
how many of its deps use unsafe? D
total unsafe code pulled in through dependencies? [summation of exprs + functions + traits + impls + methods] D (see the sketch after this list)
number of open issues labelled bug D
number of open issues labelled security D
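As a rough illustration of how that dependency-level total could be computed, here is a minimal Rust sketch; the struct and field names are hypothetical and not depdive's actual types:

```rust
// Hypothetical per-crate unsafe counts; not depdive's actual types.
#[derive(Default)]
struct UnsafeCounts {
    exprs: u64,
    functions: u64,
    traits: u64,
    impls: u64,
    methods: u64,
}

impl UnsafeCounts {
    // Unsafe items within a single crate.
    fn total(&self) -> u64 {
        self.exprs + self.functions + self.traits + self.impls + self.methods
    }
}

// "Total unsafe code pulled in through dependencies": sum the per-crate
// totals over every dependency in the graph.
fn total_unsafe_from_deps(deps: &[UnsafeCounts]) -> u64 {
    deps.iter().map(UnsafeCounts::total).sum()
}
```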
How to communicate the metrics?
Below are my proposals:
For D metrics, do an inverse transformation through 1/(n+1) so that all metrics have the same direction, i.e., the higher the better. The +1 in the denominator prevents division by zero when n = 0.
Normalize each metric into the [0, 1] range.
This can help us in amalgamating various metrics into one metric or one visual representation (explained later).
Normalization will be done based on all the existing direct dependencies. For example, the dep with the highest downloads will have a 1 rating for downloads and others will have a rating relative to that.
Now this method will certainly not work for a project with a single dependency. However, it can be argued that such a use case does not demand a depdive analysis to begin with. Additionally, we can create a mock super_package that'll have the best possible values for some metrics, such as D metrics, where we know the best value is 0.
However, another downside is that some metrics, like downloads, may have long-tailed distributions dominated by a few crates with very high download counts, e.g., libc. In these cases, we can use the log scale of the metrics before normalization.
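A minimal Rust sketch of the transformations described above, assuming each metric is just a column of f64 values across the direct dependencies (function names are illustrative only):

```rust
// Inverse transformation for D metrics so that higher is better everywhere.
// The +1 in the denominator avoids division by zero when n == 0.
fn invert_d_metric(n: f64) -> f64 {
    1.0 / (n + 1.0)
}

// Scale one metric across all direct dependencies relative to its maximum:
// the dep with the highest value gets 1.0, the rest are rated against it.
// Long-tailed metrics (e.g. downloads) can be log-scaled first.
fn normalize(values: &[f64], use_log_scale: bool) -> Vec<f64> {
    let transformed: Vec<f64> = values
        .iter()
        .map(|&v| if use_log_scale { (v + 1.0).ln() } else { v })
        .collect();
    // All metrics are positive, so 0.0 is a safe starting point for the max.
    let max = transformed.iter().cloned().fold(0.0_f64, f64::max);
    if max == 0.0 {
        // All values are zero; avoid division by zero and give every dep the same rating.
        return vec![1.0; values.len()];
    }
    transformed.iter().map(|&v| v / max).collect()
}
```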
Now, to amalgamate all these metrics into some easily digestible high-level format, two options come to mind:
A weighted sum of the metrics: Cons: determining the weights is a key challenge here. Pros: we can rank the dependencies. Whether anybody actually wants such a ranking is a valid question though! Another use case is that for each dimension we can fix some threshold -- if a crate falls below it, someone probably should take a look at what's happening (see the sketch after the chart sample below).
Radar/spider chart: As our metrics are distributed over four dimensions, a radar chart visually representing how a dependency is doing in all four dimensions can be useful. We'll probably have to introduce some Python tooling for this.
A quick sample from Google Sheets looks like this: (This chart puts all three deps into one, but I was thinking of having an individual chart for each dep so that a quick look is sufficient to tell which dimension a crate may be lacking in.)
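For the weighted-sum option, a minimal sketch, assuming the metrics have already been normalized into [0, 1]; the weights and the threshold are placeholders, not values depdive currently defines:

```rust
// Weighted sum of one dependency's normalized metrics.
// The weights are placeholders; choosing them is the open question above.
fn weighted_score(normalized_metrics: &[f64], weights: &[f64]) -> f64 {
    assert_eq!(normalized_metrics.len(), weights.len());
    let weight_sum: f64 = weights.iter().sum();
    normalized_metrics
        .iter()
        .zip(weights)
        .map(|(m, w)| m * w)
        .sum::<f64>()
        / weight_sum
}

// Per-dimension threshold check: flag a crate for review if it falls
// below the cutoff in any dimension.
fn needs_review(dimension_scores: &[f64], threshold: f64) -> bool {
    dimension_scores.iter().any(|&s| s < threshold)
}
```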
When to generate such a report?
A weekly run: There are two use cases here: i) having an updated overall dep report each week; ii) if some crate is falling below the threshold(!) in some dimension, highlighting it for developers to decide whether a review is needed.
Each time a dependency is added: We can post a comment on the PR with these statistics to help decide whether the dependency is welcome or not!
cc @bmwill, @metajack, @xvschneider