Release / activity stats to identify collections which need to be released / supported / provided with new maintainers #25

Open
Andersson007 opened this issue Mar 17, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@Andersson007

Andersson007 commented Mar 17, 2021

It's a living issue. Goals / metrics can be changed at any time.
Relates to #23

Goal

Track releases and activity in the collections, find new maintainers for them, and revoke privileges from inactive maintainers.

Issue

At the moment, we have 80+ collections under ansible-collections. Before the split, users got merged fixes and new features shipped, slowly but surely. Now we should monitor the collections and prevent situations where a collection has merged changes that are not released, or has changes waiting to be merged but no active committers / maintainers.

Several possible scenarios
  1. A collection gets things merged regularly and releases them regularly - so, everything is OK, the collection is fully maintained ("fully" means that there are active committers and they release the collection).
    What we should do: keep tracking the activity there

  2. A collection gets things merged regularly but there have been no releases for a long time.
    What we should do:
    a) check if there's a release policy in the collection
    b) ask the committers why they don't release the collection - no time or they need to be trained
    c) release the collection ourselves / train the committers how to conduct releases

  3. A collection doesn't get things merged and has no releases, but new PRs have been submitted since the latest release (which is at least 1.0.0).
    What we should do:
    a) conduct a PR day
    b) release ourselves
    c) find maintainers from the community (first of all, from active contributors)
    d) revoke privileges from inactive maintainers

  4. A collection doesn't get things merged, no releases, no PRs submitted.
    What we should do:
    a) if there are maintainers, ask them if the content of the collection is still relevant for users
    b) we could also use the number of open issues per month as a measure of relevance (see a)
    c) if relevant, maybe the collection is new and nobody knows about it. Should we announce it through the available channels?
    d) if irrelevant (e.g. the underlying service is dead), should we do nothing?
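
The four scenarios above could be told apart automatically. The following is a minimal sketch, not an agreed implementation: the `Activity` fields, the `classify` name, and the idea of counting over a fixed review window are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """Counts for one collection over a review window (window length is an assumption)."""
    merged_prs: int  # PRs merged within the window
    releases: int    # tags / releases published within the window
    new_prs: int     # PRs submitted since the latest release


def classify(activity: Activity) -> int:
    """Return the scenario number (1-4) from the list above."""
    if activity.merged_prs > 0:
        # Things get merged: scenario 1 if they are also released, otherwise 2.
        return 1 if activity.releases > 0 else 2
    # Nothing merged. The list does not cover "releases without merges",
    # so this sketch decides on submitted PRs only.
    if activity.new_prs > 0:
        return 3
    return 4
```

For example, `classify(Activity(merged_prs=12, releases=0, new_prs=3))` falls into scenario 2 (merged but not released).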

What would be helpful to see

Dashboard(s) with:

  • Graph per collection: x - timeline (months), y - number of:
    • PRs open per month
    • PRs merged per month (or commits which can be fetched from GH API)
    • tags / releases per month
    • open issues per month
  • Table containing commit / PR authors to identify new potential maintainers:
    • name
    • counter
    • last commit timestamp
  • Table containing commit / PR committers to see current committers / most active committers
    • name
    • counter
    • last commit timestamp
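
The two tables above share the same shape (name, counter, last commit timestamp), so one aggregation could serve both. A rough sketch, assuming the input is a list of `(name, timestamp)` pairs already extracted from commits or PRs; the function name `contributor_table` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def contributor_table(commits):
    """Aggregate (name, timestamp) pairs into rows of (name, counter, last commit)."""
    counts = defaultdict(int)
    last_seen = {}
    for name, ts in commits:
        counts[name] += 1
        if name not in last_seen or ts > last_seen[name]:
            last_seen[name] = ts
    rows = [(name, counts[name], last_seen[name]) for name in counts]
    # Most active first; ties are broken alphabetically for a stable order.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


# Made-up sample data, for illustration only.
sample = [
    ("alice", datetime(2021, 1, 5)),
    ("bob", datetime(2021, 2, 1)),
    ("alice", datetime(2021, 3, 1)),
]
table = contributor_table(sample)
```

Feeding it commit authors gives the "potential maintainers" table; feeding it committers gives the "current committers" table.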
What would help see problem spots
  • Alerts in the dashboard like red banners when a collection reaches some threshold like a time delta between last release / tag and now, etc.
  • Email notifications when the thresholds are reached.
What METRICS to collect / calculate

(with fields described above and all per month)

  • PRs open
  • PRs / commits merged
  • Issues open
  • Tags / releases
  • PR / commit authors
  • PR / commit committers
  • time delta between last release / tag and last commit
  • time delta between last release / tag and now
  • number of commits merged between last release / tag and now
  • number of open issues / PRs between last release / tag and now
  • something else
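
Most of the per-month counters above reduce to bucketing timestamps by month, and the "since last release" counters to a simple filter. A small sketch under the assumption that event timestamps are already available as `datetime` objects; `per_month` and `commits_since` are illustrative names only.

```python
from collections import Counter
from datetime import datetime


def per_month(timestamps):
    """Count events per calendar month, keyed as 'YYYY-MM'."""
    return Counter(ts.strftime("%Y-%m") for ts in timestamps)


def commits_since(last_release, commit_times):
    """Number of commits made strictly after the last release / tag."""
    return sum(1 for ts in commit_times if ts > last_release)


# Made-up sample data, for illustration only.
merged = [datetime(2021, 1, 5), datetime(2021, 1, 20), datetime(2021, 3, 1)]
monthly = per_month(merged)
unreleased = commits_since(datetime(2021, 1, 10), merged)
```

The same `per_month` call works for opened PRs, opened issues, and tags, as long as each is reduced to a list of timestamps first.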
@gundalow gundalow added the enhancement New feature or request label Mar 17, 2021
@GregSutcliffe
Contributor

Thanks @Andersson007, this is good stuff. I especially like seeing the thinking around how you'll use the data, and what scenarios it might enable you to detect - this helps me work out some visualization.

In terms of the data, we already have most of this. Issues & PRs are indexed daily by ansible-community/stats-crawler, and we can extend that where needed. We don't have tags, but they are a single GH API query per repo, which is light (a single authenticated GH key can make 5000 requests per hour). So this seems easily achievable.
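
As a rough illustration of the "single GH API query per repo" for tags, the sketch below builds a request against the REST endpoint `GET /repos/{owner}/{repo}/tags` and extracts tag names from the JSON response. The helper names and the token handling are assumptions, and this is not how stats-crawler does it; pagination beyond the first 100 tags is deliberately left out.

```python
import json
import urllib.request


def build_tags_request(owner, repo, token):
    """Build an authenticated request for the first page of a repo's tags."""
    url = f"https://api.github.com/repos/{owner}/{repo}/tags?per_page=100"
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    }
    return urllib.request.Request(url, headers=headers)


def parse_tag_names(body):
    """Extract tag names from the JSON array returned by the tags endpoint."""
    return [tag["name"] for tag in json.loads(body)]


def fetch_tag_names(owner, repo, token):
    """Perform the request; one call counts once against the hourly rate limit."""
    with urllib.request.urlopen(build_tags_request(owner, repo, token)) as resp:
        return parse_tag_names(resp.read().decode("utf-8"))
```

Note that this endpoint returns tag names and commit SHAs but no dates, so dating a tag would need a further lookup of the tagged commit.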

Many of the graphs you want are already visible at https://stats.eng.ansible.com/app/collections_dash although I accept the presentation could be improved. In particular, releases are currently derived from Galaxy and shown on a separate tab. Merging this into a single graph would probably help (for example, marking the releases as vertical lines on a plot of issues/PRs). I'd love to hear what you'd like improved there - I'll also work on a static version of this kind of thing for our team's weekly reports.

Regarding alerts, my feeling is that the threshold might vary widely between collections - some will be super stable, some less so. We'll need to think carefully about this one; in the meantime, perhaps a simple table of time-since-last-release and time-since-last-commit (and perhaps, the difference of the two) would allow us to at least look at an overview of the situation?
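
The simple overview table suggested above could look roughly like this. It is only a sketch: the row layout, the use of whole days, and the function names are assumptions, and the sample collections are made up.

```python
from datetime import datetime


def overview_row(name, last_release, last_commit, now):
    """Return (name, days since release, days since commit, difference)."""
    since_release = (now - last_release).days
    since_commit = (now - last_commit).days
    # A large positive difference means commits kept landing after the release.
    return (name, since_release, since_commit, since_release - since_commit)


def overview(collections, now):
    """Build rows for all collections, longest-unreleased first."""
    rows = [overview_row(n, rel, com, now) for n, rel, com in collections]
    rows.sort(key=lambda row: -row[1])
    return rows


# Made-up sample data, for illustration only.
now = datetime(2021, 3, 17)
rows = overview(
    [
        ("example.fresh", datetime(2021, 3, 1), datetime(2021, 3, 10), now),
        ("example.stale", datetime(2020, 12, 17), datetime(2021, 3, 7), now),
    ][0:0] or [
        ("example.fresh", datetime(2021, 3, 1), datetime(2021, 3, 10)),
        ("example.stale", datetime(2020, 12, 17), datetime(2021, 3, 7)),
    ],
    now,
)
```

Sorting by time since release puts the collections most likely to need attention at the top, without committing to any alert threshold yet.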

Final note, when you say commits are available via the GH API, I assume you mean https://docs.github.com/en/rest/reference/repos#list-commits. I'll have a play with that, we can likely make some light GraphQL calls that just return the last commit for every collection at once...
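
One way the "last commit for every collection at once" idea might be expressed is a single GraphQL document with one aliased `repository` field per collection. The sketch below only builds the query text and the POST payload for `https://api.github.com/graphql`; the helper names and aliases (`r0`, `r1`, ...) are assumptions, and sending the request is left out.

```python
import json


def last_commit_query(repos):
    """Build one GraphQL query asking for the latest default-branch commit of each repo."""
    parts = []
    for i, (owner, name) in enumerate(repos):
        # json.dumps gives a safely double-quoted string literal.
        parts.append(
            f"  r{i}: repository(owner: {json.dumps(owner)}, name: {json.dumps(name)}) {{\n"
            "    nameWithOwner\n"
            "    defaultBranchRef { target { ... on Commit { oid committedDate } } }\n"
            "  }"
        )
    return "query {\n" + "\n".join(parts) + "\n}"


def graphql_payload(repos):
    """JSON body to POST to the GraphQL endpoint."""
    return json.dumps({"query": last_commit_query(repos)})


query = last_commit_query(
    [
        ("ansible-collections", "community.general"),
        ("ansible-collections", "community.mysql"),
    ]
)
```

With 80+ collections the document gets long, so it may be worth splitting the list into a few batches rather than sending a single very large query.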

/cc @gundalow

@Andersson007
Author

@GregSutcliffe what you wrote above sounds very good!

> Regarding alerts, my feeling is that the threshold might vary widely between collections - some will be super stable, some less so.

We could define, say, a 3-month time delta as the default, then adjust where needed (we should have individual settings per collection for that, thanks for the idea :) ). Anyway, the notifications / alerts are not necessary but would be good to have. I could try to implement them myself later.
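
A default threshold with per-collection settings could be as small as the sketch below. Everything here is an assumption for illustration: "3 months" is approximated as 90 days, the override table and the collection name in it are hypothetical, and `needs_alert` is not an existing function anywhere.

```python
from datetime import datetime, timedelta

# "3 months" approximated as 90 days (assumption).
DEFAULT_THRESHOLD = timedelta(days=90)

# Hypothetical per-collection overrides for very stable collections.
OVERRIDES = {"example.stable": timedelta(days=365)}


def needs_alert(collection, last_release, now, overrides=OVERRIDES):
    """True when the time since the last release / tag exceeds the threshold."""
    threshold = overrides.get(collection, DEFAULT_THRESHOLD)
    return now - last_release > threshold
```

For instance, a collection last released on 2020-11-01 would trigger an alert on 2021-03-17 under the default, while the same dates for `example.stable` would not.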
