Docs how to organize project #637

Merged
25 commits
97c6662
docs: WIP how to organize devlake projects
KucherenkoSerhiy Aug 31, 2023
d400439
Co-authored-by: Startrekzky <[email protected]>
KucherenkoSerhiy Sep 2, 2023
77889b6
Co-authored-by: Startrekzky <[email protected]>
KucherenkoSerhiy Sep 2, 2023
509cdc4
docs: expanding and structuring use case 2
KucherenkoSerhiy Sep 2, 2023
801149e
docs: alternative diagram format for the use cases 1 and 2
KucherenkoSerhiy Sep 3, 2023
ee5e1a9
Co-authored-by: Startrekzky <[email protected]>
KucherenkoSerhiy Sep 5, 2023
0c1263a
docs: remove unused diagrams
KucherenkoSerhiy Sep 5, 2023
a631027
docs: add "how to navigate to dora", and "observe metrics by project"…
KucherenkoSerhiy Sep 5, 2023
474843f
docs: rearrange section markdown
KucherenkoSerhiy Sep 12, 2023
b145166
Co-authored-by: Startrekzky <[email protected]>
KucherenkoSerhiy Sep 12, 2023
6d0367a
docs: rename doc file
KucherenkoSerhiy Sep 12, 2023
f575881
docs: update project names on screenshots
KucherenkoSerhiy Sep 12, 2023
88a82c3
docs: webhook notes
KucherenkoSerhiy Sep 12, 2023
0866930
Co-authored-by: Startrekzky <[email protected]>
KucherenkoSerhiy Sep 12, 2023
df15180
Co-authored-by: Startrekzky <[email protected]>
KucherenkoSerhiy Sep 16, 2023
589a9cf
docs: use case 2
KucherenkoSerhiy Sep 16, 2023
e3ed2bb
docs: QA regarding webhooks and connections
KucherenkoSerhiy Sep 16, 2023
30989b1
Co-authored-by: Startrekzky <[email protected]>
KucherenkoSerhiy Sep 18, 2023
f55658a
Authored-by: Startrekzky <[email protected]>
KucherenkoSerhiy Sep 18, 2023
51572c1
Co-authored-by: Startrekzky <[email protected]>
KucherenkoSerhiy Sep 18, 2023
e600f77
Co-authored-by: Startrekzky <[email protected]>
KucherenkoSerhiy Sep 23, 2023
999edd6
docs: improve writing, formatting
KucherenkoSerhiy Sep 24, 2023
be75333
Merge branch 'main' of https://github.com/apache/incubator-devlake-we…
KucherenkoSerhiy Sep 24, 2023
1c506e1
docs: move the new guide to 'GettingStarted' location
KucherenkoSerhiy Sep 24, 2023
39d51bb
docs: versioning as v0.19
KucherenkoSerhiy Sep 24, 2023
135 changes: 135 additions & 0 deletions docs/_temp/Installation.md
@@ -0,0 +1,135 @@
---
title: "How to Organize DevLake Projects"
sidebar_position: 1
description: >
How to Organize DevLake Projects
---

## 1. Introduction
A typical team of developers works with `pull requests`, `deployments`, and `incidents` tracked on boards.

Based on these, we want to measure the team's productivity and stability. This is how [DORA](docs/DORA.md) does it:
- Productivity:
  - How often does the team `deploy`? (a.k.a. [Deployment Frequency](docs/Metrics/DeploymentFrequency.md))
  - How fast are `pull requests` resolved? (a.k.a. [Lead Time](docs/Metrics/LeadTimeForChanges.md))
- Stability:
  - How many `incidents` per `deployment` does the team have? (a.k.a. [Change Failure Rate](docs/Metrics/CFR.md))
  - How fast are these `incidents` resolved? (a.k.a. [Median Time to Restore](docs/Metrics/MTTR.md))

All these questions/metrics are based on either `pull requests`, `deployments`, or `incidents`.
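
To make the four questions concrete, here is a minimal Python sketch that computes them from plain lists of timestamps. It is a simplification for illustration only, not DevLake's actual calculation: the class and function names are hypothetical, and the change failure rate is approximated as incidents per deployment, as phrased in the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class PullRequest:
    opened_at: datetime
    deployed_at: datetime  # assumption: when the change reached production


@dataclass
class Incident:
    created_at: datetime
    resolved_at: datetime


def hours(delta: timedelta) -> float:
    """Convert a timedelta to fractional hours."""
    return delta.total_seconds() / 3600


def deployment_frequency(deployments: list[datetime], period_days: int) -> float:
    """Average number of deployments per day over the period."""
    return len(deployments) / period_days


def median_lead_time_hours(prs: list[PullRequest]) -> float:
    """Median time from opening a pull request to deploying it."""
    return median(hours(pr.deployed_at - pr.opened_at) for pr in prs)


def change_failure_rate(incidents: list[Incident], deployments: list[datetime]) -> float:
    """Incidents per deployment (simplified, see the note above)."""
    return len(incidents) / len(deployments)


def median_time_to_restore_hours(incidents: list[Incident]) -> float:
    """Median time from an incident being created to being resolved."""
    return median(hours(i.resolved_at - i.created_at) for i in incidents)
```

Every function takes only `pull requests`, `deployments`, or `incidents` as input, which is why grouping those three entities is enough to define what gets measured. The linked metric pages remain the reference for the exact definitions.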

But when we scale this up, a few problems arise:
- A team usually works with multiple `repositories`
- A team might also work on different projects, and we want to measure these projects separately (e.g. working on a large legacy codebase is not the same as working on a greenfield project)
- There may be multiple teams
- A `board` may contain incidents of multiple teams or projects
- A `repository` may be managed by multiple teams or projects, e.g. a monorepo
- A `pipeline` can trigger deployments in multiple repositories
- Some organizations want to measure DORA by project, and some want to measure it by team

This is where the `project` concept comes into play.

## 2. What is a DevLake project?
In the real world, a project is something being built and/or researched to solve a problem or to break new ground.
In software development, a project is just a grouping of something. In DevLake, a `project` is a grouping of `pull requests`, `deployments`, or `incidents`.
Contributor

I think we can make the definition straightforward, something like:

"A DevLake project is a grouping of pull requests, deployments, or incidents. It can be seen as a real-world project or product line. DevLake measures DORA metrics for each project."


![](project_simple.png)
Contributor

I came up with a simple version of the project pic, it introduced the idea of domains, i.e. repos, boards and cicd_scopes based on the original pic. What do you think?
project

Contributor Author

It looks simple and gives a pipeline-oriented view of what is happening. Since DevLake carries much of the pipeline nature, it is accurate.


## 3. As a team lead, how many DevLake projects do I need?

Because the concept is simple, it is flexible: you decide how to group `pull requests`, `deployments`, and `incidents`,
whether by project, team, technology, or any other criterion.

The examples below show common patterns for organizing your projects.

### 3.1. Use case 1: One `board` and multiple `repos` per team

Imagine a team that develops 2 `projects` with one `board` and multiple `repositories`.
The first `project` consists of 3 `repositories`, one of which receives most of the work.
The second `project` has only 2 `repositories`, which receive roughly equal amounts of work.
The structure will look like the following:

![](project_use_case_1.png)
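
The same grouping can be written down as data. The sketch below is illustrative only: `Project`, the repository names, and `team-board` are hypothetical, and it assumes the single board is attached to both DevLake projects. Deployments are left out because the description above does not specify them.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Project:
    """A DevLake project seen as a plain grouping of scopes (hypothetical model)."""
    name: str
    boards: list[str] = field(default_factory=list)
    repos: list[str] = field(default_factory=list)


# Assumption: the team's only board is attached to both projects.
project_1 = Project("project-1", boards=["team-board"],
                    repos=["repo-1", "repo-2", "repo-3"])
project_2 = Project("project-2", boards=["team-board"],
                    repos=["repo-4", "repo-5"])


def shared_scopes(projects: list[Project]) -> set[str]:
    """Return the scopes that appear in more than one project."""
    counts = Counter(
        scope
        for project in projects
        for scope in set(project.boards) | set(project.repos)
    )
    return {scope for scope, n in counts.items() if n > 1}
```

Here `shared_scopes([project_1, project_2])` contains only the board, which makes explicit that the repositories are what distinguish the two projects in this use case.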
Contributor

I looked at this pic and I can understand it, but I'm worried that it might be a bit complicated for users. I have a few thoughts for your reference:

  1. Make all use cases more real-life. E.g, use 2 Jira boards, 3 GitLab repos, and Deployments are executed via GitLab CI in each GitLab repo to replace the current description and the wordings in the picture.
    image

  2. Convert the single image to several steps according to the Config UI workflow to reduce the cognitive load so that users can follow it step by step. An example for the pic:
    image

Contributor Author

Yes, it is complex. Thank you for pointing out some possible ways to break it down, that is what I was looking for.

I'd rather use a different CI/CD platform to avoid possible missing points based on that you can configure both pull requests and deploys with a single connection in the case of GitLab. Using a single connection for multiple entities is a great feature, but as an example, it hides the point of seeing pull requests and deploys as completely separate entities.

Contributor

@KucherenkoSerhiy You can use CI/CD platform for sure. What I was thinking was that in some use cases, we can use CI/CD platform: GitLab CI; while in others we can use CI/CD platform: Jenkins or CI/CD platform CircleCI(via webhook). The idea is to make them more close to real life. I hope it makes sense.

Contributor

I re-thought about the annotations here. I think after we introduce the toolchains and how incidents, repos, and deployments are organized in each use case, we can

  1. First, add a text summary describing how many DevLake projects should be created.
  2. Then, show which boards (for incidents), repos (for pull_request), and deployments belong to each project.
  3. Then, add a picture with steps showing how to configure the projects in DevLake.

I think this flow will be easier to understand. Looking forward to your feedback.


Note that:
- The same pattern applies to more teams and projects
- If instead there were 2 teams working on 1 project, the structure remains the same (apart from renaming the DevLake project)
- It does not matter if a particular repository is touched more than the others. Here is why: [Debugging DORA Issue Metrics](docs/Troubleshooting/Dashboard.md#debugging-dora-issue-metrics)

#### Zooming in on data collection
There are a few steps we must complete before we get our metrics.
- First, we create a `connection`
- Then we use that `connection` for our project, defining its `scope`.
  In this step we also specify its `transformation` to tell DevLake the format of our data
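
The relationship between the three terms can be sketched as a small data model. This is not DevLake's API or schema: every class, field, and name below is hypothetical, and the transformation is reduced to a single regular expression that tells deployment jobs apart from other CI jobs.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Access to one tool instance (hypothetical model)."""
    name: str
    plugin: str


@dataclass(frozen=True)
class Transformation:
    """How raw data should be interpreted (hypothetical, reduced to one rule)."""
    deployment_pattern: str  # regex matched against CI job names


@dataclass(frozen=True)
class Scope:
    """What to collect through a connection, and how to interpret it."""
    connection: Connection
    name: str
    transformation: Transformation


def is_deployment(job_name: str, scope: Scope) -> bool:
    """Classify a CI job as a deployment using the scope's transformation."""
    return re.search(scope.transformation.deployment_pattern, job_name) is not None


ci = Connection(name="team-ci", plugin="jenkins")
pipeline = Scope(connection=ci, name="app-pipeline",
                 transformation=Transformation(deployment_pattern=r"deploy"))
```

With this model, `is_deployment("deploy-to-prod", pipeline)` is true and `is_deployment("unit-tests", pipeline)` is false. The order of construction mirrors the steps: the connection exists first, then a scope is defined on it together with its transformation.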

With this in mind, here is what this example looks like now:
![](use_case_1_alternative.png)

TODO: - screenshots on how that should look on DevLake

### 3.2. Use case 2: Multiple `boards`, shared `repos`

#### Conditions
- 2 teams develop a main app
- Each team uses `X boards` for requirements, but also shares `Y boards` for bugs and incidents
- Each team maintains `X repos` for the main app, but also shares some `Y repos` for libraries
- Each team has its own `deployments` for the main app

#### Interpreting

Let's start by translating these conditions into `pull requests`, `deployments`, and `incidents`.
Looking at them one by one, we find that we have:
- Shared `boards` for `incidents`
- Individual and shared `repos`
- Individual `deployments`

#### Structuring
Since we have only one project but two teams, we should create a DevLake `project`
for each team, to keep the DevLake `projects` atomic.
- Note: every time we split a team or a project, the existing DevLake `project` that reflects
it should also be split

For `boards` and `incidents` we need a way to split them between teams. To do so, we must create 2 connections (1 for each team)
and specify their scope. DevLake still allows looking at them combined in Grafana, so the split won't be a problem.

For `repos` we should have 1 connection for the individual `repos` of each team,
and 1 for the shared set of `repos`, for a total of 3 connections.

All `deployments` are individual, so a connection per team should suffice.
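
As a quick check of the counts above, the connections can be listed and tallied. The sketch is illustrative: the `Connection` class and all names are hypothetical, with `A`, `B`, and `AB` prefixes marking the owning team or a shared set.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """One connection per distinct set of entities (hypothetical model)."""
    name: str
    domain: str  # "boards", "repos", or "deployments"


connections = [
    Connection("A-boards", "boards"),
    Connection("B-boards", "boards"),
    Connection("A-repos", "repos"),
    Connection("B-repos", "repos"),
    Connection("AB-repos", "repos"),  # shared libraries
    Connection("A-deployments", "deployments"),
    Connection("B-deployments", "deployments"),
]

# One DevLake project per team; the shared repos connection is used by both.
projects = {
    "Team A": ["A-boards", "A-repos", "AB-repos", "A-deployments"],
    "Team B": ["B-boards", "B-repos", "AB-repos", "B-deployments"],
}

per_domain = Counter(c.domain for c in connections)
shared = set(projects["Team A"]) & set(projects["Team B"])
```

`per_domain` comes out as 2 connections for boards, 3 for repos, and 2 for deployments, and `shared` contains only the shared repos connection.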

#### General advice

There are 3 red lines when it comes to structuring your DevLake `projects`:
- We must look at `repos`, `board incidents` and `deployments` separately, one by one.
They are **independent** entities first, and only then related to each other.
- Have a DevLake `project` for each `team`, project, or application that you want to study individually
- Every time we have a distinct set of `repos`, `board incidents` or `deployments`, we should have
a separate connection just for that set, so that combining them is not a problem


#### Diagram
To give things names, we will use `Team A` and `Team B`.
All the `repos`, `incidents`, and `deployments` are prefixed with `A` or `B` to show which team they belong to.
Shared entities are prefixed with `AB`.

The structure should look like the following:
![](use_case_2_alternative.png)

Extending the case:
- TODO: assume we have a third team

### 3.3. What am I looking for with DORA?
TODO: explain right and wrong ways to use DORA

## 4. How do we organize projects when there is data from multiple connections?
TODO

### 4.1. Webhooks
TODO

## 5. How do I know if the data of a project is successfully collected?
TODO

## 6. How can I observe metrics by project?
TODO
Binary file added docs/_temp/project_simple.png
Binary file added docs/_temp/project_use_case_1.png
Contributor

I think the diagram is much simpler. I only have few thoughts on the wordings:

  1. To make the wording consistent, if Jira Board A is used, we should use GitHub Repo 1, Jenkins Job 1 instead.
  2. To make the example more real, we can add a caption under the Scopes, such as adding "Project A's features" under "Jira Board A", "Project A's main app" under "GitHub Repo1", "Shared libraries" under "Repo2", etc.

Contributor Author

Hi @Startrekzky, Thank you once again for the corrections. I am not sure I understand the second point, but I think I get it. Let me know your thoughts in any case!

Binary file added docs/_temp/use_case_1_alternative.png
Binary file added docs/_temp/use_case_2_alternative.png