From b49d7fc09fef70b63f692904593c612a6b41d1e9 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Thu, 24 Nov 2022 12:36:29 +0400 Subject: [PATCH 01/12] Write the monorepo RFC --- rfcs/20221124-monorepo.md | 178 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 178 insertions(+) create mode 100644 rfcs/20221124-monorepo.md diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md new file mode 100644 index 00000000000..8e61607365f --- /dev/null +++ b/rfcs/20221124-monorepo.md @@ -0,0 +1,178 @@ +↖️ Table of Contents + +# RFC: Monorepo + +**Status:** 🚧 WIP, comments are welcome nonetheless + +## Reviewers + +- [ ] <your name here> +- [ ] <your name here> + +## Rationale + +For a comprehensive discussion about the pros, the cons and the counterpoints to each see [discussion](https://github.com/WordPress/openverse/issues/192). This is not the purpose of this RFC. + +This RFC summarily lists the benefits and then, with the twin assumptions of a monorepo being ultimately beneficial and the decision to migrate being finalised in the above discussion, proceeds to go into the implementation details. + +### Exclusive benefits of monorepo + +This only includes things that cannot be accomplished without the use of a monorepo. + +1. Single place to go for issues, PRs and all activity. Currently tickets are scattered across several repos, and any tickets that could benefit more than a single layer must be opened in each of the different repos. + +1. Singular copy (different from synced independent copies) of scaffolding code such as Git hooks, lint rules and common workflows. This is distinctly better than elaborate sync workflows. + +1. Central place for all technical documentation, enabling documentation for different parts of the stack to cross-reference other pieces and stay current with changes in other places. + +1. Enables the infra to deploy the code to coexist with the code itself. 
Apart from the private secrets that will still need to be encrypted, the IaC files could be organised identical to the code. + +1. Milestones that can span across multiple layers of the stack are only possible in GitHub. This is a limitation imposed by GitHub and there is no workaround for this. + +The [integration](#step-4-integration) section in the latter part of the document describes more interesting outcomes made possible by the monorepo. They may or may not be exclusive to monorepos but they're surely made easier by it. + +## Migration path + +First we will merge the API and the frontend. This decision was made for the following reasons. + +1. API and frontend are tightly linked. The frontend is a direct consumer of what the API produces. + +1. The API and frontend form the "service" side of Openverse that directly faces the users (both API consumers and Search engine users). + +1. The frontend uses ECS deployments and the API is well on the same track. This makes it possible for them to share some deployment code. + +1. To the RFC author, the API and frontend are very familiar so merging them would be easier. Adding a third component would make the task daunting. + +1. Merging incurs a productivity hit for the initial transition. So merging everything in one swoop is not ideal. + +1. The API’s comprehensive tooling for developer documentation can benefit frontend devs and create a unified docs site for contributors. + +1. The API is already organised by stack folders so the `frontend/` directory will fit right in with the others like `api/` and `ingestion_server/`. + +1. The API and frontend share identical tooling for Git hooks, linting and formatting. We will fight our tools less and encounter minimal friction. + + - In fact, we employ a number of hacks to install and configure pre-commit for the frontend. Merging it with the API eliminates the need for such hacks. + +1. The entire system can be integration tested during releases. 
The real API, populated with test data, can even replace the Talkback server. + +The `WordPress/openverse-api` repo will absorb the `WordPress/openverse-frontend` repo. The `WordPress/openverse-catalog` will also be merged, _later_. + +### Reference + +I'm following the steps listed below in a fork at [@dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse/). You can refer to the fork, but note that it is a comes from a place of haste and has not been treated with the same level of love and care that the final treatment will receive. + +### Step 0: Prerequisites + +#### Get the timing right + +The first step will be to release the frontend, call a code freeze and pause work on it. This is to prevent the frontend repo from continuing to drift as we merge a snapshot of it with the API. + +This can prove difficult given how productive our team is, so we will need to channel this productivity towards the catalog in the meantime. I can foresee the end-to-end migration taking one week (ideal scenario) to one fortnight (worst case scenario). + +### Step 1: Merge with histories + +This is a quick process. + +1. Move the entire content of frontend inside a `frontend/` directory, except the following top-level files and folders. Please comment if you can add to this list. + + - `.github/` + - `.editorconfig` + - `justfile` + - `.pre-commit-config.yaml` + - `.prettierignore` (symlink into the `frontend/` directory) + - `.eslintrc.js` (symlink into the `frontend/` directory) + - `.eslintignore` (symlink into the `frontend/` directory) + - `.gitignore` (better to move it into the `frontend/` directory and update some absolute paths) + +1. Create the final commit on the `WordPress/frontend` repo. After the merge we might want to add a notice about the migration to the `README.md` file but GitHub's built-in archival process could suffice here. + +1. 
Merge this repo's `main` branch into the `WordPress/openverse-api` repo's `main` branch (see Git docs for `--allow-unrelated-histories`). There will be some conflicts but they will be small and infrequent. [[implementation details](#conflict-resolution)] + +1. Create "stack: \*" labels to help with issue and PR management. Spoiler/foreshadowing: these labels will be used for more things later. + +1. Migrate issues from `WordPress/openverse-frontend` to `WordPress/openverse-api`. @obulat's has done prior work in this department (when we migrated from CC Search to Openverse) but that might not be as useful because in this case, we can directly transfer the issues, retaining all their comments. Apply the "stack: frontend" label to them. [[implementation details](#issue-transfer)] + +With this done, we can archive the frontend repo. + +#### Conflict resolution + +The following conflicts may occur during merge. + +- `.prettierignore`: concatenate +- `.pre-commit-config.yaml`: use from [dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse) +- Workflows can conflict but they can be renamed and kept alongside each other, _for now_. + +#### Issue transfer + +As far as I can tell, issue transfer can only be performed via the GitHub GraphQL API ([docs](https://docs.github.com/en/graphql/reference/mutations#transferissue)) and not via the REST API. From my limited testing, transferred issues seem to retain labels (provided they exist in the target repo). + +An implementation of the GraphQL API call (albeit in Ruby) is available in `hub` and the [code for it](https://github.com/github/hub/commit/4c2e44146988dfb385a26f649298f274a5017756) is available in their GitHub repo for reference. + +However, instead of writing the code ourselves, we can install `hub` and run a small script that repeatedly calls `hub` to migrate each issue one by one. That'll be a hack but it's okay since this is a one-off use for this. + +### Step 2. 
Restore workflows + +The workflows of both the API and frontend will need some refactoring to start, and pass, again. [monopenverse](https://github.com/dhruvkb/monopenverse) has updated these workflows and the following work. + +[monopenverse](https://github.com/dhruvkb/monopenverse) showcases a `setup-env` action that sets up Node.js, Python, Just and other dependencies and can be used in every workflow. + +The `ci_cd.yml` workflow from the API has been very nicely combined with the `ci.yml` workflow from the frontend. Redundant steps were eliminated. + +The following actions have been successfully combined: + +- actionlint ✅ +- bundle_size.yml ✅ +- ci_cd.yml (API) + ci.yml (frontend) [merged] ✅ +- Playwright tests from ci.yml (frontend) ✅ +- draft_release.yml ✅ +- generate_pot.yml ✅ +- gh_pages.yml ✅ +- migration_safety_warning.yml ✅ +- subscribe_to_label.yml ✅ +- label_new_pr.yml ✅ +- pr_closed.yml ✅ +- pr_label_check.yml ✅ +- new_issues.yml ✅ +- pr_ping.yml ✅ + +The following have not been verified to work: + +- renovate.yml +- rollback.yml +- ghcr.yml +- push_docker_image.yml + +With this done, the development on the frontend can continue inside the subdirectory. + +### Step 3. Buff the rough edges + +There will be a few rough edges that I cannot foresee and we can continuously fix those as we spot them. But up to this point we should be in a position where +we can continue to build the API and the frontend independently but from one repo. + +1. The action `banyan/auto-label` will need to be configured (`auto-label.json`) to add the "stack: \*" labels based on the modified directory. + +### Step 4. Integration + +This is the long term combination of code for the frontend and the API. + +#### Combined lint + +All lint steps can be combined in `.pre-commit-config.yaml`. This also simplifies the CI jobs can now be merged. + +See the combined lint in action at [monopenverse](https://github.com/dhruvkb/monopenverse). + +### Step 5. 
Documentation merge + +The following documentation files will need reorganisation or merge. + +- README.md (both repos) +- CODE_OF_CONDUCT.md (both repos) +- CONTRIBUTING.md (both repos) +- CONTRIBUTORS.md (API only; also why?) +- DOCUMENTATION_GUIDELINES.md (API only) +- TESTING_GUIDELINES.md (frontend only) +- DEPLOYMENT.md (frontend only) + +I will need more information about this because IANAL. + +- LICENSE (both repos) From 3b8caba0556c53506711f8f94a3ad58d8066023d Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Thu, 1 Dec 2022 09:47:06 +0400 Subject: [PATCH 02/12] Fix typo in repository name Co-authored-by: sarayourfriend <24264157+sarayourfriend@users.noreply.github.com> --- rfcs/20221124-monorepo.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md index 8e61607365f..328a2cf5a46 100644 --- a/rfcs/20221124-monorepo.md +++ b/rfcs/20221124-monorepo.md @@ -84,7 +84,7 @@ This is a quick process. - `.eslintignore` (symlink into the `frontend/` directory) - `.gitignore` (better to move it into the `frontend/` directory and update some absolute paths) -1. Create the final commit on the `WordPress/frontend` repo. After the merge we might want to add a notice about the migration to the `README.md` file but GitHub's built-in archival process could suffice here. +1. Create the final commit on the `WordPress/openverse-frontend` repo. After the merge we might want to add a notice about the migration to the `README.md` file but GitHub's built-in archival process could suffice here. 1. Merge this repo's `main` branch into the `WordPress/openverse-api` repo's `main` branch (see Git docs for `--allow-unrelated-histories`). There will be some conflicts but they will be small and infrequent. 
[[implementation details](#conflict-resolution)] From f2122d8b523dd7a559aa5840855db5f508fbc7a4 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Thu, 1 Dec 2022 09:59:35 +0400 Subject: [PATCH 03/12] Update incorrect info about reorganised files --- rfcs/20221124-monorepo.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md index 328a2cf5a46..8f50e8d4f10 100644 --- a/rfcs/20221124-monorepo.md +++ b/rfcs/20221124-monorepo.md @@ -79,9 +79,10 @@ This is a quick process. - `.editorconfig` - `justfile` - `.pre-commit-config.yaml` - - `.prettierignore` (symlink into the `frontend/` directory) - - `.eslintrc.js` (symlink into the `frontend/` directory) - - `.eslintignore` (symlink into the `frontend/` directory) + - `prettier.config.js` + - `.prettierignore` + - `.eslintrc.js` (need to update references in nested `eslintrc.js` files) + - `.eslintignore` - `.gitignore` (better to move it into the `frontend/` directory and update some absolute paths) 1. Create the final commit on the `WordPress/openverse-frontend` repo. After the merge we might want to add a notice about the migration to the `README.md` file but GitHub's built-in archival process could suffice here. 
From d619394ff98d1cfda69deaaa7dac9b1e9e3579f0 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Fri, 2 Dec 2022 13:57:55 +0400 Subject: [PATCH 04/12] Add required reviewers as suggested by @zackkrida Co-authored-by: Zack Krida --- rfcs/20221124-monorepo.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md index 8f50e8d4f10..6e75a37e098 100644 --- a/rfcs/20221124-monorepo.md +++ b/rfcs/20221124-monorepo.md @@ -6,8 +6,8 @@ ## Reviewers -- [ ] <your name here> -- [ ] <your name here> +- [ ] @zackkrida +- [ ] @sarayourfriend ## Rationale From 6ef0168ef9ec9297d3825924f229b2953de9dd65 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Wed, 7 Dec 2022 16:44:36 +0400 Subject: [PATCH 05/12] Expand the benefits of monorepo --- rfcs/20221124-monorepo.md | 44 ++++++++++++++++++++++++++++++++------- 1 file changed, 37 insertions(+), 7 deletions(-) diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md index 6e75a37e098..21b21fc3b57 100644 --- a/rfcs/20221124-monorepo.md +++ b/rfcs/20221124-monorepo.md @@ -15,19 +15,49 @@ For a comprehensive discussion about the pros, the cons and the counterpoints to This RFC summarily lists the benefits and then, with the twin assumptions of a monorepo being ultimately beneficial and the decision to migrate being finalised in the above discussion, proceeds to go into the implementation details. -### Exclusive benefits of monorepo +### Benefits of monorepo -This only includes things that cannot be accomplished without the use of a monorepo. +1. Single place for everything: -1. Single place to go for issues, PRs and all activity. Currently tickets are scattered across several repos, and any tickets that could benefit more than a single layer must be opened in each of the different repos. + **Current criticism:** We currently have many repos, and issues and PRs spanning all of them. 
While this makes it easier for us as maintainers to focus our efforts, it's not easy for contributors. Let's say you were a new contributor looking for good-first Python issues. We shouldn't expect them to search in 3 repos `openverse`, `openverse-api` and `openverse-catalog`. We address this by making tools like Overvue or using search terms like this: -1. Singular copy (different from synced independent copies) of scaffolding code such as Git hooks, lint rules and common workflows. This is distinctly better than elaborate sync workflows. + ``` + is:open is:issue repo:WordPress/openverse-catalog repo:WordPress/openverse repo:WordPress/openverse-api repo:WordPress/openverse-frontend repo:WordPress/openverse-infrastructure sort:updated-desc + ``` -1. Central place for all technical documentation, enabling documentation for different parts of the stack to cross-reference other pieces and stay current with changes in other places. + The search term above perfectly illustrates the problem: we forgot about the extension. It's unwieldy and hard to quickly reach and share. -1. Enables the infra to deploy the code to coexist with the code itself. Apart from the private secrets that will still need to be encrypted, the IaC files could be organised identical to the code. + **Monorepo solution:** We could use GitHub's own filters to narrow down what we're looking for. -1. Milestones that can span across multiple layers of the stack are only possible in GitHub. This is a limitation imposed by GitHub and there is no workaround for this. +1. Meta-issues: + + **Current criticism:** If an issue spans more than a single layer of the stack, we need to open a meta issue in `WordPress/openverse`, open sub-issues in each of the different repos, then manually close meta issues after the sub-issues are closed. Same goes for PRs. We make individual PRs for every layer and then have to cross-reference them so that reviewers can see the full picture. 
Meta issues are good when work needs to be split into subtasks, but they are not good for cross-repo work splitting, especially when the work happens completely outside the knowledge of the meta-issue. + + **Monorepo solution:** A monorepo allows our cross-layer PRs to be viewed more holistically and be reviewed as a complete change. + +1. No more sync: + + **Current criticism:** We use complex sync workflows to keep files in sync. Some workflows need to be synced to some repos only. Some workflows shouldn't even be in the repo they're synced from. Some files need subtle differences so we compile Jinja templates for them. We also sync GitHub labels and branch management rules. It's a mess (I would know!). + + **Monorepo solution:** A monorepo would eliminate all of this and save the time and effort that goes into maintaining these systems. + +1. Unified documentation: + + **Current criticism:** Having many repos, each with its own doc site, means two things. Common docs such as the contribution process need to be repeated several times, and repo-specific docs get siloed and can only reference each other with external links. Also, changing docs in one repo will break any links pointing to it. + + **Monorepo solution:** A better system would be one cohesive doc site, for which the API already has a framework that other repos can just use. + +1. Infra included: + + **Current criticism:** Our deployment workflows have code duplication. Secrets are stored in lots of repos, and we keep them synced using Terraform. Containers are used in the infra repo but published in their individual repos. + + **Monorepo solution:** A monorepo enables the infra to coexist with the code (albeit in a separate module). Apart from the (encrypted) private secrets, the IaC could be open-sourced similar to the rest of the codebase. Our deployment workflows can share code and deployment secrets. + +1. GitHub Milestones: + + Milestones are confined by repository boundaries. 
To have milestones that cover issues in different layers of our stack, the only way is for them to be in a monorepo. This is a limitation imposed by GitHub and there is no workaround for this. + +The overarching theme is that there are workarounds for everything. We have been working with split repos quite productively for over a year. My proposition is that the monorepo solutions are better than workarounds. The [integration](#step-4-integration) section in the latter part of the document describes more interesting outcomes made possible by the monorepo. They may or may not be exclusive to monorepos but they're surely made easier by it. From f2a9d8185bc1f6307e1aa37050b76a1659a89516 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Thu, 8 Dec 2022 00:57:57 +0400 Subject: [PATCH 06/12] Update implementation to include the `WordPress/openverse` repo --- rfcs/20221124-monorepo.md | 231 +++++++++++++++++++++++++------------- 1 file changed, 151 insertions(+), 80 deletions(-) diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md index 21b21fc3b57..e2e7a722ba1 100644 --- a/rfcs/20221124-monorepo.md +++ b/rfcs/20221124-monorepo.md @@ -63,7 +63,7 @@ The [integration](#step-4-integration) section in the latter part of the documen ## Migration path -First we will merge the API and the frontend. This decision was made for the following reasons. +First we will merge the API and the frontend repos into `WordPress/openverse`. This decision was made for the following reasons. 1. API and frontend are tightly linked. The frontend is a direct consumer of what the API produces. @@ -71,128 +71,193 @@ First we will merge the API and the frontend. This decision was made for the fol 1. The frontend uses ECS deployments and the API is well on the same track. This makes it possible for them to share some deployment code. -1. To the RFC author, the API and frontend are very familiar so merging them would be easier. Adding a third component would make the task daunting. +1. 
I am very familiar with the scripts repo, the API and the frontend so merging them would be easier. Adding a third component would make the task daunting. -1. Merging incurs a productivity hit for the initial transition. So merging everything in one swoop is not ideal. +1. Merging incurs a productivity hit for the initial transition. So merging everything in one swoop is not ideal. While we merge these three, effort can be diverted to the catalog. 1. The API’s comprehensive tooling for developer documentation can benefit frontend devs and create a unified docs site for contributors. -1. The API is already organised by stack folders so the `frontend/` directory will fit right in with the others like `api/` and `ingestion_server/`. +1. The merge of two JavaScript codebases provides fertile ground for testing `pnpm` workspaces. + +1. The API is already organised by stack folders so the `frontend/` directory will fit right in with the others like `api/` and `ingestion_server/`. Similarly the scripts repo is nicely organised in folders, reducing conflicts. 1. The API and frontend share identical tooling for Git hooks, linting and formatting. We will fight our tools less and encounter minimal friction. - - In fact, we employ a number of hacks to install and configure pre-commit for the frontend. Merging it with the API eliminates the need for such hacks. + - The frontend's approach for `pre-commit` inspired the RFC for expanding this type of usage to the API as well! -1. The entire system can be integration tested during releases. The real API, populated with test data, can even replace the Talkback server. +1. The entire system can be integration tested during releases. The real API, populated with test data, can even replace the Talkback server as long as we can turn off all network calls and enable 100% reliable output. -The `WordPress/openverse-api` repo will absorb the `WordPress/openverse-frontend` repo. 
The `WordPress/openverse-catalog` will also be merged, _later_. +The `WordPress/openverse` repo will absorb the `WordPress/openverse-api` and `WordPress/openverse-frontend` repos. The `WordPress/openverse-catalog` will also be merged, _later_. ### Reference -I'm following the steps listed below in a fork at [@dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse/). You can refer to the fork, but note that it is a comes from a place of haste and has not been treated with the same level of love and care that the final treatment will receive. +I'm following the steps listed below in a fork at [@dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse/) [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/). You can refer to the fork, but note that it comes from a place of haste and has not been treated with the same level of love and care that the final treatment will receive. ### Step 0: Prerequisites #### Get the timing right -The first step will be to release the frontend, call a code freeze and pause work on it. This is to prevent the frontend repo from continuing to drift as we merge a snapshot of it with the API. - -This can prove difficult given how productive our team is, so we will need to channel this productivity towards the catalog in the meantime. I can foresee the end-to-end migration taking one week (ideal scenario) to one fortnight (worst case scenario). - -### Step 1: Merge with histories - -This is a quick process. +The first step will be to release the API and frontend, call a code freeze on both of them and pause work on both. This is to prevent the repos from continuing to drift as we merge a snapshot of them into the `WordPress/openverse` repo. -1. Move the entire content of frontend inside a `frontend/` directory, except the following top-level files and folders. Please comment if you can add to this list. 
+This can prove difficult given how productive our team is, so we will need to channel this productivity towards the catalog in the meantime. I can foresee the end-to-end migration taking one week (ideal scenario) to become workable again, and another week (for us to iron out any gaps in the docs and references). - - `.github/` - - `.editorconfig` - - `justfile` - - `.pre-commit-config.yaml` - - `prettier.config.js` - - `.prettierignore` - - `.eslintrc.js` (need to update references in nested `eslintrc.js` files) - - `.eslintignore` - - `.gitignore` (better to move it into the `frontend/` directory and update some absolute paths) +Note that in the transition period nothing will break. The old repos will continue to exist as they are until we have ensured everything works, and only then will we archive the current split repos. -1. Create the final commit on the `WordPress/openverse-frontend` repo. After the merge we might want to add a notice about the migration to the `README.md` file but GitHub's built-in archival process could suffice here. +### Step 1: Merge with histories -1. Merge this repo's `main` branch into the `WordPress/openverse-api` repo's `main` branch (see Git docs for `--allow-unrelated-histories`). There will be some conflicts but they will be small and infrequent. [[implementation details](#conflict-resolution)] + +This should be quick, save for a few merge conflicts. In case of conflict, copy the code from [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/). + +Add remotes. 
+ +```bash +$ git remote add api ../openverse-api +$ git remote add frontend ../openverse-frontend +``` + +#### Merging API + +```bash +$ git pull api main --allow-unrelated-histories +``` + +| Expected conflicts | Resolution | +| ----------------------- | ----------------------------------------------------------------------------------------------------- | +| .github/CODEOWNERS | Folderwise separation of owners, use from [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/) | +| .prettierignore | Merge entries from all repos, use from [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/) | +| .pre-commit-config.yaml | Python + Node.js linting, use from [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/) | +| justfile | Split into many smaller `justfile`s and one roota | +| .gitignore | Split into many smaller `.gitignore`s and one rootb | +| CONTRIBUTING.md | Write new, referencing doc site | +| README.md | Write new, referencing doc site | + +```bash +$ git add . # Stage files with conflicts resolved +$ git commit -m "Merge branch 'main' of openverse-api" +``` + +**a:** see the following: + +- https://github.com/dhruvkb/monoverse/blob/main/justfile +- https://github.com/dhruvkb/monoverse/blob/main/automations/justfile +- https://github.com/dhruvkb/monoverse/blob/main/api/justfile +- https://github.com/dhruvkb/monoverse/blob/main/ingestion_server/justfile +- https://github.com/dhruvkb/monoverse/blob/main/load_testing/justfile + +**b:** see the following + +- https://github.com/dhruvkb/monoverse/blob/main/.gitignore +- https://github.com/dhruvkb/monoverse/blob/main/api/.gitignore +- https://github.com/dhruvkb/monoverse/blob/main/ingestion_server/.gitignore +- https://github.com/dhruvkb/monoverse/blob/main/ingestion_server/test/.gitignore (unchanged) +- https://github.com/dhruvkb/monoverse/blob/main/load_testing/.gitignore (unchanged) +- https://github.com/dhruvkb/monoverse/blob/main/nginx/.gitignore (unchanged) + +#### Merging frontend + 
+Change workdir to WordPress/openverse-frontend repo. + +```bash +$ mkdir frontend +``` + +Move everything into it except the following directories and files: + +| File / directory | Reason | +| ----------------------- | -------------------------------------- | +| .git/ | For obvious reasons | +| .github/ | Must be in the root | +| .npmrc | For `pnpm` workspaces | +| .pnpmfile.cjs | For `pnpm` workspaces | +| .editorconfig | Useful across the monorepo | +| .eslintignore | Read by ESLint running in pre-commit | +| .eslintrc.js | Read by ESLint running in pre-commit | +| .prettierignore | Read by Prettier running in pre-commit | +| prettier.config.js | Read by Prettier running in pre-commit | +| .pre-commit-config.yaml | Read by pre-commit | +| LICENSE | Root-level docs | +| README.md | Root-level docs | +| CODE_OF_CONDUCT.md | Root-level docs | +| CONTRIBUTING.md | Root-level docs | + +```bash +$ git add . # Stage all renamed files +$ git commit -m "Nest code under \`frontend/\`" +``` + +Switch to the monorepo. + +```bash +$ git pull frontend main --allow-unrelated-histories +``` + +Since we fixed these files when merging the API, almost all these conflicts are redundant. + +| Expected conflicts | Resolution | +| ----------------------- | ------------------------------- | +| .github/CODEOWNERS | Use existing | +| .prettierignore | Use existing | +| .pre-commit-config.yaml | Uncomment the commented regions | +| .gitignore | Use existing | +| prettier.config.js | Use existing | +| README.md | Use existing | +| CONTRIBUTING.md | Use existing | + +#### Housekeeping 1. Create "stack: \*" labels to help with issue and PR management. Spoiler/foreshadowing: these labels will be used for more things later. -1. Migrate issues from `WordPress/openverse-frontend` to `WordPress/openverse-api`. 
@obulat's has done prior work in this department (when we migrated from CC Search to Openverse) but that might not be as useful because in this case, we can directly transfer the issues, retaining all their comments. Apply the "stack: frontend" label to them. [[implementation details](#issue-transfer)] - -With this done, we can archive the frontend repo. - -#### Conflict resolution - -The following conflicts may occur during merge. - -- `.prettierignore`: concatenate -- `.pre-commit-config.yaml`: use from [dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse) -- Workflows can conflict but they can be renamed and kept alongside each other, _for now_. - -#### Issue transfer +1. Migrate issues from `WordPress/openverse-frontend` and `WordPress/openverse-api`. We can directly transfer the issues, retaining all their comments. Apply the "stack: frontend" / "stack: backend" label to them after moving. -As far as I can tell, issue transfer can only be performed via the GitHub GraphQL API ([docs](https://docs.github.com/en/graphql/reference/mutations#transferissue)) and not via the REST API. From my limited testing, transferred issues seem to retain labels (provided they exist in the target repo). +```bash +# Substitute repo with WordPress/openverse-frontend and WordPress/openverse-api +$ gh api \ + -X GET \ + repos/WordPress//issues \ + -f pulls=false \ + -f state=all \ + --jq '.[].number' \ + --paginate \ + | xargs \ + -n 1 \ + -I num \ + gh issue transfer \ + num \ + WordPress/openverse \ + -R WordPress/ +``` -An implementation of the GraphQL API call (albeit in Ruby) is available in `hub` and the [code for it](https://github.com/github/hub/commit/4c2e44146988dfb385a26f649298f274a5017756) is available in their GitHub repo for reference. +With this done, we can archive the API and frontend repo. An optional notice may be added to the `README.md` files for clarification before archiving. 
+ -However, instead of writing the code ourselves, we can install `hub` and run a small script that repeatedly calls `hub` to migrate each issue one by one. That'll be a hack but it's okay since this is a one-off use for this. +### Step 2. Restore functionality -### Step 2. Restore workflows +#### Combine linting -The workflows of both the API and frontend will need some refactoring to start, and pass, again. [monopenverse](https://github.com/dhruvkb/monopenverse) has updated these workflows and the following work. +All lint steps can be combined in `.pre-commit-config.yaml`. This also simplifies the CI jobs, which can now be merged. -[monopenverse](https://github.com/dhruvkb/monopenverse) showcases a `setup-env` action that sets up Node.js, Python, Just and other dependencies and can be used in every workflow. +#### `pnpm` workspace -The `ci_cd.yml` workflow from the API has been very nicely combined with the `ci.yml` workflow from the frontend. Redundant steps were eliminated. +### Step 3. Restore workflows -The following actions have been successfully combined: +#### New actions -- actionlint ✅ -- bundle_size.yml ✅ -- ci_cd.yml (API) + ci.yml (frontend) [merged] ✅ -- Playwright tests from ci.yml (frontend) ✅ -- draft_release.yml ✅ -- generate_pot.yml ✅ -- gh_pages.yml ✅ -- migration_safety_warning.yml ✅ -- subscribe_to_label.yml ✅ -- label_new_pr.yml ✅ -- pr_closed.yml ✅ -- pr_label_check.yml ✅ -- new_issues.yml ✅ -- pr_ping.yml ✅ +To clean up the workflows we will define three new actions. The code for all three is available at [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/). -The following have not been verified to work: +1. `setup-env` to set up Just, Node.js and Python or a subset of these. +1. `load-img` to download Docker images into `/tmp` and load them in Docker. +1. `build-docs` to build and merge Sphinx, Storybook and Tailwind Config Viewer. 
-- renovate.yml -- rollback.yml -- ghcr.yml -- push_docker_image.yml +#### Update workflows -With this done, the development on the frontend can continue inside the subdirectory. +With this done, the development on the API and frontend can continue inside their subdirectories. The development of both parts will be independent. At least until we reach [long-term consolidation](#step-5-long-term-consolidation). -### Step 3. Buff the rough edges +### Step 3. Housekeeping and DX cleanup There will be a few rough edges that I cannot foresee and we can continuously fix those as we spot them. But up to this point we should be in a position where we can continue to build the API and the frontend independently but from one repo. 1. The action `banyan/auto-label` will need to be configured (`auto-label.json`) to add the "stack: \*" labels based on the modified directory. -### Step 4. Integration - -This is the long term combination of code for the frontend and the API. - -#### Combined lint - -All lint steps can be combined in `.pre-commit-config.yaml`. This also simplifies the CI jobs can now be merged. - -See the combined lint in action at [monopenverse](https://github.com/dhruvkb/monopenverse). - -### Step 5. Documentation merge +### Step 4. Documentation merge The following documentation files will need reorganisation or merge. @@ -207,3 +272,9 @@ The following documentation files will need reorganisation or merge. I will need more information about this because IANAL. - LICENSE (both repos) + +### Step 5. Long-term consolidation + +This is the long term combination of code for the frontend and the API. Ideas like end-to-end testing go here. This is beyond my imagination at the moment, and more importantly, beyond the scope of this RFC. It will surely be covered in future RFCs. + +Thanks for reading and providing feedback. 
From 205ca4181c98e00439aff920e0afba808364cbc1 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Thu, 15 Dec 2022 12:47:49 +0400 Subject: [PATCH 07/12] Update RFC with more details, specifically about deployment --- rfcs/20221124-monorepo.md | 72 ++++++++++++++++++++++++++++++++++----- 1 file changed, 63 insertions(+), 9 deletions(-) diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md index e2e7a722ba1..b1b05b3fc2c 100644 --- a/rfcs/20221124-monorepo.md +++ b/rfcs/20221124-monorepo.md @@ -2,8 +2,6 @@ # RFC: Monorepo -**Status:** 🚧 WIP, comments are welcome nonetheless - ## Reviewers - [ ] @zackkrida @@ -11,9 +9,7 @@ ## Rationale -For a comprehensive discussion about the pros, the cons and the counterpoints to each see [discussion](https://github.com/WordPress/openverse/issues/192). This is not the purpose of this RFC. - -This RFC summarily lists the benefits and then, with the twin assumptions of a monorepo being ultimately beneficial and the decision to migrate being finalised in the above discussion, proceeds to go into the implementation details. +For a comprehensive discussion about the pros, the cons and the counterpoints to each see [discussion](https://github.com/WordPress/openverse/issues/192). Some of the more nuanced points are listed below, biased towards the overall benefits of a monorepo to justify the RFC. This RFC also proceeds to go into the implementation details hoping that the benefits are cumulatively enough of an improvement to convince everyone to migrate. ### Benefits of monorepo @@ -79,13 +75,17 @@ First we will merge the API and the frontend repos into `WordPress/openverse`. T 1. The merge of two JavaScript codebases provides fertile ground for testing `pnpm` workspaces. + - It also allows us to merge the browser extension later and split the design system/component library stuff into a separate package. + 1. 
The API is already organised by stack folders so the `frontend/` directory will fit right in with the others like `api/` and `ingestion_server/`. Similarly the scripts repo is nicely organised in folders, reducing conflicts. 1. The API and frontend share identical tooling for Git hooks, linting and formatting. We will fight our tools less and encounter minimal friction. - - The frontend's approach for `pre-commit` inspired the RFC for expaning this type of usage to the API as well! + - The frontend's approach for `pre-commit` expanded this type of usage to the API as well! + + - We're expanding the use of double-quoted strings to JavaScript to further unify our style guides. -1. The entire system can be integration tested during releases. The real API, populated with test data, can even replace the Talkback server as long as we can turn off all network calls and enable 100% reliable output. +1. The entire system can be integration tested during releases. The real API, populated with test data, can replace the Talkback server as long as we disable network calls and make output deterministic. The `WordPress/openverse` repo will absorb the `WordPress/openverse-api` and `WordPress/openverse-frontend` repos. The `WordPress/openverse-catalog` will also be merged, _later_. @@ -101,6 +101,16 @@ The first step will be to release the API and frontend, call a code freeze on bo This can prove difficult given how productive our team is, so we will need to channel this productivity towards the catalog in the meantime. I can foresee the end-to-end migration taking one week (ideal scenario) to becoming workable again, and another week (for us to iron out any gaps in the docs and references). 
+##### Timeline breakdown + +- Day 1: Merging the repos and resolving conflicts, restoring broken workflows except deploys +- Day 2: Restoring deployment workflows including staging auto-deploy +- Day 3: Transfer of issues from individual repos to monorepo +- Day 4: Documentation fixes +- Day 5: Housekeeping + +The second week is planned as a buffer in case any of these tasks ends up taking more time than a day, if something breaks or if someone falls ill etc. The ideal scenario is that we're completely back next week, the worst one takes two. + Note that in the transition period nothing will break. The old repos will continue to exists as they are, till we ensure everything works and then we archive the current split repos. ### Step 1: Merge with histories @@ -232,10 +242,32 @@ With this done, we can archive the API and frontend repo. An optional notice may #### Combine linting -All lint steps can be combined in `.pre-commit-config.yaml`. This also simplifies the CI jobs can now be merged. +All lint steps can be combined in `.pre-commit-config.yaml`. This also implies the CI jobs can now be merged. + +1. Remove pre-commit scripts from `frontend/package.json` and the `install-pre-commit.sh` script. + +1. Remove `lint` job from CI, there are plenty of those. #### `pnpm` workspace +1. Move `frontend/.pnpmfile.cjs` outside to the root directory, update the reference to `frontend/package.json`. + +1. Remove `frontend/.npmrc` created earlier because `pnpm` will automatically use the one in the root of the workspace. + +1. Create `pnpm-workspaces.yaml` file, see https://github.com/dhruvkb/monoverse/blob/main/.pre-commit-config.yaml. + +1. Update the `package.json` files, see the following: + + - https://github.com/dhruvkb/monoverse/blob/main/package.json + - https://github.com/dhruvkb/monoverse/blob/main/frontend/package.json + - https://github.com/dhruvkb/monoverse/blob/main/automations/js/package.json + +1. `pnpm i` in the monorepo root. + +1. 
Update the recipe `pnpm` in `frontend/justfile` to include `--filter "openverse-frontend"`. + +1. `git commit -m "Setup workspaces"` + ### Step 3. Restore workflows #### New actions @@ -248,7 +280,29 @@ To clean up the workflows we will define three new actions. The code for all thr #### Update workflows -With this done, the development on the API and frontend can continue inside their subdirectories. The development of both parts will be independent. At least until we reach [long-term consolidation](#step-5-long-term-consolidation). +Workflows with major changes: + +- `ci_cd.yml` from the API will absorb `ci.yml` from the frontend +- `lint.yml` will be deleted + +Updates: + +- `migration_safety_warning.yml` +- `generate_pot.yml` + +With this done, the development on the API and frontend can continue inside their subdirectories. The development of both parts will be independent, at least until we reach [long-term consolidation](#step-5-long-term-consolidation). + +#### Deployment + +##### Staging + +The soon-to-be-ECS-based API and ECS-based frontend will continue to deploy to staging via the CI + CD pipeline, with deployment starting as soon as all CI checks have passed. They will use similar code as the frontend auto-deploy for staging used currently. + +These will be separate jobs with specific [path-based filters](https://github.com/dorny/paths-filter). + +##### Production + +For production, we will not be able to use GitHub Releases and will have to use a manually-triggered workflow to build the assets and tag them. The tag can be set via a workflow input (simple) or can be calculated based on the update scope of major, minor or patch (not as simple). ### Step 3. 
Housekeeping and DX cleanup From 53fe443c13ae75938124f8cfd4f857c464a836f4 Mon Sep 17 00:00:00 2001 From: Dhruv Bhanushali Date: Thu, 15 Dec 2022 13:14:43 +0400 Subject: [PATCH 08/12] Format file as per Prettier --- rfcs/20221124-monorepo.md | 228 ++++++++++++++++++++++++++++---------- 1 file changed, 170 insertions(+), 58 deletions(-) diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md index b1b05b3fc2c..060207b6a03 100644 --- a/rfcs/20221124-monorepo.md +++ b/rfcs/20221124-monorepo.md @@ -9,113 +9,194 @@ ## Rationale -For a comprehensive discussion about the pros, the cons and the counterpoints to each see [discussion](https://github.com/WordPress/openverse/issues/192). Some of the more nuanced points are listed below, biased towards the overall benefits of a monorepo to justify the RFC. This RFC also proceeds to go into the implementation details hoping that the benefits are cumulatively enough of an improvement to convince everyone to migrate. +For a comprehensive discussion about the pros, the cons and the counterpoints to +each see [discussion](https://github.com/WordPress/openverse/issues/192). Some +of the more nuanced points are listed below, biased towards the overall benefits +of a monorepo to justify the RFC. This RFC also proceeds to go into the +implementation details hoping that the benefits are cumulatively enough of an +improvement to convince everyone to migrate. ### Benefits of monorepo 1. Single place for everything: - **Current criticism:** We currently have many repos, and issues and PRs spanning all of them. While this makes it easier for us as maintainers to focus our efforts, it's not easy for contributors. Let's say you were a new contributor looking for good-first Python issues. We shouldn't expect them to search in 3 repos `openverse`, `openverse-api` and `openverse-catalog`. 
We address this by making tools like Overvue or using search terms like this: + **Current criticism:** We currently have many repos, and issues and PRs + spanning all of them. While this makes it easier for us as maintainers to + focus our efforts, it's not easy for contributors. Let's say you were a new + contributor looking for good-first Python issues. We shouldn't expect them to + search in 3 repos `openverse`, `openverse-api` and `openverse-catalog`. We + address this by making tools like Overvue or using search terms like this: ``` is:open is:issue repo:WordPress/openverse-catalog repo:WordPress/openverse repo:WordPress/openverse-api repo:WordPress/openverse-frontend repo:WordPress/openverse-infrastructure sort:updated-desc ``` - The search term above perfectly illustrates the problem: we forgot about the extension. It's unwieldy and hard to quickly reach and share. + The search term above perfectly illustrates the problem: we forgot about the + extension. It's unwieldy and hard to quickly reach and share. - **Monorepo solution:** We could use GitHub's own filters to narrow down what we're looking for. + **Monorepo solution:** We could use GitHub's own filters to narrow down what + we're looking for. 1. Meta-issues: - **Current criticism:** If an issue spans more than a single layer of the stack, we need to open a meta issue in `WordPress/openverse`, open sub-issues in each of the different repos, then manually close meta issues after the sub-issues are closed. Same goes for PRs. We make individual PRs for every layer and then have to cross-reference them so that reviewers can see the full picture. Meta issues are good when a work needs to split into subtasks, but they are not good for cross-repo work splitting, especially when the work happens completely outside the knowledge of the meta-issue. 
+ **Current criticism:** If an issue spans more than a single layer of the + stack, we need to open a meta issue in `WordPress/openverse`, open sub-issues + in each of the different repos, then manually close meta issues after the + sub-issues are closed. Same goes for PRs. We make individual PRs for every + layer and then have to cross-reference them so that reviewers can see the + full picture. Meta issues are good when a work needs to split into subtasks, + but they are not good for cross-repo work splitting, especially when the work + happens completely outside the knowledge of the meta-issue. - **Monorepo solution:** A monorepo allows our cross-layer PRs to be viewed more holistically and be reviewed as a complete change. + **Monorepo solution:** A monorepo allows our cross-layer PRs to be viewed + more holistically and be reviewed as a complete change. 1. No more sync: - Current criticism: We use complex sync workflows to keep files in sync. Some workflows need to by synced to some repos only. Some workflows shouldn't even be in the repo they're synced from. Some files need subtle differences so we compile Jinja templates for them. We also sync GitHub labels and branch management rules. It's a mess (I would know!). + Current criticism: We use complex sync workflows to keep files in sync. Some + workflows need to be synced to some repos only. Some workflows shouldn't even + be in the repo they're synced from. Some files need subtle differences so we + compile Jinja templates for them. We also sync GitHub labels and branch + management rules. It's a mess (I would know!). - **Monorepo solution:** A monorepo would just eliminates all of this and saves the time and effort that goes into maintaining these systems. + **Monorepo solution:** A monorepo would just eliminates all of this and saves + the time and effort that goes into maintaining these systems. 1. 
Unified documentation: - **Current criticism:** Having many repos, each with its own doc site means two things. Common docs such as contribution process needs to be repeated several times and repo-specific docs get siloed and can only reference each other with external links. Also changing docs in one repo will break any links pointing to it. + **Current criticism:** Having many repos, each with its own doc site means + two things. Common docs such as contribution process needs to be repeated + several times and repo-specific docs get siloed and can only reference each + other with external links. Also changing docs in one repo will break any + links pointing to it. - **Monorepo solution:** A better system would be one cohesive doc site, for which the API already has a framework that other repos can just use. + **Monorepo solution:** A better system would be one cohesive doc site, for + which the API already has a framework that other repos can just use. 1. Infra included: - **Current criticism:** Our deployment workflows have code duplication. Secrets are stored in lots of repos, we keep secrets synced using Terraform. Containers are used in the infra repo but published in their individual repos. + **Current criticism:** Our deployment workflows have code duplication. + Secrets are stored in lots of repos, we keep secrets synced using Terraform. + Containers are used in the infra repo but published in their individual + repos. - **Monorepo solution:** Monorepo enables the infra to coexist with the code (albeit in a separate module). Apart from the (encrypted) private secrets, the IaC could be open-sourced similar to the rest of the codebase. Our deployment workflows can share code and deployment secrets. + **Monorepo solution:** Monorepo enables the infra to coexist with the code + (albeit in a separate module). Apart from the (encrypted) private secrets, + the IaC could be open-sourced similar to the rest of the codebase. 
Our + deployment workflows can share code and deployment secrets. 1. GitHub Milestones: - Milestones are confined by repository boundaries. To have milestones that cover issues in different layers of our stack, the only way is for them to be in a monorepo. This is a limitation imposed by GitHub and there is no workaround for this. + Milestones are confined by repository boundaries. To have milestones that + cover issues in different layers of our stack, the only way is for them to be + in a monorepo. This is a limitation imposed by GitHub and there is no + workaround for this. -The overarching theme is that there are workarounds for everything. We have been working with split repos quite productively for over a year. My proposition is the the monorepo solutions are better than workarounds. +The overarching theme is that there are workarounds for everything. We have been +working with split repos quite productively for over a year. My proposition is +that the monorepo solutions are better than workarounds. -The [integration](#step-4-integration) section in the latter part of the document describes more interesting outcomes made possible by the monorepo. They may or may not be exclusive to monorepos but they're surely made easier by it. +The [long-term consolidation](#step-5-long-term-consolidation) section in the +latter part of the document describes more interesting outcomes made possible by +the monorepo. They may or may not be exclusive to monorepos but they're surely +made easier by it. ## Migration path -First we will merge the API and the frontend repos into `WordPress/openverse`. This decision was made for the following reasons. +First we will merge the API and the frontend repos into `WordPress/openverse`. +This decision was made for the following reasons. -1. API and frontend are tightly linked. The frontend is a direct consumer of what the API produces. +1. API and frontend are tightly linked. The frontend is a direct consumer of + what the API produces. -1. 
The API and frontend form the "service" side of Openverse that directly faces the users (both API consumers and Search engine users). +1. The API and frontend form the "service" side of Openverse that directly faces + the users (both API consumers and Search engine users). -1. The frontend uses ECS deployments and the API is well on the same track. This makes it possible for them to share some deployment code. +1. The frontend uses ECS deployments and the API is well on the same track. This + makes it possible for them to share some deployment code. -1. I am very familiar with the scripts repo, the API and the frontend so merging them would be easier. Adding a third component would make the task daunting. +1. I am very familiar with the scripts repo, the API and the frontend so merging + them would be easier. Adding a third component would make the task daunting. -1. Merging incurs a productivity hit for the initial transition. So merging everything in one swoop is not ideal. While we merge these three, effort can be diverted to the catalog. +1. Merging incurs a productivity hit for the initial transition. So merging + everything in one swoop is not ideal. While we merge these three, effort can + be diverted to the catalog. -1. The API’s comprehensive tooling for developer documentation can benefit frontend devs and create a unified docs site for contributors. +1. The API’s comprehensive tooling for developer documentation can benefit + frontend devs and create a unified docs site for contributors. -1. The merge of two JavaScript codebases provides fertile ground for testing `pnpm` workspaces. +1. The merge of two JavaScript codebases provides fertile ground for testing + `pnpm` workspaces. - - It also allows us to merge the browser extension later and split the design system/component library stuff into a separate package. + - It also allows us to merge the browser extension later and split the design + system/component library stuff into a separate package. -1. 
The API is already organised by stack folders so the `frontend/` directory will fit right in with the others like `api/` and `ingestion_server/`. Similarly the scripts repo is nicely organised in folders, reducing conflicts. +1. The API is already organised by stack folders so the `frontend/` directory + will fit right in with the others like `api/` and `ingestion_server/`. + Similarly the scripts repo is nicely organised in folders, reducing + conflicts. -1. The API and frontend share identical tooling for Git hooks, linting and formatting. We will fight our tools less and encounter minimal friction. +1. The API and frontend share identical tooling for Git hooks, linting and + formatting. We will fight our tools less and encounter minimal friction. - - The frontend's approach for `pre-commit` expanded this type of usage to the API as well! + - The frontend's approach for `pre-commit` expanded this type of usage to the + API as well! - - We're expanding the use of double-quoted strings to JavaScript to further unify our style guides. + - We're expanding the use of double-quoted strings to JavaScript to further + unify our style guides. -1. The entire system can be integration tested during releases. The real API, populated with test data, can replace the Talkback server as long as we disable network calls and make output deterministic. +1. The entire system can be integration tested during releases. The real API, + populated with test data, can replace the Talkback server as long as we + disable network calls and make output deterministic. -The `WordPress/openverse` repo will absorb the `WordPress/openverse-api` and `WordPress/openverse-frontend` repos. The `WordPress/openverse-catalog` will also be merged, _later_. +The `WordPress/openverse` repo will absorb the `WordPress/openverse-api` and +`WordPress/openverse-frontend` repos. The `WordPress/openverse-catalog` will +also be merged, _later_. 
### Reference -I'm following the steps listed below in a fork at [@dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse/) [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/). You can refer to the fork, but note that it is a comes from a place of haste and has not been treated with the same level of love and care that the final treatment will receive. +I'm following the steps listed below in a fork at +[@dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse/) +[@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/). You can refer to +the fork, but note that it comes from a place of haste and has not been +treated with the same level of love and care that the final treatment will +receive. ### Step 0: Prerequisites #### Get the timing right -The first step will be to release the API and frontend, call a code freeze on both of them and pause work on both. This is to prevent the repos from continuing to drift as we merge a snapshot of them into the `WordPress/openverse` repo. +The first step will be to release the API and frontend, call a code freeze on +both of them and pause work on both. This is to prevent the repos from +continuing to drift as we merge a snapshot of them into the +`WordPress/openverse` repo. -This can prove difficult given how productive our team is, so we will need to channel this productivity towards the catalog in the meantime. I can foresee the end-to-end migration taking one week (ideal scenario) to becoming workable again, and another week (for us to iron out any gaps in the docs and references). +This can prove difficult given how productive our team is, so we will need to +channel this productivity towards the catalog in the meantime. I can foresee the +end-to-end migration taking one week (ideal scenario) to become workable +again, and another week (for us to iron out any gaps in the docs and +references). 
##### Timeline breakdown -- Day 1: Merging the repos and resolving conflicts, restoring broken workflows except deploys +- Day 1: Merging the repos and resolving conflicts, restoring broken workflows + except deploys - Day 2: Restoring deployment workflows including staging auto-deploy - Day 3: Transfer of issues from individual repos to monorepo - Day 4: Documentation fixes - Day 5: Housekeeping -The second week is planned as a buffer in case any of these tasks ends up taking more time than a day, if something breaks or if someone falls ill etc. The ideal scenario is that we're completely back next week, the worst one takes two. +The second week is planned as a buffer in case any of these tasks ends up taking +more time than a day, if something breaks or if someone falls ill etc. The ideal +scenario is that we're completely back next week, the worst one takes two. -Note that in the transition period nothing will break. The old repos will continue to exists as they are, till we ensure everything works and then we archive the current split repos. +Note that in the transition period nothing will break. The old repos will +continue to exist as they are, till we ensure everything works and then we +archive the current split repos. ### Step 1: Merge with histories -This should be quick save for a few merge conflicts. In case of conflict copy the code from [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/). +This should be quick save for a few merge conflicts. In case of conflict copy +the code from [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/). Add remotes. 
@@ -158,8 +239,10 @@ $ git commit -m "Merge branch 'main' of openverse-api" - https://github.com/dhruvkb/monoverse/blob/main/.gitignore - https://github.com/dhruvkb/monoverse/blob/main/api/.gitignore - https://github.com/dhruvkb/monoverse/blob/main/ingestion_server/.gitignore -- https://github.com/dhruvkb/monoverse/blob/main/ingestion_server/test/.gitignore (unchanged) -- https://github.com/dhruvkb/monoverse/blob/main/load_testing/.gitignore (unchanged) +- https://github.com/dhruvkb/monoverse/blob/main/ingestion_server/test/.gitignore + (unchanged) +- https://github.com/dhruvkb/monoverse/blob/main/load_testing/.gitignore + (unchanged) - https://github.com/dhruvkb/monoverse/blob/main/nginx/.gitignore (unchanged) #### Merging frontend @@ -200,7 +283,8 @@ Switch to the monorepo. $ git pull frontend main --allow-unrelated-histories ``` -Since we fixed these files when merging the API, almost all these conflicts are redundant. +Since we fixed these files when merging the API, almost all these conflicts are +redundant. | Expected conflicts | Resolution | | ----------------------- | ------------------------------- | @@ -214,9 +298,13 @@ Since we fixed these files when merging the API, almost all these conflicts are #### Housekeeping -1. Create "stack: \*" labels to help with issue and PR management. Spoiler/foreshadowing: these labels will be used for more things later. +1. Create "stack: \*" labels to help with issue and PR management. + Spoiler/foreshadowing: these labels will be used for more things later. -1. Migrate issues from `WordPress/openverse-frontend` and `WordPress/openverse-api`. We can directly transfer the issues, retaining all their comments. Apply the "stack: frontend" / "stack: backend" label to them after moving. +1. Migrate issues from `WordPress/openverse-frontend` and + `WordPress/openverse-api`. We can directly transfer the issues, retaining all + their comments. Apply the "stack: frontend" / "stack: backend" label to them + after moving. 
```bash # Substitute repo with WordPress/openverse-frontend and WordPress/openverse-api @@ -236,25 +324,31 @@ $ gh api \ -R WordPress/ ``` -With this done, we can archive the API and frontend repo. An optional notice may be added to the `README.md` files for clarification before archiving. +With this done, we can archive the API and frontend repo. An optional notice may +be added to the `README.md` files for clarification before archiving. ### Step 1. Restore functionality #### Combine linting -All lint steps can be combined in `.pre-commit-config.yaml`. This also implies the CI jobs can now be merged. +All lint steps can be combined in `.pre-commit-config.yaml`. This also implies +the CI jobs can now be merged. -1. Remove pre-commit scripts from `frontend/package.json` and the `install-pre-commit.sh` script. +1. Remove pre-commit scripts from `frontend/package.json` and the + `install-pre-commit.sh` script. 1. Remove `lint` job from CI, there are plenty of those. #### `pnpm` workspace -1. Move `frontend/.pnpmfile.cjs` outside to the root directory, update the reference to `frontend/package.json`. +1. Move `frontend/.pnpmfile.cjs` outside to the root directory, update the + reference to `frontend/package.json`. -1. Remove `frontend/.npmrc` created earlier because `pnpm` will automatically use the one in the root of the workspace. +1. Remove `frontend/.npmrc` created earlier because `pnpm` will automatically + use the one in the root of the workspace. -1. Create `pnpm-workspaces.yaml` file, see https://github.com/dhruvkb/monoverse/blob/main/.pre-commit-config.yaml. +1. Create `pnpm-workspace.yaml` file, see + https://github.com/dhruvkb/monoverse/blob/main/.pre-commit-config.yaml. 1. Update the `package.json` files, see the following: @@ -264,7 +358,8 @@ All lint steps can be combined in `.pre-commit-config.yaml`. This also implies t 1. `pnpm i` in the monorepo root. -1. Update the recipe `pnpm` in `frontend/justfile` to include `--filter "openverse-frontend"`. +1. 
Update the recipe `pnpm` in `frontend/justfile` to include + `--filter "openverse-frontend"`. 1. `git commit -m "Setup workspaces"` @@ -272,7 +367,9 @@ All lint steps can be combined in `.pre-commit-config.yaml`. This also implies t #### New actions -To clean up the workflows we will define three new actions. The code for all three is available at [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/). +To clean up the workflows we will define three new actions. The code for all +three is available at +[@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/). 1. `setup-env` to setup Just, Node.js and Python or a subset of these. 1. `load-img` to download Docker images into `/tmp` and load them in Docker. @@ -290,26 +387,38 @@ Updates: - `migration_safety_warning.yml` - `generate_pot.yml` -With this done, the development on the API and frontend can continue inside their subdirectories. The development of both parts will be independent, at least until we reach [long-term consolidation](#step-5-long-term-consolidation). +With this done, the development on the API and frontend can continue inside +their subdirectories. The development of both parts will be independent, at +least until we reach [long-term consolidation](#step-5-long-term-consolidation). #### Deployment ##### Staging -The soon-to-be-ECS-based API and ECS-based frontend will continue to deploy to staging via the CI + CD pipeline, with deployment starting as soon as all CI checks have passed. They will use similar code as the frontend auto-deploy for staging used currently. +The soon-to-be-ECS-based API and ECS-based frontend will continue to deploy to +staging via the CI + CD pipeline, with deployment starting as soon as all CI +checks have passed. They will use similar code as the frontend auto-deploy for +staging used currently. -These will be separate jobs with specific [path-based filters](https://github.com/dorny/paths-filter). 
+These will be separate jobs with specific
+[path-based filters](https://github.com/dorny/paths-filter).

 ##### Production

-For production, we will not be able to use GitHub Releases and will have to use a manually-triggered workflow to build the assets and tag them. The tag can be set via a workflow input (simple) or can be calculated based on the update scope of major, minor or patch (not as simple).
+For production, we will not be able to use GitHub Releases and will have to use
+a manually-triggered workflow to build the assets and tag them. The tag can be
+set via a workflow input (simple) or can be calculated based on the update scope
+of major, minor or patch (not as simple).

 ### Step 3. Housekeeping and DX cleanup

-There will be a few rough edges that I cannot foresee and we can continuously fix those as we spot them. But up to this point we should be in a position where
-we can continue to build the API and the frontend independently but from one repo.
+There will be a few rough edges that I cannot foresee and we can continuously
+fix those as we spot them. But up to this point we should be in a position where
+we can continue to build the API and the frontend independently but from one
+repo.

-1. The action `banyan/auto-label` will need to be configured (`auto-label.json`) to add the "stack: \*" labels based on the modified directory.
+1. The action `banyan/auto-label` will need to be configured (`auto-label.json`)
+   to add the "stack: \*" labels based on the modified directory.

 ### Step 4. Documentation merge

@@ -329,6 +438,9 @@ I will need more information about this because IANAL.

 ### Step 5. Long-term consolidation

-This is the long term combination of code for the frontend and the API. Ideas like end-to-end testing go here. This is beyond my imagination at the moment, and more importantly, beyond the scope of this RFC. It will surely be covered in future RFCs.
+This is the long term combination of code for the frontend and the API. Ideas
+like end-to-end testing go here. This is beyond my imagination at the moment,
+and more importantly, beyond the scope of this RFC. It will surely be covered in
+future RFCs.

 Thanks for reading and providing feedback.

From 5763d16a37024a39e28da2fee2e130fe518dccc2 Mon Sep 17 00:00:00 2001
From: Dhruv Bhanushali
Date: Fri, 16 Dec 2022 12:30:25 +0400
Subject: [PATCH 09/12] Update the instruction about copying

---
 rfcs/20221124-monorepo.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md
index 060207b6a03..9c60bda5186 100644
--- a/rfcs/20221124-monorepo.md
+++ b/rfcs/20221124-monorepo.md
@@ -195,8 +195,12 @@ archive the current split repos.

 ### Step 1: Merge with histories

-This should be quick save for a few merge conflicts. In case of conflict copy
-the code from [@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/).
+This should be quick save for a few merge conflicts. In case of conflict refer
+to the resolution adopted by
+[@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/), and apply the
+resolution to the files. Do not blindly copy code from
+[@dhruvkb/monoverse](https://github.com/dhruvkb/monoverse/) as it might be out
+of sync with the current state of the files.

 Add remotes.

From f88693ac002ce8db53435275c04aa927f00bb125 Mon Sep 17 00:00:00 2001
From: Dhruv Bhanushali
Date: Fri, 16 Dec 2022 12:37:01 +0400
Subject: [PATCH 10/12] Add the migration notice for the repos

Co-authored-by: Zack Krida
---
 rfcs/20221124-monorepo.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md
index 9c60bda5186..0d94042eebb 100644
--- a/rfcs/20221124-monorepo.md
+++ b/rfcs/20221124-monorepo.md
@@ -328,8 +328,14 @@ $ gh api \
   -R WordPress/
 ```

-With this done, we can archive the API and frontend repo. An optional notice may
-be added to the `README.md` files for clarification before archiving.
+With this done, we can archive the API and frontend repo. The following notice
+will be added to the `README.md` files for clarification before archiving.
+
+> **Note**
+>
+> This repository has moved to
+> [WordPress/openverse](https://github.com/wordpress/openverse) as part of a
+> monorepo.

 ### Step 1. Restore functionality

From 91e056b86f9125319c37f58b153204abfeaeca98 Mon Sep 17 00:00:00 2001
From: Dhruv Bhanushali
Date: Fri, 16 Dec 2022 12:40:25 +0400
Subject: [PATCH 11/12] Fix bad grammar

Co-authored-by: Zack Krida
---
 rfcs/20221124-monorepo.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md
index 0d94042eebb..aa7f37e0fb3 100644
--- a/rfcs/20221124-monorepo.md
+++ b/rfcs/20221124-monorepo.md
@@ -59,7 +59,7 @@ improvement to convince everyone to migrate.
    compile Jinja templates for them. We also sync GitHub labels and branch
    management rules. It's a mess (I would know!).

-   **Monorepo solution:** A monorepo would just eliminates all of this and saves
+   **Monorepo solution:** A monorepo eliminates all of this and saves
    the time and effort that goes into maintaining these systems.

 1. Unified documentation:

From 3b429c3ce2aede8e34c25ee5ff525e3f8f620c66 Mon Sep 17 00:00:00 2001
From: Dhruv Bhanushali
Date: Fri, 16 Dec 2022 12:44:02 +0400
Subject: [PATCH 12/12] Fix linter violations

---
 rfcs/20221124-monorepo.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/rfcs/20221124-monorepo.md b/rfcs/20221124-monorepo.md
index aa7f37e0fb3..5e85a2ba050 100644
--- a/rfcs/20221124-monorepo.md
+++ b/rfcs/20221124-monorepo.md
@@ -59,8 +59,8 @@ improvement to convince everyone to migrate.
    compile Jinja templates for them. We also sync GitHub labels and branch
    management rules. It's a mess (I would know!).

-   **Monorepo solution:** A monorepo eliminates all of this and saves
-   the time and effort that goes into maintaining these systems.
+   **Monorepo solution:** A monorepo eliminates all of this and saves the time
+   and effort that goes into maintaining these systems.

 1. Unified documentation: