
RFC: Monorepo #336 — merged (13 commits) Feb 20, 2023
179 changes: 179 additions & 0 deletions rfcs/20221124-monorepo.md

# RFC: Monorepo

**Status:** 🚧 WIP, comments are welcome nonetheless
**Member:**

Is this still accurate? Or can we consider this complete? If it's ready, I would suggest we add a completion date by which review should be done, both here and in the PR description. Moving forward we'll use a two-week deadline. This has already been open for 12-ish days, as I recall, but some folks are starting to share concerns, so I would suggest December 14th as a due date, allowing sufficient time for additional feedback.

**Member Author:**

This is still accurate; I am working on two things.


## Reviewers

- [ ] @zackkrida
- [ ] @sarayourfriend

## Rationale

For a comprehensive discussion of the pros, the cons, and the counterpoints to each, see the [discussion](https://github.com/WordPress/openverse/issues/192); rehashing it is not the purpose of this RFC.

This RFC summarily lists the benefits and then, with the twin assumptions of a monorepo being ultimately beneficial and the decision to migrate being finalised in the above discussion, proceeds to go into the implementation details.
**Member:**

There must have been a misunderstanding here or somewhere else, as the previous conversation only stated there was interest in continuing to explore the possibilities an Openverse monorepo will bring to the table, and an RFC would help to make the decision.

**Member Author:**

Right, I am sorry for assuming that we had a unanimous agreement. I have expanded the benefits section in 6ef0168 to present a stronger argument.


### Exclusive benefits of monorepo

This only includes things that cannot be accomplished without the use of a monorepo.

1. Single place to go for issues, PRs and all activity. Currently tickets are scattered across several repos, and any tickets that could benefit more than a single layer must be opened in each of the different repos.
**Member:**

Not something exclusive to a monorepo: you can already view all the issues of Openverse repositories aggregated on a single page, including the private infrastructure repo.

**Member (@zackkrida, Dec 6, 2022):**

That's a great point. There are some limitations to the GitHub link you shared:

  • It doesn't allow for bulk actions like the individual repository view does: [screenshot: CleanShot 2022-12-06 at 12 07 22]
  • I use the GitHub CLI quite often, and running `gh issue list` in the monorepo is a nice way to see all open issues; I currently have to run this in multiple repos.
  • While a minor point, the link is not part of the GitHub UI; it's something that needs to be bookmarked or remembered, which might be less useful for new contributors.

**Member:**

  • What kind of bulk actions do you tend to do, or would like to be able to do? I'm trying to imagine the use for mixed issues/PRs.
  • I think the issue with the gh CLI can probably be solved with just commands on the openverse repo, which would require some configuration of course, but do you often need to see all the open issues? For development one typically only needs to focus on a well-defined subset.
  • It's part of the GitHub UI in the sense that the link is always in the GH header; you only need to add the filters for the repos. I don't think new contributors need to look at all the issues to start either; approaching one individual level of the stack is more manageable.

**Member:**

@krysal I don't disagree with your comments. There are potential solutions and workarounds for many of the things that a monorepo offers. I don't think the benefits of a monorepo need to be things that only a monorepo can do, but things that a monorepo might do a bit better also need to be considered.

Regarding contributors looking at lists of issues, I don't think they need to see all issues, either. But consider the ways users might look at the issues page. Speaking for myself, when I hear about a new project I'd like to contribute to I tend to:

  • Click on the 'issues' tab in the GitHub UI to look at the top 5-10 issues and see what's actively being discussed, worked on, and engaged with in a project
  • Sort by "most commented" to see what the biggest or most popular discussions are

I think it's fundamentally simpler for a contributor who wants to look at Openverse issues to go to one repo and click on the "issues" tab in the UI, rather than type filters or click on a link which was found externally.

**Collaborator:**

> There are potential solutions and workarounds for many of the things that a monorepo offers

For me it isn't just this, but we should also be clear that there are trade-offs to using a monorepo, both long term and temporary. To me it's important that we agree that any benefits we think we'll get from it are worth the trade offs. That's why the conversation cannot just be "what are the benefits", it must be "what are the benefits and why are they worth making the change rather than working with what we have now". Monorepos can make a lot of meta infrastructural things more complex. The ci_cd workflow in the API repository is already complicated. A lot of actions will now run that don't need to, like Playwright for changes to the API. Playwright takes a long time to run. Is that a trade off we want? If we don't, we'll need to make a files filter, but that can only be applied to an entire workflow, so now we have to have separate workflows for different aspects of the projects, which is something we intentionally worked away from at one point. Is that something we are willing to accept?

A change like this is not just pure, uncomplicated benefits. It's not that we can do everything we're doing now but easier. There will be trade offs. Even just a bigger repository to check out can be an issue, more history to dig through, more complicated releases, etc.

**Member (@zackkrida, Dec 8, 2022):**

100% agree. The main thing I wanted was to express that the benefits, and downsides, needn't be things that are exclusive to monorepos, things that only monorepos can offer or detract from, but things that they can make easier or harder.

That the important part isn't saying "a monorepo can/can't do this", rather "a monorepo can/can't do this better/worse".

We should definitely discuss tradeoffs and downsides more explicitly here.


1. Singular copy (different from synced independent copies) of scaffolding code such as Git hooks, lint rules and common workflows. This is distinctly better than elaborate sync workflows.

1. Central place for all technical documentation, enabling documentation for different parts of the stack to cross-reference other pieces and stay current with changes in other places.
**Member:**

What is wrong with old normal links? We can already cross-reference info if it's necessary.

**Member Author:**

Hyperlinks can break when documentation is reorganised or reworded. Having references to the changed docs being in the same repo makes it easy for the PR author to update it as a part of the same PR and prevent the links from rotting or breaking.

Sphinx explains it better than I can.

**Contributor:**

This has been a pain point for documenting our data models and ingestion/data refresh processes. IMO that's one concrete place where documentation (and the process of generating/updating that documentation) could improve in a monorepo.


1. Enables the infrastructure that deploys the code to coexist with the code itself. Apart from the private secrets, which will still need to be encrypted, the IaC files could be organised identically to the code.
**Collaborator:**

> the IaC files could be organised identical to the code.

Just a nit but this isn't entirely true. We could maybe spread some Terraform modules out this way but it would be a pain to have so many places to search for modules. If the Terraform code will be merged into the monorepo as well then it'd be better off assuming that it will live, in totality, in a separate directory.

**Member Author:**

Ah, my lack of knowledge in the IaC space is really showing here. I would appreciate if you could add any infra issues (especially any dealbreakers) and any infra benefits that you see popping up due to this merge. Thanks!

**Collaborator:**

From an infra perspective I don't think there are any benefits, really. The only thing I could see is if we did this issue: https://github.com/WordPress/openverse-infrastructure/issues/220. The benefit of that issue would be a more easily reusable workflow with better variables in the deployment workflows we have now. Basically everything that is used to define the deployment workflows is variables that we could theoretically move into GitHub, though it's just more manual management of those details. We could try to sync them other ways, like with a GitHub secret containing a JSON blob that gets passed through, so the workflow call is something like:

- uses: ./actions/deploy.yml
  with:
    config: ${{ secrets.NUXT_DEPLOYMENT_CONFIG }}

And then the workflow parses the JSON to get the configuration.

Note: For folks without access to the infrastructure repository, the issue above describes switching to a callable workflow instead of a composite action. I've found composite actions a little more tedious, so I avoided trying to make the current deployment workflows super modular. I think a callable workflow would be easier to work with for a lot of reasons and could make a more modular/generic single-deployment workflow file a little more realistic.

We could do all of this in our current setup, but a monorepo would end up with fewer places to sync the secrets to and fewer places to put the deployment action.

Actually, that reminds me, if the infrastructure is in the same repository, the deployment workflow outputs could be automatically merged whenever changes are made to them. Right now we have to manually sync them from the infrastructure repository to wherever they're needed. So that would be a nice simplification of that process as well (one less thing to remember to do!) 🙂
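To make the callable-workflow idea above concrete, here is a hedged sketch of what the receiving side might look like; the file path, secret name, and `service_name` key are all hypothetical, and the real deployment steps would replace the `echo`:

```yaml
# Hypothetical .github/workflows/deploy.yml — a callable workflow that
# receives the JSON config blob as a secret and parses it with fromJSON().
on:
  workflow_call:
    secrets:
      config:
        required: true

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Show which service is being deployed
        run: echo "Deploying ${{ fromJSON(secrets.config).service_name }}"
```

The caller would then pass `config: ${{ secrets.NUXT_DEPLOYMENT_CONFIG }}` under `secrets:` when invoking the workflow.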


1. Milestones cannot span multiple repositories; a milestone covering multiple layers of the stack is only possible in GitHub if those layers share a repo. This is a limitation imposed by GitHub and there is no workaround for it.
**Member:**

I see manual meta issues as good enough workaround.

**Member:**

> I see manual meta issues as good enough workaround.

They definitely might be. We would need to establish some conventions and patterns for using them more effectively, but GitHub did recently fix some bugs with showing the titles of issues from other repos. Here's an example of a cross-repo meta issue:

#341

It's really nice! Meta issues are also much more flexible than milestones, as milestones do not allow for comments or discussion. Regardless of switching to a monorepo or not we might want to use meta issues more anyway.

One minor benefit to the monorepo is that you get autocompletion of issues in the current repo, but not for cross-repo issues:

[screenshot: CleanShot 2022-12-06 at 12 12 32]

**Member Author (@dhruvkb, Dec 7, 2022):**

Meta issues are great but it's a bit of a manual effort to look for meta-issues to close when each of the repos has completed their fraction of the meta-issue. It's happened several times that meta issues in WordPress/openverse remain open even after each of the repos has implemented the required change because we forgot there was a meta issue at all.


The [integration](#step-4-integration) section in the latter part of the document describes more interesting outcomes made possible by the monorepo. They may or may not be exclusive to monorepos but they're surely made easier by it.

## Migration path

First we will merge the API and the frontend. This decision was made for the following reasons.
**Contributor:**

This reasoning makes a lot of sense to me, and I'm sold on merging API/frontend first and separately from the Catalog. How much more difficult do you anticipate eventually merging the Catalog to be? Do you think there's value in planning that work first, or can that be tabled until after the API/frontend are merged?

**Member Author (@dhruvkb, Dec 1, 2022):**

I am fairly confident that merging the catalog would not cause major pain later on, mostly because it's another Python project. The API and ingestion server are both independent Python projects that have coexisted in harmony since the beginning so adding a third does not seem problematic.

We could definitely plan for the eventual catalog-API merge, but I abstained from that in the RFC because

  • I am not the most experienced with the catalog so I felt a bit underqualified to write about that in the RFC
  • I didn't want to delay the RFC till I could understand the nuances of the catalog enough to include the portion about merging it


1. API and frontend are tightly linked. The frontend is a direct consumer of what the API produces.

1. The API and frontend form the "service" side of Openverse that directly faces the users (both API consumers and Search engine users).

1. The frontend uses ECS deployments and the API is well on the same track. This makes it possible for them to share some deployment code.

1. To the RFC author, the API and frontend are very familiar so merging them would be easier. Adding a third component would make the task daunting.

1. Merging incurs a productivity hit for the initial transition. So merging everything in one swoop is not ideal.

1. The API’s comprehensive tooling for developer documentation can benefit frontend devs and create a unified docs site for contributors.

1. The API is already organised by stack folders so the `frontend/` directory will fit right in with the others like `api/` and `ingestion_server/`.

1. The API and frontend share identical tooling for Git hooks, linting and formatting. We will fight our tools less and encounter minimal friction.

- In fact, we employ a number of hacks to install and configure pre-commit for the frontend. Merging it with the API eliminates the need for such hacks.

1. The entire system can be integration tested during releases. The real API, populated with test data, can even replace the Talkback server.
**Collaborator:**

Interesting idea! For this to be feasible, we would need to come up with a definitive solution to preventing the API from making network requests when it executes a search. Right now Playwright tests do not depend at all on any external network and that needs to remain the case.

**Member Author:**

I think it can be done (maybe something like an environment variable flag to not make requests) and I can envision it being glorious when it works.


The `WordPress/openverse-api` repo will absorb the `WordPress/openverse-frontend` repo. The `WordPress/openverse-catalog` will also be merged, _later_.
**Collaborator:**

Will there be a period of time during which frontend work is happening only in openverse-api? Is the long-term plan to merge everything into openverse-api and then eventually rename that to just openverse?

**Member Author (@dhruvkb, Dec 1, 2022):**

Yes. The timeline will be

  1. pause work on the frontend
  2. merge WordPress/openverse-frontend into WordPress/openverse-api
  3. transfer issues and other relevant GitHub models
  4. get the frontend functional, enable deploys again
  5. resume frontend work in WordPress/openverse-api
  6. archive WordPress/openverse-frontend

The one week to one fortnight period mentioned in the RFC is the time between step 1 and step 5.


### Reference

I'm following the steps listed below in a fork at [@dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse/). You can refer to the fork, but note that it comes from a place of haste and has not been treated with the same level of love and care that the final treatment will receive.

### Step 0: Prerequisites

#### Get the timing right

The first step will be to release the frontend, call a code freeze and pause work on it. This is to prevent the frontend repo from continuing to drift as we merge a snapshot of it with the API.
**Collaborator:**

Should we schedule a bug bash before this so that we can identify and resolve any outstanding issues before we enter the period of time where we won't be able to without great disruption?

**Member Author (@dhruvkb, Dec 1, 2022):**

Yes, that would be ideal (maybe even necessary). Before embarking on this potential two-week freeze, it would be best to have the frontend be in a state that could serve two weeks without maintenance.

If we have no high or critical priority tickets, that would indicate an optimal time to work on this move. Getting the timing right is one of the main things here.

**Collaborator:**

If we are planning to do a bug discovery session during the Istanbul meetup in January, maybe targeting this for late February (or later/earlier depending on what bugs we find) could be good, provided everyone is on board. It'd be nice to know that ahead of time as well so that we could block off the time and expect frontend feature work to undergo a stoppage at a specific period of time rather than just "sometime in the future". Did you have any particular alternative scheduling ideas in mind?

**Member Author:**

The bug bash we had in Venice was very productive so repeating it seems logical. But there are two potential issues with that:

  • We only identified and filed issues in the last meetup, fixing them afterwards.
  • I can't speak for everyone but my productivity immediately after the meetup was very low.

IMO the best time for this migration would be when all the frontend devs' productivity has gone down (less activity = easier migration) so if we could iron out the bugs before the meetup, the meetup would be the best time to migrate this because we'd all be together to collaborate in the migration process.

We do need to set down a date for this but we should only do this after the iframe removal project is deployed, and has been stable for a week or two. Other than that, we can make any time frame work.

**Collaborator (@sarayourfriend, Dec 1, 2022):**

Maybe we should try to do a bug bash before the meetup then. We could even start the monorepo migration during the meetup given the core contributors will not likely be spending much time making direct frontend contributions. That way we could easily pair on any unexpected issues that crop up.

Though that would be hard if not impossible to do before the iframe project, given the current blockers and ambiguous time frame for resolving them 😢

**Contributor:**

The period of the code freeze is a big concern for me.

Having a bug bash before it is a really great idea. It's really difficult to plan it because we probably need a sync meeting to identify bugs, [very rough estimates] a day or 2 to document all of them, and then 1-3 weeks to fix everything, especially considering the slow pace of the PR review process. We would probably only want to fix the high and critical priority bugs.

I wonder how this process would be affected by the features that we plan to work on now: navigation menu improvements and the homepage. Also, if we plan to work on the addition of analytics next, would that need to be finished before the monorepo move? Or be fully postponed till after it?

Starting the monorepo migration during meetup is great for the reasons Sara mentions. However, I'm not sure that the process itself is a great use of our time at the meetup. Someone will have to be working on the migration steps during the meetup and in the week after that.


This can prove difficult given how productive our team is, so we will need to channel this productivity towards the catalog in the meantime. I can foresee the end-to-end migration taking one week (ideal scenario) to one fortnight (worst case scenario).
**Member (@zackkrida, Dec 2, 2022):**

> I can foresee the end-to-end migration taking one week (ideal scenario) to one fortnight (worst case scenario).

@dhruvkb Could you unpack this time estimate a bit more for me? I am curious because I could see these tasks taking 1-2hrs or so and being treated like a deploy, where a few folks work on it together synchronously. I also think that would have benefit of minimizing the disruption to contributors. Are there long-running tasks I'm not considering?

I would also like to see some concrete implementation details about, for example, how we plan to make a PR for some of the meta/internal changes to the repo (the unified CODEOWNERS file, a new README, the modified actions, etc.). Could we have those changes prepared in a draft PR that we only merge after the frontend main branch has been combined with the API? And regarding that frontend step, do we plan to do it via PR in the GitHub UI? Details like that will give me much more confidence in our final approach.

**Member Author:**

The missing implementation details (such as writing new documentation, updating references) are what will take the maximum time in the 1-2 week time range I proposed for this plan.

The technical merge and getting the functionality working is a day or two. The remaining 3 days are for identifying these issues and fixing them. I kept 5 more days as a backup in case something unforeseen comes up, but provided we do sufficient planning beforehand, we will not need the second week (underpromise, overdeliver 😉).

It might be hard to make a draft PR with the changes (because of the merge conflicts), but one thing we could do is start a new branch, do the merge on it, and eventually replace main with it. That seems like a great way to see everything in action, but a lot of branch-name-dependent actions will be affected when we promote the other branch.

In any case, the frontend cannot be merged via the GitHub UI; it will have to be done via the CLI. In all my usage of GitHub I don't think it's possible to make a PR from an arbitrary repo to another. The GitHub UI only provides merge functionality from a fork to the upstream, which is not the case here.

**Member:**

I'd still really like to see a breakdown of the tasks involved in the merge, their execution order, and a time estimate for each step. We could do that when we write an implementation plan for this RFC. Two days maximum for a code freeze isn't too disruptive but definitely a concern in the case of high priority or critical bugs we might identify, even with a bug bash and small code freeze beforehand.


### Step 1: Merge with histories

This is a quick process.

1. Move the entire content of frontend inside a `frontend/` directory, except the following top-level files and folders. Please comment if you can add to this list.
**Collaborator:**

What is the plan for making pnpm work in the monorepo? Do we want to go ahead and make pnpm live at the root of the repository and treat the frontend as a pnpm monorepo package? https://pnpm.io/workspaces

I think this would be necessary for us to be able to (reasonably) merge the browser extension and the JS that is currently in WordPress/openverse into the monorepo... otherwise we'll be fighting weird fights with pnpm in different directories 🤔

**Member Author:**

That was an issue. Currently I have everything working with pnpm with package.json moved into the frontend/ directory (no package manager at the root), but I definitely should look into the workspaces feature.
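For reference, pnpm workspaces are declared in a `pnpm-workspace.yaml` at the repo root; a minimal sketch for this layout (the package list is an assumption) might be:

```yaml
# pnpm-workspace.yaml at the repo root: treat frontend/ as a workspace
# package, leaving room to add e.g. the browser extension later.
packages:
  - "frontend"
```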


- `.github/`
- `.editorconfig`
- `justfile`
- `.pre-commit-config.yaml`
- `prettier.config.js`
- `.prettierignore`
- `.eslintrc.js` (need to update references in nested `eslintrc.js` files)
- `.eslintignore`
- <s>`.gitignore`</s> (better to move it into the `frontend/` directory and update some absolute paths)
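This move could be scripted roughly as follows; a hedged sketch, where the helper name is hypothetical and the keep-list mirrors the bullets above:

```shell
#!/usr/bin/env bash
# Hedged sketch of step 1: `git mv` everything into a frontend/ subdirectory
# except the top-level files the RFC keeps at the root. Run from the root of
# a checkout of the frontend repo.
set -euo pipefail

KEEP=(".github" ".editorconfig" "justfile" ".pre-commit-config.yaml"
      "prettier.config.js" ".prettierignore" ".eslintrc.js" ".eslintignore")

move_into_subdir() {
  local subdir="$1"
  mkdir -p "$subdir"
  shopt -s dotglob # make * also match dotfiles such as .nvmrc
  local entry keep kept
  for entry in *; do
    # Never move the Git database or the target directory itself.
    [[ "$entry" == ".git" || "$entry" == "$subdir" ]] && continue
    kept=false
    for keep in "${KEEP[@]}"; do
      [[ "$entry" == "$keep" ]] && kept=true
    done
    [[ "$kept" == true ]] || git mv "$entry" "$subdir/$entry"
  done
  shopt -u dotglob
}
```

Calling `move_into_subdir frontend` would then relocate everything else in one commit-able step while leaving files like `justfile` at the root.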

1. Create the final commit on the `WordPress/openverse-frontend` repo. After the merge we might want to add a notice about the migration to the `README.md` file but GitHub's built-in archival process could suffice here.
**Member:**

I think we definitely want to add a README notice and use the GitHub archiving functionality together. The README information will crucially link to the new repository. We should also update the repo description with the new url as well, here:

[screenshot: CleanShot 2022-12-02 at 13 20 13]


1. Merge this repo's `main` branch into the `WordPress/openverse-api` repo's `main` branch (see Git docs for `--allow-unrelated-histories`). There will be some conflicts but they will be small and infrequent. [[implementation details](#conflict-resolution)]

1. Create "stack: \*" labels to help with issue and PR management. Spoiler/foreshadowing: these labels will be used for more things later.

1. Migrate issues from `WordPress/openverse-frontend` to `WordPress/openverse-api`. @obulat has done prior work in this department (when we migrated from CC Search to Openverse), but that might not be as useful because in this case we can directly transfer the issues, retaining all their comments. Apply the "stack: frontend" label to them. [[implementation details](#issue-transfer)]

With this done, we can archive the frontend repo.
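The unrelated-histories merge in step 3 can be illustrated with a self-contained toy example; the repo names are stand-ins, and this is a sketch rather than the exact commands the migration will use:

```shell
#!/usr/bin/env bash
# Toy demonstration of merging two repos with no common ancestor, mirroring
# the frontend -> api merge. Uses throwaway repos so it can run anywhere.
# (git >= 2.28 for `init -b`.)
set -euo pipefail
work="$(mktemp -d)"

# Stand-in "api" repo with its own history.
git init -q -b main "$work/api"
git -C "$work/api" -c user.email=dev@example.com -c user.name=dev \
  commit -q --allow-empty -m "api: initial commit"

# Stand-in "frontend" repo whose content already lives under frontend/.
git init -q -b main "$work/frontend"
mkdir -p "$work/frontend/frontend"
echo "export default {}" > "$work/frontend/frontend/nuxt.config.js"
git -C "$work/frontend" add .
git -C "$work/frontend" -c user.email=dev@example.com -c user.name=dev \
  commit -q -m "frontend: initial commit"

# The actual step: fetch the other repo and merge despite unrelated histories.
git -C "$work/api" remote add frontend "$work/frontend"
git -C "$work/api" fetch -q frontend
git -C "$work/api" -c user.email=dev@example.com -c user.name=dev \
  merge -q --allow-unrelated-histories -m "Merge frontend into api" frontend/main
```

After the merge, the "api" repo contains both full histories plus one merge commit.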

#### Conflict resolution

The following conflicts may occur during merge.

- `.prettierignore`: concatenate
- `.pre-commit-config.yaml`: use from [dhruvkb/monopenverse](https://github.com/dhruvkb/monopenverse)
- Workflows can conflict but they can be renamed and kept alongside each other, _for now_.

#### Issue transfer

As far as I can tell, issue transfer can only be performed via the GitHub GraphQL API ([docs](https://docs.github.com/en/graphql/reference/mutations#transferissue)) and not via the REST API. From my limited testing, transferred issues seem to retain labels (provided they exist in the target repo).

An implementation of the GraphQL API call (albeit in Ruby) is available in `hub` and the [code for it](https://github.com/github/hub/commit/4c2e44146988dfb385a26f649298f274a5017756) is available in their GitHub repo for reference.

However, instead of writing the code ourselves, we can install `hub` and run a small script that repeatedly calls `hub` to migrate each issue one by one. That'll be a hack, but that's okay since this is a one-off task.
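That small script could look something like the following hedged sketch; it assumes `hub` is installed, is run from a checkout of the frontend repo, and that `hub issue transfer` and the `%I` format placeholder behave as documented:

```shell
#!/usr/bin/env bash
# Hypothetical one-off migration script wrapping `hub issue transfer`.
set -euo pipefail

target_repo="WordPress/openverse-api"

transfer_all_issues() {
  local number
  # `hub issue --format '%I%n'` prints one open issue number per line.
  for number in $(hub issue --format '%I%n'); do
    echo "Transferring #${number} to ${target_repo}..."
    hub issue transfer "${number}" "${target_repo}"
  done
}

# Guarded so the script is a no-op when hub is not installed.
if command -v hub >/dev/null 2>&1; then
  transfer_all_issues
fi
```

As noted above, labels should survive the transfer provided matching labels already exist in the target repo.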

### Step 2. Restore workflows

The workflows of both the API and frontend will need some refactoring to start, and pass, again. [monopenverse](https://github.com/dhruvkb/monopenverse) has updated these workflows, and the ones listed below work.

[monopenverse](https://github.com/dhruvkb/monopenverse) showcases a `setup-env` action that sets up Node.js, Python, Just and other dependencies and can be used in every workflow.

The `ci_cd.yml` workflow from the API has been very nicely combined with the `ci.yml` workflow from the frontend. Redundant steps were eliminated.

The following actions have been successfully combined:

- actionlint ✅
- bundle_size.yml ✅
- ci_cd.yml (API) + ci.yml (frontend) [merged] ✅
- Playwright tests from ci.yml (frontend) ✅
- draft_release.yml ✅
- generate_pot.yml ✅
- gh_pages.yml ✅
- migration_safety_warning.yml ✅
- subscribe_to_label.yml ✅
- label_new_pr.yml ✅
- pr_closed.yml ✅
- pr_label_check.yml ✅
- new_issues.yml ✅
- pr_ping.yml ✅

The following have not been verified to work:

- renovate.yml
- rollback.yml
- ghcr.yml
- push_docker_image.yml

With this done, the development on the frontend can continue inside the subdirectory.

### Step 3. Buff the rough edges

There will be a few rough edges that I cannot foresee and we can continuously fix those as we spot them. But up to this point we should be in a position where
we can continue to build the API and the frontend independently but from one repo.

1. The action `banyan/auto-label` will need to be configured (`auto-label.json`) to add the "stack: \*" labels based on the modified directory.
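As a hedged sketch, `banyan/auto-label` reads a `rules` map of label → path glob(s) from `.github/auto-label.json`; for the "stack: \*" labels it might look something like this (the exact globs are assumptions):

```json
{
  "rules": {
    "stack: frontend": "frontend/**/*",
    "stack: api": "api/**/*",
    "stack: ingestion server": "ingestion_server/**/*"
  }
}
```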

### Step 4. Integration

This is the long term combination of code for the frontend and the API.

#### Combined lint

All lint steps can be combined in `.pre-commit-config.yaml`. This also simplifies CI, since the lint jobs can now be merged.

See the combined lint in action at [monopenverse](https://github.com/dhruvkb/monopenverse).

### Step 5. Documentation merge

The following documentation files will need reorganisation or merge.

- README.md (both repos)
- CODE_OF_CONDUCT.md (both repos)
- CONTRIBUTING.md (both repos)
- CONTRIBUTORS.md (API only; also why?)
- DOCUMENTATION_GUIDELINES.md (API only)
- TESTING_GUIDELINES.md (frontend only)
**Collaborator:**

We would probably want to move this into the frontend directory, I would think. It is extremely frontend specific. Or would it go into the Sphinx docs?

**Member Author:**

My long-term goal is for Sphinx docs to be the single source of reference for all developers, not just API devs.

Some Markdown files that people generally look for in the repo root could be left there but ideally, even they would just point to the relevant Sphinx pages for the real stuff.

**Collaborator:**

I love that plan!

- DEPLOYMENT.md (frontend only)
- LICENSE (both repos)

  I will need more information about this because IANAL.
**Collaborator:**

Given Openverse still uses MIT we have probably very little to worry about here, FWIW.

**Member Author:**

Nice, but I had a vague feeling that merging repositories could have legal implications for those that use them somehow.