Remove the URL cleanup process from the ingestion server #700

Closed
obulat opened this issue May 20, 2022 · 1 comment · Fixed by #4727
Labels
💻 aspect: code (Concerns the software code in the repository)
✨ goal: improvement (Improvement to an existing user-facing feature)
🟧 priority: high (Stalls work on the project or its dependents)
🧱 stack: ingestion server (Related to the ingestion/data refresh server)

Comments

obulat (Contributor) commented May 20, 2022

Problem

Currently, the ingestion server performs some URL cleanup during ingestion, which duplicates work already done in the catalog.

Description

We should remove the URL cleanup step from the ingestion server.

obulat added the 🟨 priority: medium, ✨ goal: improvement, 💻 aspect: code, ⛔ status: blocked, and data normalization labels on May 20, 2022
obulat mentioned this issue on May 20, 2022 (29 tasks)
obulat changed the title from "Remove the tags cleanup process from the ingestion server" to "Remove the tags and URL cleanup process from the ingestion server" on Aug 1, 2022
obulat transferred this issue from WordPress/openverse-api on Feb 22, 2023
obulat removed their assignment on Feb 24, 2023
krysal added the 🧱 stack: ingestion server label and removed the 🧱 stack: backend label on Mar 6, 2023
dhruvkb added this to the Data normalization milestone on Dec 2, 2023
obulat (Contributor, Author) commented Apr 17, 2024

Adding here that, before removing the tag cleanup process, we should also run a tag cleanup that removes the incorrectly encoded tags, to fix #1303.
This issue previously mixed the tags and the URLs. It now refers only to the URL cleanup.

krysal changed the title from "Remove the tags and URL cleanup process from the ingestion server" to "Remove the URL cleanup process from the ingestion server" on Jun 14, 2024
krysal removed the ⛔ status: blocked label on Aug 7, 2024
zackkrida self-assigned this on Aug 7, 2024
zackkrida added the 🟧 priority: high label and removed the 🟨 priority: medium label on Aug 7, 2024
obulat self-assigned this on Aug 7, 2024
Projects
Archived in project
4 participants