Use a separate service to generate audio waveforms #731
Comments
I imagine it'll be a very interesting exercise to write a thin API wrapper (using something fast and close to the metal like Go) over the BBC |
Oh, interesting idea Dhruv. Do you mean basically writing a wrapper around
However, I have to say that might be over-complicating it when you could just memoize the calls to the
Then again, all of that might be over-complicating it when storing it in the database could be the simplest solution and provide sufficient performance anyway. My vote would probably be to go the simple database route, measure the performance, and if we see some noticeable peaks in the 95P+ range then look into improving it.
That being said, there are probably other parts of our stack, especially in the API, that could use that same kind of analysis and I'm eager for us to get some monitoring in place that will allow us to do that. |
Can we close this issue now that #1551 has been merged (and deployed 🚀), or is this a broader issue, @sarayourfriend? |
I think this is still an issue that needs to be directly addressed. Relying on manually running a Django command to "warm the cache" of waveforms, as it were, is not a sustainable (or desirable) solution in the long term. |
As discussed recently, I'm leaning towards the following:
So basically, I don't personally think we should warm the cache at all. To revisit the original problems:
With my discussed approach, we would remove any waveform data from the DB and instead treat them like we do image thumbnails, where the API response includes a reference to the waveform data
This waveform generator would become a standalone microservice. I'm open to closing this issue, I don't think it explicitly relates to the catalog anymore. |
Let's move the issue to either openverse or openverse-api, wherever we think it'd make the most sense to record the need for an entirely new service. |
Moved the issue to the API, since that's where the thumbnail service currently resides. |
@krysal I'm going to move this out of the |
As part of the #2843 discussion, we decided not to rely on microservices for these things, instead to use async Python to ensure waveform generation doesn't block workers. There's still some argument to be made that waveform generation is CPU intensive, in a way that could interrupt the overall stability of a single task, but that is not anything we've seen happen thus far. I'm closing this issue as won't do for now. If we see waveforms become a performance issue, we can explore ways of improving it, whether via a separate service or some other approach. In the meantime, however, it isn't worth considering this as work we need to think about. |
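For readers unfamiliar with the approach mentioned in the closing comment, here is a minimal sketch of keeping a CPU-heavy step from blocking an async worker. It is an assumption about the general technique, not the Openverse implementation: `compute_peaks` and `waveform_for` are hypothetical names, and the work is handed to an executor via `loop.run_in_executor`. With the default thread pool this mainly helps when the heavy work happens outside the interpreter (for example in a subprocess); for pure-Python CPU work a process pool could be passed as `executor` instead.

```python
import asyncio


def compute_peaks(samples: list[float], buckets: int) -> list[float]:
    """CPU-bound step: reduce samples to one peak (max absolute value) per bucket.

    Simplified on purpose: samples left over after `buckets` chunks are dropped.
    """
    size = max(1, len(samples) // buckets)
    return [
        max(abs(s) for s in samples[i : i + size])
        for i in range(0, len(samples), size)
    ][:buckets]


async def waveform_for(samples: list[float], buckets: int, executor=None) -> list[float]:
    """Run the blocking computation off the event loop and await its result."""
    loop = asyncio.get_running_loop()
    # executor=None uses the loop's default thread pool.
    return await loop.run_in_executor(executor, compute_peaks, samples, buckets)
```

While `waveform_for` is awaiting, the event loop stays free to serve other requests, which is the property the comment above relies on.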
Problem
Currently the audio waveforms are created upon request in the API. This has two effects:
audiowaveform
which could present an unnecessary vulnerability
Description
From @AetherUnbound in a private chat:
Alternatives
Continue just creating the waveforms in the API and accept the two issues in the description.
Implementation