add speech-to-text to weekly dependency updates #65
Open
jmartin-sul opened this issue
Dec 10, 2024
· 0 comments
· May be fixed by sul-dlss/access-update-scripts#293
This was referenced Dec 10, 2024
edsu
added a commit
that referenced
this issue
Jan 31, 2025
We've been installing Python dependencies with pip and not tracking their versions. Since we've started using uv in some other infrastructure team Python projects, it makes sense to add it here so that speech-to-text can be tracked by infra-team's weekly dependency update process.
Unlike pip, uv always installs system-specific Python wheels. There wasn't a wheel available for triton (a dependency of openai-whisper) under Python 3.8, so the installation of dependencies failed. So, in addition to adding uv, this PR also upgrades our base Docker image from `nvidia/cuda:12.1.0-devel-ubuntu20.04` to `nvidia/cuda:12.8.0-cudnn-devel-ubuntu22.04`, which allows us to install Python 3.10 when we `apt install python3`.
Since this significantly changes Whisper's behavior, I wanted to be able to compare the VTT transcript output before and after the Docker image change. I added the start of a benchmarking system that will allow us to compare the output of a set of 22 SDR items with a previous benchmark. Ideally this benchmark would be human-vetted and actually represent a ground truth for what we believe the transcript should be. But for the time being it is simply a snapshot in time of what the transcript looked like today. See the benchmark/README.md file for details.
Closes #80
Refs #65
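To illustrate the kind of before/after comparison the commit message describes, here is a minimal sketch in Python. It is not the benchmark code from the repository (see benchmark/README.md for that); the function names `vtt_text` and `similarity` are hypothetical, and the parsing is deliberately simplified (it ignores WebVTT `NOTE` and `STYLE` blocks and would also drop a spoken line consisting only of digits).

```python
import difflib
import re

# Matches WebVTT cue timing lines such as "00:00:01.000 --> 00:00:04.000".
TIMING = re.compile(r"^(\d{2}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+")


def vtt_text(vtt: str) -> list[str]:
    """Return the text lines of a WebVTT document (hypothetical helper).

    Drops the WEBVTT header, cue timing lines, blank lines, and
    numeric cue identifiers.
    """
    lines = []
    for line in vtt.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("WEBVTT"):
            continue
        if TIMING.match(stripped) or stripped.isdigit():
            continue
        lines.append(stripped)
    return lines


def similarity(before: str, after: str) -> float:
    """Word-level similarity ratio between two transcripts (0.0 to 1.0)."""
    a = " ".join(vtt_text(before)).split()
    b = " ".join(vtt_text(after)).split()
    return difflib.SequenceMatcher(None, a, b).ratio()
```

Comparing on words rather than raw file contents means that a change in cue timings alone does not count as a difference; whether that is desirable depends on what the benchmark is meant to catch.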
edsu
added a commit
that referenced
this issue
Feb 3, 2025
We've been installing Python dependencies with pip and not tracking their versions. Since we've started using uv in some other infrastructure team Python projects, it makes sense to add it here so that speech-to-text can be tracked by infra-team's weekly dependency update process.
Unlike pip, uv always installs system-specific Python wheels. There wasn't a wheel available for triton (a dependency of openai-whisper) under Python 3.8, so the installation of dependencies failed. So, in addition to adding uv, this PR also upgrades our base Docker image from `nvidia/cuda:12.1.0-devel-ubuntu20.04` to `nvidia/cuda:12.8.0-cudnn-devel-ubuntu22.04`, which allows us to install Python 3.10 when we `apt install python3`.
Since this significantly changes Whisper's behavior, I wanted to be able to compare the VTT transcript output before and after the Docker image change. I added the start of a benchmarking system that will allow us to compare the output of a set of 22 SDR items with a previous benchmark. Ideally this benchmark would be human-vetted and actually represent a ground truth for what we believe the transcript should be. But for the time being it is simply a snapshot in time of what the transcript looked like today. See the benchmark/README.md file for details.
Closes #80
Refs #65
jmartin-sul
added a commit
to sul-dlss/access-update-scripts
that referenced
this issue
Feb 7, 2025
It's a Python project, but the update hook was already added in sul-dlss/speech-to-text#81. Closes sul-dlss/speech-to-text#65
Here's an example from another Python project: https://github.com/sul-dlss/was-pywb/blob/main/.autoupdate/preupdate
That project uses Poetry, but rialto-airflow uses uv to manage dependencies, so we may want to adapt that example to uv. The author of this ticket is unsure of the pros and cons offhand, so anyone picking this up who isn't already familiar with both tools should ask the people who worked on rialto-airflow what they'd recommend, since that's the more recent of the two projects.
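As a rough sketch of what a uv-based update step could look like, the snippet below wraps `uv lock --upgrade`, which re-resolves dependencies and rewrites `uv.lock` with newer versions. This is an assumption about how the hook might work, not the contents of the linked was-pywb script or of anything in rialto-airflow; the function names are hypothetical, and it is written in Python purely for illustration.

```python
import subprocess


def uv_update_commands() -> list[list[str]]:
    """Commands a uv-based update step might run (assumed, not from the repo)."""
    return [["uv", "lock", "--upgrade"]]


def run_updates(dry_run: bool = True) -> list[str]:
    """Run each update command, or only report it when dry_run is True."""
    ran = []
    for cmd in uv_update_commands():
        if not dry_run:
            # check=True raises CalledProcessError if uv exits non-zero.
            subprocess.run(cmd, check=True)
        ran.append(" ".join(cmd))
    return ran
```

Calling `run_updates(dry_run=False)` from the project root would require uv to be installed and a `pyproject.toml` to be present.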