fix(deps): update dependency torchaudio to v2.4.1 #143

Open: renovate[bot] wants to merge 1 commit into base: main
@renovate renovate bot commented May 8, 2023

This PR contains the following updates:

Package     Change
torchaudio  2.0.1 -> 2.4.1

Release Notes

pytorch/audio (torchaudio)

v2.4.1: TorchAudio 2.4.1 Release

Compare Source

This release is compatible with PyTorch 2.4.1 patch release. There are no new features added.

v2.4.0: TorchAudio 2.4.0 Release

Compare Source

This release is compatible with PyTorch 2.4. There are no new features added.

This release contains 2 fixes:

v2.3.1: TorchAudio 2.3.1 Release

Compare Source

This release is compatible with PyTorch 2.3.1 patch release. There are no new features added.

v2.3.0: TorchAudio 2.3.0 Release

Compare Source

This release is compatible with PyTorch 2.3.0 patch release. There are no new features added.

This release contains minor documentation and code quality improvements (#​3734, #​3748, #​3757, #​3759)

v2.2.2: TorchAudio 2.2.2 Release

Compare Source

This release is compatible with PyTorch 2.2.2 patch release. There are no new features added.

v2.2.1: TorchAudio 2.2.1 Release

Compare Source

This release is compatible with PyTorch 2.2.1 patch release. There are no new features added.

v2.2.0: TorchAudio 2.2.0 Release

Compare Source

New Features

Bug Fixes

Recipe Updates

v2.1.2: TorchAudio 2.1.2 Release

Compare Source

This is a patch release, which is compatible with PyTorch 2.1.2. There are no new features added.

v2.1.1

Compare Source

This is a minor release, which is compatible with PyTorch 2.1.1 and includes bug fixes, improvements and documentation updates.

Bug Fixes

  • Cherry-pick 2.1.1: Fix WavLM bundles (#​3665)
  • Cherry-pick 2.1.1: Add back compression level in i/o dispatcher backend (#3666)

v2.1.0: Torchaudio 2.1 Release Note

Compare Source

Highlights

TorchAudio v2.1 introduces the following new features and backward-incompatible changes:

  1. [BETA] A new API to apply filters, effects and codecs
    torchaudio.io.AudioEffector can apply filters, effects and encodings to waveforms in online/offline fashion.
    You can use it as a form of augmentation.
    Please refer to https://pytorch.org/audio/2.1/tutorials/effector_tutorial.html for the examples.
  2. [BETA] Tools for forced alignment
    New functions and a pre-trained model for forced alignment were added.
    torchaudio.functional.forced_align computes alignment from an emission, and torchaudio.pipelines.MMS_FA provides access to the model trained for multilingual forced alignment in the MMS: Scaling Speech Technology to 1000+ languages project.
    Please refer to https://pytorch.org/audio/2.1/tutorials/ctc_forced_alignment_api_tutorial.html for the usage of the forced_align function, and https://pytorch.org/audio/2.1/tutorials/forced_alignment_for_multilingual_data_tutorial.html for how to use MMS_FA to align transcripts in multiple languages.
  3. [BETA] TorchAudio-Squim : Models for reference-free speech assessment
    Model architectures and pre-trained models from the paper TorchAudio-Squim: Reference-less Speech Quality and Intelligibility measures in TorchAudio were added.
    You can use torchaudio.pipelines.SQUIM_SUBJECTIVE and torchaudio.pipelines.SQUIM_OBJECTIVE models to estimate the various speech quality and intelligibility metrics. This is helpful when evaluating the quality of speech generation models, such as TTS.
    Please refer to https://pytorch.org/audio/2.1/tutorials/squim_tutorial.html for the detail.
  4. [BETA] CUDA-based CTC decoder
    torchaudio.models.decoder.CUCTCDecoder takes an emission stored in CUDA memory and performs CTC beam search on it on the CUDA device. The beam search is fast, and it eliminates the need to move data from the CUDA device to the CPU when performing automatic speech recognition. With PyTorch's CUDA support, it is now possible to run the entire speech recognition pipeline in CUDA.
    Please refer to https://pytorch.org/audio/2.1/tutorials/asr_inference_with_cuda_ctc_decoder_tutorial.html for details.
  5. [Prototype] Utilities for AI music generation
    We are working to add utilities that are relevant to music AI. Since the last release, the following APIs were added to the prototype.
    Please refer to respective documentation for the usage.
    • torchaudio.prototype.chroma_filterbank
    • torchaudio.prototype.transforms.ChromaScale
    • torchaudio.prototype.transforms.ChromaSpectrogram
    • torchaudio.prototype.pipelines.VGGISH
  6. New recipes for training models.
    Recipes for Audio-visual ASR, multi-channel DNN beamforming and TCPGen context-biasing were added.
    Please refer to the recipes
  7. Update to FFmpeg support
    The version of supported FFmpeg libraries was updated.
    TorchAudio v2.1 works with FFmpeg 6, 5 and 4.4. Support for 4.3, 4.2 and 4.1 is dropped.
    Please refer to https://pytorch.org/audio/2.1/installation.html#optional-dependencies for details of the new FFmpeg integration mechanism.
  8. Update to libsox integration
    TorchAudio now depends on libsox installed separately from torchaudio. The SoX I/O backend no longer supports file-like objects. (These are supported by the FFmpeg backend and soundfile.)
    Please refer to https://pytorch.org/audio/2.1/installation.html#optional-dependencies for details.

New Features

I/O
  • Support overwriting PTS in torchaudio.io.StreamWriter (#​3135)
  • Include format information after filter torchaudio.io.StreamReader.get_out_stream_info (#​3155)
  • Support CUDA frame in torchaudio.io.StreamReader filter graph (#​3183, #​3479)
  • Support YUV444P in GPU decoder (#​3199)
  • Add additional filter graph processing to torchaudio.io.StreamWriter (#​3194)
  • Cache and reuse HW device context in GPU decoder (#​3178)
  • Cache and reuse HW device context in GPU encoder (#​3215)
  • Support changing the number of channels in torchaudio.io.StreamReader (#​3216)
  • Support encode spec change in torchaudio.io.StreamWriter (#​3207)
  • Support encode options such as compression rate and bit rate (#​3179, #​3203, #​3224)
  • Add 420p10le support to torchaudio.io.StreamReader CPU decoder (#​3332)
  • Support multiple FFmpeg versions (#​3464, #​3476)
  • Support writing opus and mp3 with soundfile (#​3554)
  • Add switch to disable sox integration and ffmpeg integration at runtime (#​3500)
Ops
Models
  • Add torchaudio.models.SquimObjective for speech enhancement (#3042, #3087, #3512)
  • Add torchaudio.models.SquimSubjective for speech enhancement (#​3189)
  • Add torchaudio.models.decoder.CUCTCDecoder (#​3096)
Pipelines
  • Add torchaudio.pipelines.SquimObjectiveBundle for speech enhancement (#​3103)
  • Add torchaudio.pipelines.SquimSubjectiveBundle for speech enhancement (#​3197)
  • Add torchaudio.pipelines.MMS_FA Bundle for forced alignment (#​3521, #​3538)
Tutorials
Recipe

Backward-incompatible changes

Third-party libraries

In this release, the following third-party libraries are removed from TorchAudio binary distributions. TorchAudio now searches for and links these libraries at runtime. Please install them to use the corresponding APIs.

SoX

libsox is used for various audio I/O and filtering operations.

Pre-built binaries are available via package managers, such as conda, apt and brew. Please refer to the respective documentation.

The APIs affected include:

  • torchaudio.load ("sox" backend)
  • torchaudio.info ("sox" backend)
  • torchaudio.save ("sox" backend)
  • torchaudio.sox_effects.apply_effects_tensor
  • torchaudio.sox_effects.apply_effects_file
  • torchaudio.functional.apply_codec (also deprecated, see below)

Changes related to the removal: #​3232, #​3246, #​3497, #​3035

Flashlight Text

flashlight-text is the core of CTC decoder.

Pre-built packages are available on PyPI. Please refer to https://github.com/flashlight/text for details.

The APIs affected include:

  • torchaudio.models.decoder.CTCDecoder

Changes related to the removal: #​3232, #​3246, #​3236, #​3339

Kaldi

A custom-built libkaldi was used to implement torchaudio.functional.compute_kaldi_pitch. This function, along with the libkaldi integration, is removed in this release. There is no replacement.

Changes related to the removal: #​3368, #​3403

I/O
  • Switch to the backend dispatcher (#​3241)

To make I/O operations more flexible, TorchAudio introduced the backend dispatcher in v2.0, and users could opt in to using it.
In this release, the backend dispatcher becomes the default mechanism for selecting the I/O backend.

You can pass the backend argument to the torchaudio.info, torchaudio.load and torchaudio.save functions to select the I/O backend library on a per-call basis. (If it is omitted, an available backend is selected automatically.)

If you want to use the global backend mechanism, you can set the environment variable TORCHAUDIO_USE_BACKEND_DISPATCHER=0.
Please note, however, that the global backend mechanism is deprecated and is going to be removed in the next release.
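
As a rough illustration of per-call backend selection, the sketch below wraps torchaudio.load. The helper name load_audio and the file name in the usage note are hypothetical; the backend names ("ffmpeg", "sox", "soundfile") are the ones these notes mention, and which of them work depends on what is installed.

```python
from typing import Optional


def load_audio(path: str, backend: Optional[str] = None):
    """Load an audio file, optionally forcing a specific I/O backend.

    `load_audio` is a hypothetical helper. With backend=None, TorchAudio
    picks whichever backend is available.
    """
    # Imported lazily so that this sketch can be imported even where
    # TorchAudio is not installed.
    import torchaudio

    waveform, sample_rate = torchaudio.load(path, backend=backend)
    return waveform, sample_rate
```

For example, load_audio("speech.wav", backend="soundfile") would request the soundfile backend for that call only, without touching any global setting.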

Please see #2950 for details of the migration work.

torchaudio.io.StreamReader accepted a byte-string wrapped in 1D torch.Tensor object. This is no longer supported.
Please wrap the underlying data with io.BytesIO instead.
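
A minimal sketch of that migration, assuming the encoded audio is already available as a Python bytes object. The helper names to_file_like and open_stream_reader are hypothetical, and opening the reader requires TorchAudio with FFmpeg available.

```python
import io


def to_file_like(raw: bytes) -> io.BytesIO:
    """Wrap raw encoded bytes in a seekable file-like object."""
    return io.BytesIO(raw)


def open_stream_reader(raw: bytes):
    """Open a StreamReader on in-memory data (hypothetical helper)."""
    # Imported lazily so the sketch is importable without TorchAudio.
    from torchaudio.io import StreamReader

    return StreamReader(src=to_file_like(raw))
```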

The optional arguments of add_[audio|video]_stream methods of torchaudio.io.StreamReader and torchaudio.io.StreamWriter are now keyword-only arguments.
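
To show what keyword-only means for callers, here is a stand-in function. It is not the real TorchAudio method; its name and parameter list are assumptions made for illustration, so check the API reference for the actual signature.

```python
def add_audio_stream_sketch(frames_per_chunk, num_chunks=3, *, stream_index=None, filter_desc=None):
    """Hypothetical stand-in for an add_*_stream method.

    Everything after the bare `*` must be passed by keyword.
    """
    return {
        "frames_per_chunk": frames_per_chunk,
        "num_chunks": num_chunks,
        "stream_index": stream_index,
        "filter_desc": filter_desc,
    }


# Works: optional arguments are named explicitly.
config = add_audio_stream_sketch(4096, stream_index=0)

# Fails: passing a keyword-only argument positionally raises TypeError.
try:
    add_audio_stream_sketch(4096, 3, 0)
except TypeError as err:
    print(f"rejected positional call: {err}")
```

Code written for earlier versions that passed these options positionally would need the argument names added.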

  • Drop the support of FFmpeg < 4.4 (#3561, #3557)

Previously, TorchAudio supported FFmpeg 4 (>=4.1, <=4.4). In this release, TorchAudio supports FFmpeg 4, 5 and 6 (>=4.4, <7). With this change, support for FFmpeg 4.1, 4.2 and 4.3 is dropped.
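
The supported range (>=4.4, <7) can be expressed as a small check. This is only a sketch: the function name is hypothetical, and it assumes a plain dotted version string such as "4.4.2", not vendor-suffixed strings.

```python
def is_supported_ffmpeg(version: str) -> bool:
    """Return True if `version` falls in the range >=4.4, <7.

    Assumes a plain dotted numeric string such as "6.0" or "4.4.2".
    """
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    # Tuple comparison handles the major/minor ordering in one step.
    return (4, 4) <= (major, minor) < (7, 0)


for candidate in ["4.3", "4.4.2", "5.1", "6.0", "7.0"]:
    print(candidate, is_supported_ffmpeg(candidate))
```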

Ops
  • Use named file in torchaudio.functional.apply_codec (#​3397)

In previous versions, TorchAudio shipped a custom-built libsox so that it could perform in-memory decoding and encoding.
Now, in-memory decoding and encoding are handled by the FFmpeg binding, and with the switch to dynamic libsox linking, torchaudio.functional.apply_codec no longer processes audio in memory. Instead, it writes to a temporary file.
For in-memory processing, please use torchaudio.io.AudioEffector.
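
A minimal sketch of the in-memory route, assuming TorchAudio 2.1 or later with FFmpeg available. The helper name apply_mulaw_codec and the choice of format and encoder are illustrative assumptions, not a recommended configuration.

```python
def apply_mulaw_codec(waveform, sample_rate: int):
    """Round-trip a waveform through a mu-law WAV encoding in memory.

    `waveform` is expected to be a 2D tensor shaped (time, channel).
    The format/encoder pair is chosen purely for illustration.
    """
    # Imported lazily so the sketch is importable without TorchAudio.
    from torchaudio.io import AudioEffector

    effector = AudioEffector(format="wav", encoder="pcm_mulaw")
    return effector.apply(waveform, sample_rate)
```

Note that the expected layout is (time, channel), so a tensor returned by torchaudio.load would need to be transposed first.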

  • Switch to lstsq when solving InverseMelScale (#​3280)

Previously, torchaudio.transforms.InverseMelScale ran an SGD optimizer to find the inverse of the mel-scale transform. This approach has a number of issues, as listed in #2643.

This release switches to use torch.linalg.lstsq.
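
The idea can be sketched with a toy filterbank. This is not the library's implementation: the function name, shapes and random data are assumptions, and the real transform may apply additional steps.

```python
import torch


def invert_mel(fb: torch.Tensor, melspec: torch.Tensor) -> torch.Tensor:
    """Estimate a linear-frequency spectrogram from a mel spectrogram.

    fb:      filterbank, shape (n_freqs, n_mels)
    melspec: mel spectrogram, shape (n_mels, time)
    returns: estimate, shape (n_freqs, time)

    The forward transform is melspec = fb.T @ spec, so we solve
    fb.T @ X = melspec in the least-squares sense.
    """
    return torch.linalg.lstsq(fb.transpose(-1, -2), melspec).solution


torch.manual_seed(0)
fb = torch.rand(6, 4, dtype=torch.float64)    # toy filterbank: 6 freq bins, 4 mel bins
spec = torch.rand(6, 3, dtype=torch.float64)  # toy spectrogram: 3 frames
melspec = fb.T @ spec

estimate = invert_mel(fb, melspec)
# The estimate reproduces the mel spectrogram, though it need not equal
# `spec`, because there are fewer mel bins than frequency bins.
print(torch.allclose(fb.T @ estimate, melspec))
```

A direct solve like this is deterministic and needs no learning rate or iteration count, unlike an optimizer loop.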

Models

The infer method of torchaudio.models.RNNTBeamSearch has been updated to accept a series of previous hypotheses.

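import torchaudio
from torchaudio.models import RNNTBeamSearch

bundle = torchaudio.pipelines.EMFORMER_RNNT_BASE_LIBRISPEECH
decoder: RNNTBeamSearch = bundle.get_decoder()

# Before the first chunk there is no decoder state and no hypothesis.
# `streaming`, `features`, `length` and `beam_width` are placeholders.
state, hypothesis = None, None
while streaming:
    ...
    hypo, state = decoder.infer(
        features,
        length,
        beam_width,
        state=state,
        hypothesis=hypothesis,
    )
    ...
    # Pass the whole list of hypotheses back in.
    # Previously this had to be hypothesis = hypo[0]
    hypothesis = hypo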

Deprecations

Ops
  • Update and deprecate torchaudio.functional.apply_codec function (#​3386)

Due to the removal of custom libsox binding, torchaudio.functional.apply_codec no longer supports in-memory processing. Please migrate to torchaudio.io.AudioEffector.

Please refer to https://pytorch.org/audio/2.1/tutorials/effector_tutorial.html for the detailed usage of torchaudio.io.AudioEffector.

Bug Fixes

Models
  • Fix the negative sampling in ConformerWav2Vec2PretrainModel (#​3085)
  • Fix extract_features method for WavLM models (#​3350)
Tutorials
  • Fix backtracking in forced alignment tutorial (#​3440)
  • Fix initialization of get_trellis in forced alignment tutorial (#​3172)
Build
  • Fix MKL issue on Intel mac build (#​3307)
I/O
  • Suppress warning when saving vorbis with sox backend (#3359)
  • Fix g722 encoding in torchaudio.io.StreamWriter (#​3373)
  • Refactor arg mapping in ffmpeg save function (#​3387)
  • Fix save INT16 sox backend (#​3524)
  • Fix SoundfileBackend method decorators (#​3550)
  • Fix PTS initialization when using NVIDIA encoder (#​3312)
Ops
  • Add non-default CUDA device support to lfilter (#​3432)

Improvements

I/O
Ops
  • Add arbitrary dim Tensor support to mask_along_axis{,_iid} (#​3289)
  • Fix resampling to support dynamic input lengths for onnx exports. (#​3473)
  • Optimize Torchaudio Vad (#​3382)
Documentation
  • Build and use GPU-enabled FFmpeg in doc CI (#​3045)
  • Misc tutorial update (#​3449)
  • Update notes on FFmpeg version (#​3480)
  • Update documentation about dependencies (#​3517)
  • Update I/O and backend docs (#​3555)
Tutorials
  • Update data augmentation tutorial (#​3375)
  • Add more explanation about n_fft (#​3442)
Build
Recipe
  • Fix Adam and AdamW initializers in wav2letter example (#​3145)
  • Update Librispeech RNNT recipe to support Lightning 2.0 (#3336)
  • Update HuBERT/SSL training recipes to support Lightning 2.x (#​3396)
  • Add wav2vec2 loss function in self_supervised_learning training recipe (#​3090)
  • Add Wav2Vec2DataModule in self_supervised_learning training recipe (#​3081)
Other
  • Use FFmpeg6 in build doc (#​3475)
  • Use FFmpeg6 in unit test (#​3570)
  • Migrate torch.norm to torch.linalg.vector_norm (#​3522)
  • Migrate torch.nn.utils.weight_norm to nn.utils.parametrizations.weight_norm (#​3523)

v2.0.2

Compare Source

TorchAudio 2.0.2 Release Note

This is a minor release, which is compatible with PyTorch 2.0.1 and includes bug fixes, improvements and documentation updates. No new features were added.

Bug fix

Full Changelog: pytorch/audio@v2.0.1...v2.0.2


Configuration

📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 Ignore: Close this PR and you won't be reminded about this update again.


  • If you want to rebase/retry this PR, check this box

This PR was generated by Mend Renovate. View the repository job log.

@renovate renovate bot changed the title fix(deps): update dependency torchaudio to v2.0.2 fix(deps): update dependency torchaudio to v2.1.0 Oct 4, 2023
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from 99e404a to 4c7362a Compare October 4, 2023 20:04
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from 4c7362a to 302e17e Compare October 15, 2023 16:47
@renovate renovate bot changed the title fix(deps): update dependency torchaudio to v2.1.0 fix(deps): update dependency torchaudio to v2.1.1 Nov 15, 2023
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from 302e17e to 444a4c1 Compare November 15, 2023 19:26
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from 444a4c1 to bba3d8c Compare November 30, 2023 12:51
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from bba3d8c to 9d2f671 Compare December 14, 2023 23:17
@renovate renovate bot changed the title fix(deps): update dependency torchaudio to v2.1.1 fix(deps): update dependency torchaudio to v2.1.2 Dec 14, 2023
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from 9d2f671 to 28469ee Compare December 19, 2023 13:10
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from 28469ee to a375124 Compare January 4, 2024 15:45
@renovate renovate bot changed the title fix(deps): update dependency torchaudio to v2.1.2 fix(deps): update dependency torchaudio to v2.2.0 Jan 30, 2024
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from a375124 to 66dcca5 Compare February 22, 2024 23:17
@renovate renovate bot changed the title fix(deps): update dependency torchaudio to v2.2.0 fix(deps): update dependency torchaudio to v2.2.1 Feb 22, 2024
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from 66dcca5 to c29e1d5 Compare February 25, 2024 11:33
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from c29e1d5 to a1aee1d Compare March 29, 2024 17:42
@renovate renovate bot changed the title fix(deps): update dependency torchaudio to v2.2.1 fix(deps): update dependency torchaudio to v2.2.2 Mar 29, 2024
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from a1aee1d to 79c2141 Compare April 24, 2024 18:40
@renovate renovate bot changed the title fix(deps): update dependency torchaudio to v2.2.2 fix(deps): update dependency torchaudio to v2.3.0 Apr 24, 2024
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from 79c2141 to 00ff27c Compare June 5, 2024 18:17
@renovate renovate bot changed the title fix(deps): update dependency torchaudio to v2.3.0 fix(deps): update dependency torchaudio to v2.3.1 Jun 5, 2024
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from 00ff27c to 900084f Compare July 24, 2024 19:13
@renovate renovate bot changed the title fix(deps): update dependency torchaudio to v2.3.1 fix(deps): update dependency torchaudio to v2.4.0 Jul 24, 2024
@renovate renovate bot force-pushed the renovate/torchaudio-2.x-lockfile branch from 900084f to edb5043 Compare September 4, 2024 21:41
@renovate renovate bot changed the title fix(deps): update dependency torchaudio to v2.4.0 fix(deps): update dependency torchaudio to v2.4.1 Sep 4, 2024