Skip to content

Conversation

@dependabot
Copy link

@dependabot dependabot bot commented on behalf of github Apr 11, 2025

Bumps transformers from 4.40.0 to 4.48.0.

Release notes

Sourced from transformers's releases.

v4.48.0: ModernBERT, Aria, TimmWrapper, ColPali, Falcon3, Bamba, VitPose, DinoV2 w/ Registers, Emu3, Cohere v2, TextNet, DiffLlama, PixtralLarge, Moonshine

New models

ModernBERT

The ModernBert model was proposed in Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference by Benjamin Warner, Antoine Chaffin, Benjamin Clavié, Orion Weller, Oskar Hallström, Said Taghadouini, Alexis Galalgher, Raja Bisas, Faisal Ladhak, Tom Aarsen, Nathan Cooper, Grifin Adams, Jeremy Howard and Iacopo Poli.

It is a refresh of the traditional encoder architecture, as used in previous models such as BERT and RoBERTa.

It builds on BERT and implements many modern architectural improvements which have been developed since its original release, such as:

  • Rotary Positional Embeddings to support sequences of up to 8192 tokens.
  • Unpadding to ensure no compute is wasted on padding tokens, speeding up processing time for batches with mixed-length sequences.
  • GeGLU Replacing the original MLP layers with GeGLU layers, shown to improve performance.
  • Alternating Attention where most attention layers employ a sliding window of 128 tokens, with Global Attention only used every 3 layers.
  • Flash Attention to speed up processing.
  • A model designed following recent The Case for Co-Designing Model Architectures with Hardware, ensuring maximum efficiency across inference GPUs.
  • Modern training data scales (2 trillion tokens) and mixtures (including code ande math data)

image

Aria

The Aria model was proposed in Aria: An Open Multimodal Native Mixture-of-Experts Model by Li et al. from the Rhymes.AI team.

Aria is an open multimodal-native model with best-in-class performance across a wide range of multimodal, language, and coding tasks. It has a Mixture-of-Experts architecture, with respectively 3.9B and 3.5B activated parameters per visual token and text token.

TimmWrapper

We add a TimmWrapper set of classes such that timm models can be loaded in as transformer models into the library.

Here's a general usage example:

import torch
from urllib.request import urlopen
from PIL import Image
from transformers import AutoConfig, AutoModelForImageClassification, AutoImageProcessor
checkpoint = "timm/resnet50.a1_in1k"
img = Image.open(urlopen(
'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/beignets-task-guide.png'
))
image_processor = AutoImageProcessor.from_pretrained(checkpoint)
</tr></table>

... (truncated)

Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    You can disable automated security fix PRs for this repo from the Security Alerts page.

Bumps [transformers](https://github.com/huggingface/transformers) from 4.40.0 to 4.48.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](huggingface/transformers@v4.40.0...v4.48.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-version: 4.48.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
@dependabot dependabot bot added dependencies Pull requests that update a dependency file python Pull requests that update python code labels Apr 11, 2025
@dependabot @github
Copy link
Author

dependabot bot commented on behalf of github Apr 29, 2025

Superseded by #3.

@dependabot dependabot bot closed this Apr 29, 2025
@dependabot dependabot bot deleted the dependabot/pip/backends/python/server/transformers-4.48.0 branch April 29, 2025 16:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dependencies Pull requests that update a dependency file python Pull requests that update python code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

0 participants