token-pruning

Here are 13 public repositories matching this topic...

ModelTC / LightCompress

A powerful toolkit for compressing large models including LLM, VLM, and video generation models.

benchmark deployment tool evaluation pruning quantization wan awq large-language-models llm token-pruning vllm smoothquant token-reduction mixtral internlm2 token-merging deepseek-v3

Updated Aug 22, 2025
Python

xuyang-liu16 / Awesome-Token-level-Model-Compression

Star

📚 Collection of token-level model compression resources.

computer-vision model-compression model-acceleration efficient-deep-learning token-pruning token-merging token-compression

Updated Aug 26, 2025

microsoft / Moonlit

Star

This is a collection of our research on efficient AI, covering hardware-aware NAS and model compression.

model-compression neural-architecture-search inference-efficiency token-pruning

Updated Oct 25, 2024
Python

mlvlab / vid-TLDR

Star

Official implementation of CVPR 2024 paper "vid-TLDR: Training Free Token merging for Light-weight Video Transformer".

computer-vision video-transformer token-pruning efficient-vision-transformers cvpr2024 token-merging

Updated May 7, 2024
Python

vbdi / divprune

Star

[CVPR 2025] DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models

pruning inference-optimization multi-modality llm token-pruning vision-language-model llava multimodal-large-language-models

Updated May 27, 2025
Python

cokeshao / HoliTom

Star

HoliTom: Holistic Token Merging for Fast Video Large Language Models

large-language-models token-pruning llava multimodal-large-language-models video-large-language-models token-merging llava-next-video visionzip

Updated Aug 17, 2025
Python

benbergner / cropr

Star

A token pruning method that accelerates ViTs for various tasks while maintaining high performance.

computer-vision deep-learning neural-networks attention-mechanism vision-transformer token-pruning

Updated Jul 21, 2025
Python

mlvlab / Representation-Shift

Star

Official Implementation (Pytorch) of the "Representation Shift: Unifying Token Compression with FlashAttention", ICCV 2025

token-pruning efficient-vision-transformers iccv2025

Updated Jul 30, 2025

Adam-Mazur / Lazy-Llama

Star

An implementation of LazyLLM token pruning for LLaMa 2 model family.

transformers llama huggingface huggingface-transformers token-pruning llama2

Updated Jan 6, 2025
Python

MILVLG / twigvlm

Star

Implementation of ICCV 2025 paper "Growing a Twig to Accelerate Large Vision-Language Models".

pytorch inference-acceleration token-pruning vision-language-models

Updated Jul 29, 2025
Python

sangminwoo / awesome-token-redundancy-reduction

Star

😎 Awesome papers on token redundancy reduction

token-pruning token-reduction token-merging token-compression token-sparsification token-redundancy-reduction

Updated Mar 12, 2025

Jungmin-YUN-0 / Attention_Lightweight

Star

lightweight pytorch transformer classification self-attention token-pruning

Updated Jul 24, 2023
Python

ahmadpanah / TS-DTP

Star

Task-Specific Dynamic Token Pruning (TS-DTP) for LLMs

artificial-intelligence token-pruning large-language-model

Updated Jan 15, 2025
Jupyter Notebook

Improve this page

Add a description, image, and links to the token-pruning topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the token-pruning topic, visit your repo's landing page and select "manage topics."

Learn more

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

token-pruning

Here are 13 public repositories matching this topic...

ModelTC / LightCompress

xuyang-liu16 / Awesome-Token-level-Model-Compression

microsoft / Moonlit

mlvlab / vid-TLDR

vbdi / divprune

cokeshao / HoliTom

benbergner / cropr

mlvlab / Representation-Shift

Adam-Mazur / Lazy-Llama

MILVLG / twigvlm

sangminwoo / awesome-token-redundancy-reduction

Jungmin-YUN-0 / Attention_Lightweight

ahmadpanah / TS-DTP

Improve this page

Add this topic to your repo