    Repositories list

    • MMFuser

      Public
      The official implementation of the paper "MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding". MMFuser addresses the limitations of current MLLMs in capturing complex image details by simply yet efficiently integrating multi-layer features from ViTs.
      Python
      Apache License 2.0
      4000 Updated Oct 16, 2024
    • The jetson-examples repository by Seeed Studio offers a seamless, one-line command deployment to run vision AI and Generative AI models on the NVIDIA Jetson platform.
      Shell
      MIT License
      16000 Updated Oct 12, 2024
    • The first open-source synthetic dataset for collaborative perception focused on adverse weather conditions
      Python
      MIT License
      1000 Updated Oct 10, 2024
    • t2v-turbo

      Public
      Code repository for T2V-Turbo and T2V-Turbo-v2
      Python
      18000 Updated Oct 9, 2024
    • OccRWKV

      Public
      OccRWKV: Rethinking Efficient 3D Semantic Occupancy Prediction with Linear Complexity
      Python
      3000 Updated Oct 1, 2024
    • VLAD-BuFF

      Public
      VLAD-BuFF: Burst-aware Fast Feature Aggregation for Visual Place Recognition (ECCV 2024)
      Python
      GNU General Public License v3.0
      3000 Updated Oct 1, 2024
    • [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"
      Python
      Other
      1000 Updated Oct 1, 2024
    • Python
      159000 Updated Sep 28, 2024
    • SpaceJAM

      Public
      SpaceJAM: a Lightweight and Regularization-free Method for Fast Joint Alignment of Images (ECCV 2024)
      Python
      MIT License
      1000 Updated Sep 24, 2024
    • SimMAT

      Public
      Python
      2000 Updated Sep 13, 2024
    • XNetv2

      Public
      [BIBM 2024] XNet v2: Fewer Limitations, Better Results and Greater Universality
      Python
      1000 Updated Aug 28, 2024
    • VP-LLR

      Public
      Code repo for "When Does Visual Prompting Outperform Linear Probing? A Likelihood Perspective"
      Python
      Apache License 2.0
      1000 Updated Aug 27, 2024
    • Self-Supervised Scalable Deep Compressed Sensing (IJCV 2024) [PyTorch]
      Python
      4000 Updated Aug 18, 2024
    • A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
      Python
      Apache License 2.0
      104000 Updated Aug 12, 2024
    • Python
      4000 Updated Aug 9, 2024
    • APGCC

      Public
      ECCV24 - Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance
      Python
      MIT License
      9000 Updated Jul 26, 2024
    • ABAFnet

      Public
      Attention-Based Acoustic Feature Fusion Network for Depression Detection
      Python
      GNU General Public License v3.0
      1000 Updated Jul 22, 2024
    • mgc

      Public
      The official implementation of the paper "Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning"
      Python
      Other
      3000 Updated Jul 17, 2024
    • RWKV-CLIP

      Public
      The official code of "RWKV-CLIP: A Robust Vision-Language Representation Learner"
      Python
      MIT License
      8000 Updated Jul 12, 2024
    • VisualRWKV is the visual-enhanced version of the RWKV language model, enabling RWKV to handle various visual tasks.
      Python
      Apache License 2.0
      14000 Updated Jul 7, 2024
    • Jupyter Notebook
      MIT License
      3000 Updated Jul 4, 2024
    • Training and Tuning Strategies for Foundation Models in Medical Imaging
      Jupyter Notebook
      6000 Updated Jun 28, 2024
    • StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.
      Python
      MIT License
      71000 Updated Jun 27, 2024
    • Python
      1000 Updated Jun 26, 2024
    • BasicPBC

      Public
      Official Implementation of "Learning Inclusion Matching for Animation Paint Bucket Colorization"
      Python
      Other
      23000 Updated Jun 25, 2024
    • Shadow_R

      Public
      This is the official PyTorch implementation of ShadowRefiner. Our method won the Perceptual Track and achieved the second-best performance in the Fidelity Track of the NTIRE 2024 Shadow Removal Challenge (CVPR 2024 Workshop).
      Python
      MIT License
      3000 Updated Jun 19, 2024
    • E2STR

      Public
      The official code for the CVPR 2024 paper: Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer
      Python
      Apache License 2.0
      4000 Updated Jun 14, 2024
    • MPCount

      Public
      Official repo for CVPR2024 paper "Single Domain Generalization for Crowd Counting"
      Python
      Apache License 2.0
      5000 Updated Jun 13, 2024
    • PIIP

      Public
      Parameter-Inverted Image Pyramid Networks (PIIP)
      Python
      MIT License
      2000 Updated Jun 11, 2024
    • Python
      Apache License 2.0
      3000 Updated May 27, 2024