
Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 10,266 1,325 Updated Feb 1, 2025

Universal Actions for Enhanced Embodied Foundation Models

Python 72 3 Updated Jan 22, 2025

Official code release for ConceptGraphs

Python 510 81 Updated Jan 15, 2025

[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI

1,097 76 Updated Jan 20, 2025

PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators

Python 64 1 Updated Nov 21, 2024

A curated list of papers for generalist agents

117 Updated Jan 23, 2025

Reading List of Memory Augmented Multimodal Research, including multimodal context modeling, memory in vision and robotics, and external memory/knowledge augmented MLLM.

11 Updated Sep 5, 2024
Python 45 1 Updated Jan 14, 2025

LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

Python 365 16 Updated Jan 13, 2025

Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…

Python 7,496 474 Updated Feb 12, 2025

A generative world for general-purpose robotics & embodied AI learning.

Python 23,884 2,047 Updated Feb 18, 2025

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,586 431 Updated Jan 12, 2025

Official Implementation of the paper: "Verbalized Representation Learning for Interpretable Few-Shot Generalization"

Python 6 Updated Dec 6, 2024

A paper list of recent work on token compression for ViT and VLM

321 16 Updated Feb 9, 2025

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

Python 129 4 Updated Dec 17, 2024

A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World

Python 214 8 Updated Nov 29, 2024

Awesome-LLM-3D: a curated list of resources on multimodal large language models in the 3D world

1,444 87 Updated Feb 14, 2025

UCLA Thesis LaTeX style

TeX 135 85 Updated Jun 15, 2020

LaTeX template files for dissertations and theses formatted according to UCLA graduate division's requirements

TeX 9 4 Updated Jul 11, 2022

[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion

Python 2,028 382 Updated Dec 24, 2024

A modular high-level library to train embodied AI agents across a variety of tasks and environments.

Python 2,139 521 Updated Feb 14, 2025

[ICLR 2025] Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models

Python 29 2 Updated Jan 22, 2025

Embodied Agent Interface (EAI): Benchmarking LLMs for Embodied Decision Making (NeurIPS D&B 2024 Oral)

Python 170 8 Updated Jan 14, 2025

[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

Python 1,212 51 Updated Feb 10, 2025

SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models

Python 202 13 Updated Sep 16, 2024

Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"

Python 542 25 Updated Aug 16, 2024

Official implementation for FlexAttention for Efficient High-Resolution Vision-Language Models

Python 36 5 Updated Jan 8, 2025

A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.

Python 498 42 Updated Feb 7, 2025