Stars
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Universal Actions for Enhanced Embodied Foundation Models
Official code release for ConceptGraphs
[Embodied-AI-Survey-2024] Paper list and projects for Embodied AI
PoliFormer: Scaling On-Policy RL with Transformers Results in Masterful Navigators
A curated list of papers for generalist agents
Reading list for memory-augmented multimodal research, including multimodal context modeling, memory in vision and robotics, and external memory/knowledge-augmented MLLMs.
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
Cosmos is a world model development platform consisting of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
A generative world for general-purpose robotics & embodied AI learning.
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
Official Implementation of the paper: "Verbalized Representation Learning for Interpretable Few-Shot Generalization"
A paper list of recent works on token compression for ViT and VLM
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
A Simple yet Effective Pathway to Empowering LLaVA to Understand and Interact with 3D World
Awesome-LLM-3D: a curated list of resources on Multi-modal Large Language Models in the 3D world
LaTeX template files for dissertations and theses formatted according to UCLA graduate division's requirements
[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
A modular high-level library to train embodied AI agents across a variety of tasks and environments.
[ICLR 2025] Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models
Embodied Agent Interface (EAI): Benchmarking LLMs for Embodied Decision Making (NeurIPS D&B 2024 Oral)
[ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
Official implementation for FlexAttention for Efficient High-Resolution Vision-Language Models
A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.