Institute of Computing Technology, CAS
- Beijing
Starred repositories
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
Code for EMNLP 2024 paper "Learn Beyond The Answer: Training Language Models with Reflection for Mathematical Reasoning"
800,000 step-level correctness labels on LLM solutions to MATH problems
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.
A paper list of recent work on token compression for ViT and VLM
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath
🔥🔥 LLaVA++: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
Solving algorithm problems is all about patterns, and labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.
Resources of deep learning for mathematical reasoning (DL4MATH).
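A collection of common hand-written code for reviewing ahead of algorithm-role interviews at internet companies: sorting algorithms, shortest-path algorithms, binary tree traversal, SQL statements, NMS, IoU, multi-head attention (MHA), and more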
2024 Alibaba Global Mathematics Competition AI Track Global 2nd Place Project (Agent Universe)
[ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?
Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
OpenGPT 4o is a free alternative to OpenAI GPT 4o
[NeurIPS 2023] DDCoT: Duty-Distinct Chain-of-Thought Prompting for Multimodal Reasoning in Language Models
[CVPR 2024] Official Code for the Paper "Compositional Chain-of-Thought Prompting for Large Multimodal Models"
Official implementation for "Multimodal Chain-of-Thought Reasoning in Language Models" (stay tuned and more will be updated)
Code for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models
A bug-free and improved implementation of LLaVA-UHD, based on the code from the official repo
A modular graph-based Retrieval-Augmented Generation (RAG) system