@ucbepic

EPIC Data Lab

Effective Programming Interaction and Computation with Data

Popular repositories

  1. docetl Public

    A system for agentic LLM-powered data processing and ETL

    Python 2.9k 307

  2. TWIX Public

    TWIX is an open-source data extraction tool that reconstructs structured data from documents at scale, accurately and at low cost, by inferring the shared underlying visual template across documents

    Python 206 14

  3. BARGAIN Public

    Low-Cost LLM-Powered Data Processing with Theoretical Guarantees

    Python 29 3

  4. data-agent-benchmark-study Public

    Welcoming contributions from practitioners building AI/data systems - share your real-world problems, document where current tools fail, and help improve the benchmark taxonomy across the enterpris…

    Python 12 2

  5. pdf_parser Public

    Parse PDFs using computer vision, layout analysis, and other state-of-the-art document intelligence techniques. WebApp implemented in Flask/Jinja2 with infer and train pipelines managed by FlorDB

    JavaScript 9 1

  6. docetl-examples Public

    Examples of docetl pipelines

    Python 2

Repositories

Showing 7 of 7 repositories
