layout: default
title: Home

Contents

  • TOC {:toc}

About

Running machine learning (ML) on embedded edge devices, rather than in the cloud, is attracting growing attention for reasons that include privacy, latency, security, and accessibility. Because energy efficiency is critical on these embedded platforms, custom processor support and hardware accelerators are a natural fit. However, ML acceleration on microcontroller-class hardware is a new area, and tiny machine learning (tinyML) needs agile hardware customization. Building ASICs is both costly and time-consuming. An FPGA platform offers an alternative: customize the processor so that it performs the application's computation efficiently, and add a small amount of custom hardware that exploits the bit-level flexibility of FPGAs.

To this end, we present CFU Playground, a full-stack open-source framework that enables rapid, iterative design of tightly coupled accelerators for tinyML systems. Our toolchain integrates open-source software, RTL generators, and FPGA tools for synthesis, place, and route. This full-stack development framework lets engineers explore bespoke architectures that are customized and co-optimized for tinyML. The rapid deploy-profile-optimize feedback loop lets ML hardware and software developers achieve significant returns from a relatively small investment in customization for repetitive ML computations. CFU Playground is available as an open-source project at https://github.com/google/CFU-Playground.
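To make "tightly coupled accelerator" more concrete, here is a minimal software sketch of the kind of operation such a unit might implement: a multiply-accumulate over four int8 lanes packed into 32-bit words. It assumes a custom instruction that takes two 32-bit operands and returns one 32-bit result. The function names, the lane layout, and the explicit accumulator argument are all hypothetical and are not taken from the CFU Playground API.

```python
# Software model of a hypothetical custom-instruction operation:
# a 4-lane int8 multiply-accumulate. Names and layout are illustrative
# assumptions, not part of the CFU Playground API.

MASK32 = 0xFFFFFFFF


def unpack_int8x4(word: int) -> list[int]:
    """Split a 32-bit word into four signed 8-bit lanes, lowest byte first."""
    lanes = []
    for shift in (0, 8, 16, 24):
        byte = (word >> shift) & 0xFF
        lanes.append(byte - 256 if byte >= 128 else byte)
    return lanes


def pack_int8x4(lanes: list[int]) -> int:
    """Pack four signed 8-bit values into one 32-bit word, lowest byte first."""
    word = 0
    for i, value in enumerate(lanes):
        word |= (value & 0xFF) << (8 * i)
    return word


def cfu_simd_mac(acc: int, in0: int, in1: int) -> int:
    """Add the dot product of the int8 lanes of in0 and in1 to acc (32-bit wrap)."""
    dot = sum(a * b for a, b in zip(unpack_int8x4(in0), unpack_int8x4(in1)))
    return (acc + dot) & MASK32


def scalar_mac(acc: int, xs: list[int], ws: list[int]) -> int:
    """Reference: the same accumulation performed one element at a time."""
    for x, w in zip(xs, ws):
        acc = (acc + x * w) & MASK32
    return acc


activations = [1, -2, 3, -4]
weights = [5, 6, -7, 8]
packed = cfu_simd_mac(0, pack_int8x4(activations), pack_int8x4(weights))
reference = scalar_mac(0, activations, weights)
```

The packed version does in one call what the scalar loop does in four iterations, which is the sort of repetitive inner-loop computation that a small amount of custom hardware can absorb.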

What is the goal of the workshop?

  • What are some of the challenges and opportunities for designing tinyML hardware?
  • How can we design and develop model-specific accelerators quickly on FPGAs?
  • Get hands-on experience building an ML accelerator with CFU Playground!

Who is the audience for this workshop?

New ML accelerators are announced and released every month for a variety of applications. However, the cost and complexity of designing an accelerator, integrating it into a larger System-on-Chip, and developing its software stack make rapid iteration difficult. Attendees will be able to deploy their own accelerated ML solutions within minutes, which lets them explore the breadth of opportunity in hardware acceleration. Together with the relevance and excitement surrounding ML today, this should make the workshop welcoming to people with many different backgrounds and interests, including ML, FPGAs, embedded systems, computer architecture, hardware design, and software development.

Scope and Topics

  • Custom Hardware Acceleration on FPGAs
  • Tiny Machine Learning (TinyML)
  • Open-Source Tools/Frameworks for HW & SW Development (Full-Stack)

Requirements

Prerequisites

  • Familiarity with ML “cycle” (inputs, preprocessing, training, inference, etc.)
  • Knowledge of computer organization (datapath, registers, opcodes, etc.)
  • Basic experience with HDL & synthesis concepts for FPGAs
    • Prior experience with Vivado and Linux is a big plus
    • You must arrive at the workshop with Vivado already installed
  • Working knowledge of C and Python

Hardware

  • Renode will be used to emulate the Arty A7-35T board

Software

  • All software (RISC-V toolchain, SymbiFlow, etc.) is installed via the environment pre-packaged with CFU Playground.

Schedule

Time Material/Activity
1:00 PM Welcome & Tiny Machine Learning (TinyML)
  • General overview of tinyML as a field
  • What are the common use cases
  • What kind of models do we run
  • What are the typical resource constraints, challenges, etc.
1:30 PM TensorFlow Lite for Microcontrollers (TFLM)
  • Challenges for running tinyML models
  • TF vs. TFLite vs. TFLite Micro: a deep dive
  • Profiling and benchmarking tinyML systems
2:00 PM Benchmarking of TinyML Systems
  • General overview of CFU
  • Make sure Vivado hardware manager can find board
  • Install RISC-V toolchain
  • Pass golden tests
2:30 PM Custom Function Units
  • General overview of CFU
  • Make sure Vivado hardware manager can find board
  • Install RISC-V toolchain
  • Pass golden tests
3:00 PM Introduction to Amaranth
  • What is LiteX
  • Explain a basic LiteX SoC with an example
  • Walk through a simple end-to-end example from the README
3:30 PM Renode/Antmicro
  • TBD
4:00 PM Accelerate your own TinyML Model
  • Pick a task and train a model to deploy with TFLM
  • Get it running on the board
  • Build your own CFU
  • Measure the performance speedup
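
For the last step, a speedup is usually reported as the ratio of baseline cycles to accelerated cycles. The sketch below is one way to compute it, along with an Amdahl's-law estimate of the end-to-end gain when only part of the inference time is accelerated. The function names are hypothetical, and the numbers in the usage lines are placeholders rather than measurements.

```python
def speedup(baseline_cycles: int, accelerated_cycles: int) -> float:
    """Ratio of baseline to accelerated cycle counts for the same workload."""
    if baseline_cycles <= 0 or accelerated_cycles <= 0:
        raise ValueError("cycle counts must be positive")
    return baseline_cycles / accelerated_cycles


def overall_speedup(fraction_accelerated: float, kernel_speedup: float) -> float:
    """Amdahl's law: end-to-end speedup when only a fraction of time is sped up."""
    if not 0.0 <= fraction_accelerated <= 1.0:
        raise ValueError("fraction_accelerated must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - fraction_accelerated
    return 1.0 / (remaining + fraction_accelerated / kernel_speedup)


# Placeholder numbers for illustration only, not measured results.
kernel_gain = speedup(baseline_cycles=1_000_000, accelerated_cycles=250_000)
end_to_end_gain = overall_speedup(fraction_accelerated=0.8, kernel_speedup=kernel_gain)
```

Comparing the two values shows why profiling comes first: the end-to-end gain is bounded by how much of the total time the accelerated kernel accounted for in the baseline.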