added core_array code and activation function code #185

Open: wants to merge 121 commits into base: master

zymeng3001
No description provided.

Element-wise floating-point addition of two input data sets, implemented as a pipeline.
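The pipelined element-wise addition described above can be illustrated with a small behavioral model. This is only a sketch in Python, not the RTL from this PR: the class name `PipelinedAdder`, the three-cycle latency, and the single-precision format are all assumptions made for illustration.

```python
import struct
from collections import deque


def to_f32(x: float) -> float:
    """Round a Python float to IEEE-754 single precision (assumed format)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


class PipelinedAdder:
    """Hypothetical model: one operand pair enters per clock, and the sum
    leaves `latency` clocks later."""

    def __init__(self, latency: int = 3):  # latency is an assumption
        self.stages = deque([None] * latency, maxlen=latency)

    def clock(self, a=None, b=None):
        """Advance one clock. Passing no operands inserts a bubble.
        Returns the value leaving the last stage (None for a bubble)."""
        out = self.stages[-1]
        if a is None or b is None:
            new = None
        else:
            new = to_f32(to_f32(a) + to_f32(b))
        # With maxlen set, appendleft drops the oldest entry on the right.
        self.stages.appendleft(new)
        return out


def add_vectors(xs, ys, latency: int = 3):
    """Stream two equal-length data sets through the pipeline and drain it."""
    pipe = PipelinedAdder(latency)
    results = []
    for a, b in zip(xs, ys):
        out = pipe.clock(a, b)
        if out is not None:
            results.append(out)
    for _ in range(latency):  # flush the remaining in-flight sums
        out = pipe.clock()
        if out is not None:
            results.append(out)
    return results
```

Once the pipeline is full, the model produces one sum per clock, which is the usual reason for pipelining a floating-point adder.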
This module defines a parameterized global SRAM with configurable data width and depth, handling read and write operations using clock, address, and control signals. It instantiates a single-port memory (mem_sp) to perform the actual memory operations.
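The wrapper structure described above, a parameterized global SRAM that delegates to a single-port memory, might look roughly like the following behavioral model. Only `mem_sp` is named in the description; the Python class names, the port names (`wen`, `ren`, `wdata`, `rdata`), the registered read, and the default width and depth are assumptions.

```python
class MemSP:
    """Hypothetical stand-in for mem_sp: one read or one write per clock."""

    def __init__(self, data_width: int, depth: int):
        self.mask = (1 << data_width) - 1
        self.cells = [0] * depth
        self.rdata = 0  # registered read output (assumption)

    def clock(self, addr: int, wen: bool, wdata: int = 0, ren: bool = False) -> int:
        if wen:
            # Writes are truncated to the configured data width.
            self.cells[addr] = wdata & self.mask
        elif ren:
            self.rdata = self.cells[addr]
        return self.rdata


class GlobalSram:
    """Parameterized wrapper that forwards address and control to MemSP."""

    def __init__(self, data_width: int = 64, depth: int = 1024):  # assumed defaults
        self.depth = depth
        self.mem = MemSP(data_width, depth)

    def clock(self, addr: int, wen: bool = False, wdata: int = 0, ren: bool = False) -> int:
        if not 0 <= addr < self.depth:
            raise IndexError(f"address {addr} outside depth {self.depth}")
        return self.mem.clock(addr, wen, wdata, ren)
```

Because the memory is single-port, the model gives the write priority when both enables are asserted in the same clock; the actual module may resolve that case differently.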
This module implements a clock divider that progressively divides an input clock signal by powers of two, from 2 up to 2048, with each divided clock driving the next stage.
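A chain in which each divided clock drives the next stage can be modeled as a row of toggle flip-flops. The sketch below assumes rising-edge-triggered stages that all reset to 0; the names `ClockDivider` and `count_rising_edges` are hypothetical, and the reset value and edge polarity are not stated in the description.

```python
class ClockDivider:
    """Hypothetical ripple divider: stage i divides the input clock by 2**(i+1)."""

    STAGES = 11  # divide-by-2 through divide-by-2048

    def __init__(self):
        self.clk_in = 0
        self.q = [0] * self.STAGES  # assumed reset value

    def set_clk(self, level: int):
        """Drive the input clock to `level` and ripple any rising edge."""
        rising = level == 1 and self.clk_in == 0
        self.clk_in = level
        for i in range(self.STAGES):
            if not rising:
                break
            self.q[i] ^= 1             # this stage toggles on its clock's rising edge
            rising = self.q[i] == 1    # its own 0 -> 1 transition clocks the next stage
        return list(self.q)


def count_rising_edges(cycles: int):
    """Run `cycles` full input clock periods and count rising edges per output."""
    div = ClockDivider()
    counts = [0] * ClockDivider.STAGES
    prev = list(div.q)
    for _ in range(cycles):
        for level in (1, 0):
            now = div.set_clk(level)
            for i in range(ClockDivider.STAGES):
                if prev[i] == 0 and now[i] == 1:
                    counts[i] += 1
            prev = now
    return counts
```

Over 2048 input periods, the first output rises 1024 times and the last output rises once, matching division ratios of 2 through 2048.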
This PLL module outputs a clock signal, clk_out, based on a 4-bit input configuration clk_cfg. The output clock is set to 1 if any of the configuration bits are high; otherwise, it is set to 0.
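As described, the output is simply the logical OR of the four configuration bits. A one-function model makes that explicit; the function name `pll_clk_out` and the range check are assumptions for illustration, not part of the module.

```python
def pll_clk_out(clk_cfg: int) -> int:
    """Return 1 if any bit of the 4-bit clk_cfg is set, else 0."""
    if not 0 <= clk_cfg <= 0xF:
        raise ValueError("clk_cfg must fit in 4 bits")
    return 1 if clk_cfg != 0 else 0
```

Note that this behavior produces a constant level rather than a toggling clock, so the description reads like a placeholder model of a PLL rather than a frequency synthesizer.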
This Verilog module defines a controller that manages memory access, vector engine operations, and dataflow within the system, including weight loading, attention, and the compute logic for Q, K, and V generation. It uses multiple finite state machines, counters, and control signals to move data between memory, global SRAM, and buffers, coordinating task execution across the stages of nanoGPT.
The current controller improves memory and buffer usage by limiting connections to necessary cores, optimizing Q write-back across different SRAMs, and disabling the ping-pong mechanism for residual paths. Additionally, it introduces more efficient attention operations with refined counters and interleaving logic to enhance data reuse and parallelism across heads during computation.
controller_bak.sv uses fixed values for parameters like CORE_ADDR_CNT and ABUF_CNT, while the earlier controllers calculate these dynamically. Additionally, controller_bak.sv has more detailed FSM logic and counter management, especially for handling QGEN, KGEN, VGEN, and attention operations.
Module for SPI master functionality, handling data transmission as the master device.
SPI_SLAVE_TOP.sv: Top-level module for SPI slave, integrating various slave components.
SPI_Slave.v: Module for the core functionality of an SPI slave.
SPI_Slave_org.v: Original version of the SPI slave module, possibly kept for reference or testing.
SPI_TOP_QIRUI.sv: Top-level module for the SPI system, integrating both master and slave components.
2 participants