CUDA: deduplicate FlashAttention code #27

Triggered via pull request May 17, 2024 22:53
@Nexesenex opened #127
Status Failure
Total duration 1d 8h 27m 49s

bench.yml

on: pull_request_target
Matrix: bench-server-baseline

Annotations

3 errors
bench-server-baseline (phi-2, f16)
This request was automatically failed because there were no enabled runners online to process the request for more than 1 days.
bench-server-baseline (phi-2, q8_0)
This request was automatically failed because there were no enabled runners online to process the request for more than 1 days.
bench-server-baseline (phi-2, q4_0)
This request was automatically failed because there were no enabled runners online to process the request for more than 1 days.