Description
Describe the bug
device.queue.read_buffer times out
To Reproduce
git clone https://github.com/tinygrad/tinygrad.git
cd tinygrad
python3 -m pip install -e .
PYTHONPATH=. WEBGPU=1 python3 examples/stable_diffusion.py --seed 0 --steps 1
Output:
File "/Users/ahmedharmouche/Documents/tinygrad/tinygrad/runtime/ops_webgpu.py", line 56, in _copyout
buffer_data = self.dev.queue.read_buffer(src, 0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/ahmedharmouche/Documents/tinygrad/venv/lib/python3.12/site-packages/wgpu/backends/wgpu_native/_api.py", line 3427, in read_buffer
tmp_buffer._map("READ_NOSYNC").sync_wait()
File "/Users/ahmedharmouche/Documents/tinygrad/venv/lib/python3.12/site-packages/wgpu/backends/wgpu_native/_helpers.py", line 268, in sync_wait
return self.finish()
^^^^^^^^^^^^^
File "/Users/ahmedharmouche/Documents/tinygrad/venv/lib/python3.12/site-packages/wgpu/backends/wgpu_native/_helpers.py", line 260, in finish
raise RuntimeError(f"Waiting for {self.title} timed out.")
RuntimeError: Waiting for buffer.map timed out.
zsh: segmentation fault PYTHONPATH=. WEBGPU=1 python3 examples/stable_diffusion.py --seed 0 --steps 1
Observed behavior
When running tinygrad's examples/stable_diffusion.py, the buffer read times out while fetching the output of the decode step. The read itself is not what takes long; the time goes into running the compute that the read has to wait for. Manually increasing the 5.0s timeout in wgpu-py solves it. In 0.18.1 this just worked (the timeout wasn't there?). Stable diffusion now works on faster machines, but on my local computer it times out unless I raise the timeout by hand, so for now we have downgraded to 0.18.1.
Can this timeout be increased/disabled?
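To put a number on how long the blocking read waits, a small timing wrapper may help. This is only a sketch: `timed` and `fake_readback` are hypothetical names introduced here, and `fake_readback` is a stand-in that sleeps instead of touching the GPU, so the snippet runs without wgpu installed.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def fake_readback(delay_s: float) -> bytes:
    # Hypothetical stand-in for a blocking GPU read: it just sleeps,
    # then returns four zero bytes.
    time.sleep(delay_s)
    return b"\x00" * 4


if __name__ == "__main__":
    data, elapsed = timed(fake_readback, 0.05)
    print(f"read {len(data)} bytes in {elapsed:.3f}s")
```

With the timeout raised locally, the same wrapper could be placed around the call from the traceback, for example `timed(self.dev.queue.read_buffer, src, 0)`. The measured time would cover the pending compute plus the copy, which shows how far past 5.0s the wait actually goes on a slower machine.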
Your environment
OS: macOS Sonoma 14.4.1
Python version: 3.12
wgpu-py version: >=0.19.0
wgpu backend: Metal