Add stim.CompiledDetectorSampler.sample(..., dets_out=None, obs_out=None) #782

Merged (1 commit) on Jun 11, 2024

@Strilanc Strilanc commented Jun 11, 2024

  • Add obs_out and dets_out parameters to the detection event sampling method on sinter's hot path
  • Rewrite the bit-table-to-numpy code so the caller can pass in the buffer to write to
  • Avoid a large allocation on every call to sample
  • Avoid some extra copies that were previously made
  • Also, release the GIL during the actual frame simulation call
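
To illustrate the "pass in the buffer to write to" pattern from the list above, here is a minimal sketch in plain numpy. The function `table_to_numpy` is hypothetical and is not stim's implementation; it only shows the general shape of an optional output-buffer parameter: allocate when the caller gives nothing, otherwise validate the buffer and write into it in place.

```python
import numpy as np


def table_to_numpy(table, out=None):
    """Hypothetical helper (not part of stim): copy a uint8 table into `out`.

    If `out` is None a fresh array is allocated, otherwise the caller's
    buffer is validated and reused, so repeated calls allocate nothing.
    """
    if out is None:
        out = np.empty(table.shape, dtype=np.uint8)
    elif out.shape != table.shape or out.dtype != np.uint8:
        raise ValueError(
            f"out must have shape {table.shape} and dtype uint8, "
            f"got shape {out.shape} and dtype {out.dtype}"
        )
    # Write in place instead of building and returning a new array.
    np.copyto(out, table)
    return out


# Stand-in for an internal result table (illustrative data only).
table = np.arange(12, dtype=np.uint8).reshape(3, 4)

# Reuse one buffer across many calls.
buf = np.empty((3, 4), dtype=np.uint8)
for _ in range(5):
    result = table_to_numpy(table, out=buf)
```

Whether the real method returns the caller's buffer, a view of it, or something else is not stated in this PR, so the return value here is purely an assumption of the sketch.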

Benchmarked by taking 250 instances of 1024 shots from a distance 11 surface code running for 33 rounds:

  • Old version: 3.16 seconds
  • New version (no buffer): 2.36 seconds
  • New version (yes buffer): 2.34 seconds
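
As a quick check on what these timings imply, the relative reductions can be computed directly from the three numbers above. This is just arithmetic on the reported figures, not a new measurement; the variable and function names are made up for the example.

```python
# Wall-clock times reported above, in seconds.
old = 3.16
new_no_buffer = 2.36
new_with_buffer = 2.34


def reduction(before, after):
    """Fraction of the `before` time that was saved."""
    return 1 - after / before


print(f"old -> new (no buffer):    {reduction(old, new_no_buffer):.1%}")
print(f"old -> new (with buffer):  {reduction(old, new_with_buffer):.1%}")
print(f"no buffer -> with buffer:  {reduction(new_no_buffer, new_with_buffer):.1%}")
```

The first two come out at roughly a quarter of the original runtime saved, while the buffer alone accounts for under one percent.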

So... the buffer itself appears to not be hugely significant (2.36 vs 2.34 seconds), but the copy reduction was very useful (3.16 down to 2.36 seconds).

import numpy as np
import stim
import time

circuit = stim.Circuit.generated(
    "surface_code:rotated_memory_x",
    distance=11,
    rounds=33,
    after_clifford_depolarization=1e-3,
    before_measure_flip_probability=1e-3,
    after_reset_flip_probability=1e-3,
    before_round_data_depolarization=1e-3,
)
sampler = circuit.compile_detector_sampler()

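# Preallocated bit-packed output buffers: one row per shot, 8 bits per byte.
det_buf = np.empty((1024, (circuit.num_detectors + 7) // 8), dtype=np.uint8)
obs_buf = np.empty((1024, (circuit.num_observables + 7) // 8), dtype=np.uint8)
t0 = time.monotonic()

# Switch this to False to time the path that allocates its own output.
if True: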
    for _ in range(250):
        sampler.sample(
            shots=1024,
            bit_packed=True,
            dets_out=det_buf,
            obs_out=obs_buf,
        )
else:
    for _ in range(250):
        sampler.sample(
            shots=1024,
            bit_packed=True,
        )
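t1 = time.monotonic()
dt = t1 - t0
# Total wall time for the 250 calls, followed by rough rescalings.
# Note that these divide by 1024, not by the loop count of 250.
print(dt)
print(dt / 1024)
print(dt / 1024 / 1024)
print(dt / 1024 / 1024 / circuit.num_detectors)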
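
The `(n + 7) // 8` buffer widths in the benchmark follow from the bit-packed layout. The sketch below uses only numpy and made-up data (it does not call stim) to show how such a buffer relates to individual bits, assuming the little-endian bit order that stim documents for `bit_packed=True` output, where bit `k` of a shot lives in byte `k // 8` at bit position `k % 8`.

```python
import numpy as np

shots = 4
num_detectors = 20  # illustrative value, not taken from the benchmark circuit

# Made-up detection events standing in for real samples.
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(shots, num_detectors), dtype=np.uint8).astype(np.bool_)

# Pack 8 bits per byte; rows are padded up to a whole number of bytes.
packed = np.packbits(bits, axis=1, bitorder="little")
num_bytes = (num_detectors + 7) // 8

# Recover the boolean table, dropping the padding bits in the last byte.
unpacked = np.unpackbits(
    packed, axis=1, count=num_detectors, bitorder="little"
).astype(np.bool_)

# Read a single bit without unpacking the whole row.
shot, k = 2, 13
single = bool((packed[shot, k // 8] >> (k % 8)) & 1)
```

Under that assumption, the same `np.unpackbits` call could be applied to a buffer like `det_buf` with `count=circuit.num_detectors`.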

@Strilanc Strilanc enabled auto-merge (squash) June 11, 2024 04:09
@Strilanc Strilanc merged commit 320288c into main Jun 11, 2024
53 checks passed
@Strilanc Strilanc deleted the outbuf branch June 11, 2024 04:19