Poor multithreaded performance

I have an application that reads capnproto structures from the network and feeds them to ~600 goroutines, which analyze them (read-only) in parallel. This caused massive lock contention: according to `perf`, ReadLimiter's spinlock alone was responsible for 40% of cumulative CPU usage. Please see [flamegraph.html.gz](https://github.com/capnproto/go-capnproto2/files/10076439/flamegraph.html.gz) (compressed to keep GitHub from stripping the embedded JS).

Once we adapted https://github.com/capnproto/go-capnproto2/pull/131/commits/1b552dde05a0737eafff2ad8f4021ef8a74e1dbc, the application's CPU usage dropped by more than 10x, from 60 cores to <6 cores.

I think this library might benefit from replacing spinlocks with sleeping locks. Spinlocks are generally not the best choice in userspace, where you can't guarantee that you keep the CPU while holding the lock. If a goroutine acquires the lock and is then descheduled, the other application threads pointlessly burn CPU cycles until both the kernel and the Go scheduler agree to give the CPU back to the goroutine holding the lock. Another disadvantage is that CPU usage correlates poorly with the amount of actual work done and can go up unbounded.

Poor multithreaded performance #343
