Poor multithreaded performance #343

Open
@pikrzysztof

I have an application that reads capnproto structures from the network and feeds them to ~600 goroutines, which analyze them (read-only) in parallel. This caused massive lock contention: according to perf, ReadLimiter's spinlock alone was responsible for 40% of cumulative CPU usage. Please see flamegraph.html.gz (compressed to keep GitHub from stripping the embedded JS).

Once we adopted 1b552dd, the application's CPU usage dropped more than 10x, from 60 cores to fewer than 6 cores.

I think this library might benefit from replacing spinlocks with sleeping locks. Spinlocks are generally not the best choice in userspace, where you can't guarantee keeping the CPU while holding the lock. If your goroutine acquires the lock and is then descheduled, other application threads will burn CPU cycles needlessly until both the kernel and the Go scheduler agree to give the CPU back to the goroutine holding the lock. Another disadvantage is that CPU usage correlates poorly with the amount of actual work done and can grow without bound.
