Skip to content
/ expo Public
forked from miku/expo

Exercises in performance optimizations (GOLAB 2024)

License

Notifications You must be signed in to change notification settings

Noxiderp/expo

 
 

Repository files navigation

Exercices in performance optimizations (expo)

Workshop at GOLAB 2024, 2024-11-11, 14:30, Martin Czygan, LI

Slides

Abstract

The 1 Billion Row Challenge is a simple, data-intensive task, that nonetheless allows to explore many optimization ideas and techniques in Go.

In this workshop, we start with a baseline implementation and interactively improve on the solution, learning about benchmarking, different performance characteristics of standard library types, concurrency patterns, fast data structures, useful operating system facilities and more.

Overview

  • Benchmarking
    • writing a benchmark
    • running a benchmark
  • Profiling
    • cpu profiling
    • generating a flame graph
  • 1BRC problem outline
    • problem description
  • A baseline implementation

Variations:

  • Caring about allocations
    • ReadString
    • Scanner
    • Scanner buffer size
  • Faster string parsing
    • splitting a string
    • parsing a float
    • parsing a float with SWAR
  • Parallel processing
    • worker pattern
    • splitting the file
  • Using memory-mapped files
    • simplifying the api
  • Using a custom hash table
    • custom hash table

Benchmarking mechanics

  • benchmark small snippets, separately
  • basic benchmark with time

Areas of optimization

  • buffered I/O
  • allocations, e.g. ReadString vs Scanner
  • better buffer sizes, e.g. as passed to Read(...)
  • parsing a string
  • parsing a float as int
  • using a memory-mapped file
  • parallel processing
  • optimal number of batch size and number of workers
  • custom hash function

About

Exercises in performance optimizations (GOLAB 2024)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Go 99.4%
  • Other 0.6%