- You do not optimize.
- You do not optimize without measuring first.
- When the performance is not bound by the code, but by external factors, the optimization is over.
- Only optimize code that already has full unit test coverage.
- One factor at a time.
- No unresolved bugs, no schedule pressure.
- Testing will go on as long as it has to.
- If this is your first night at Optimization Club, you have to write a test case.
See more here
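
The second rule, measuring before optimizing, can be sketched with the standard `testing` package. This is only an illustration: `sumSquares` and `sink` are hypothetical names standing in for whatever code is under suspicion, and `testing.Benchmark` is used so the example runs as a plain program without a `_test.go` file.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the result alive so the compiler cannot discard the call.
var sink int

// sumSquares is a hypothetical stand-in for the code you suspect is slow.
func sumSquares(n int) int {
	total := 0
	for i := 1; i <= n; i++ {
		total += i * i
	}
	return total
}

func main() {
	// testing.Benchmark chooses b.N so the loop runs long enough to time reliably.
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = sumSquares(1000)
		}
	})
	// Prints iterations and ns/op, followed by B/op and allocs/op.
	fmt.Println("sumSquares:", res.String(), res.MemString())
}
```

In a real project the same loop would usually live in a `BenchmarkXxx(b *testing.B)` function inside a `_test.go` file and be run with `go test -bench=.`.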
- Don't do it: Can we avoid doing the calculation at all? For example, do we need to parse the input, or can we just pass it as-is?
- Do it, but don't do it again: Can we use memoization/caching? Parse objects once at the "edges" and use the parsed objects internally.
- Do it less: Do we need to run this every millisecond? Would every second work? Can we use only a subset of the data?
- Do it later: Can we make this API call async?
- Do it when they're not looking: Can we run the calculation in the background while doing another task?
- Do it concurrently: Will concurrency help here? Consider Amdahl's law.
- Do it cheaper: Can we use a map here instead of a slice? Research available algorithms and data structures and know their complexity. Test them on your data.
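
The "do it, but don't do it again" guideline can be sketched as a small cache wrapped around an expensive function. Everything named here (`Memo`, `NewMemo`, `slowLen`) is hypothetical and exists only for illustration; a real cache would also need an eviction policy.

```go
package main

import (
	"fmt"
	"sync"
)

// Memo caches the results of an expensive function keyed by its string input.
type Memo struct {
	mu    sync.Mutex
	cache map[string]int
	fn    func(string) int
	calls int // how many times fn actually ran
}

// NewMemo wraps fn with an empty cache.
func NewMemo(fn func(string) int) *Memo {
	return &Memo{cache: make(map[string]int), fn: fn}
}

// Get returns the cached result for key, computing it only on the first request.
func (m *Memo) Get(key string) int {
	m.mu.Lock()
	defer m.mu.Unlock()
	if v, ok := m.cache[key]; ok {
		return v
	}
	m.calls++
	v := m.fn(key)
	m.cache[key] = v
	return v
}

// slowLen is a hypothetical stand-in for an expensive computation.
func slowLen(s string) int { return len(s) }

func main() {
	m := NewMemo(slowLen)
	a := m.Get("gopher")
	b := m.Get("gopher") // served from the cache
	fmt.Println(a, b, m.calls) // prints "6 6 1"
}
```

Holding the lock while `fn` runs keeps the sketch short but serializes all callers, so measure before adopting it on a hot path.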
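
For the concurrency guideline, Amdahl's law bounds the speedup at `1 / ((1 - p) + p/n)`, where `p` is the fraction of the work that can run in parallel and `n` is the number of workers. The helper below is a minimal sketch of that formula; the name `speedup` and the 90% figure are assumptions chosen for illustration.

```go
package main

import "fmt"

// speedup returns the Amdahl's-law upper bound on speedup when a fraction p
// of the work (0 <= p <= 1) is spread across n workers.
func speedup(p float64, n int) float64 {
	return 1 / ((1 - p) + p/float64(n))
}

func main() {
	// With 90% parallel work the bound can never exceed 1/(1-0.9) = 10,
	// no matter how many workers are added.
	for _, n := range []int{1, 2, 4, 8, 16, 1024} {
		fmt.Printf("workers=%4d  max speedup=%.2f\n", n, speedup(0.9, n))
	}
}
```

The serial fraction dominates quickly, so it is worth estimating `p` before adding goroutines.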
- Memory Allocation: Avoid allocations where possible (see the design of io.Reader). Pre-allocate if you already know the size. Be careful of slices that keep large amounts of memory alive (`s := make([]int, 1000000)[:3]`).
- `defer` might slow you down: However, consider its advantages.
- Strings are immutable: Use bytes.Buffer or strings.Builder instead of repeated concatenation.
- Know when a goroutine is going to stop: Avoid goroutine leaks. Use context for cancellation/timeouts.
- `cgo` calls are expensive: Group them together in one `cgo` call.
- Channels can be slower than `sync.Mutex`: However, they are much easier to work with.
- Interface calls are more expensive than struct calls: You can extract the value from the interface first. However, this makes the code less generic.
- Use `go run -gcflags=-m`: You'll see what escapes to the heap.
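
The memory allocation tip about slices that keep large amounts of memory alive can be illustrated as follows. `firstN` and `firstNCopy` are hypothetical helpers: the first returns a view that shares the large backing array, the second copies only what is needed.

```go
package main

import "fmt"

// firstN returns a view of the first n elements. The view shares, and keeps
// alive, the entire backing array of src.
func firstN(src []int, n int) []int {
	return src[:n]
}

// firstNCopy returns an independent copy of the first n elements, so the large
// backing array can be garbage collected once src is no longer referenced.
func firstNCopy(src []int, n int) []int {
	out := make([]int, n) // pre-allocate: the size is known up front
	copy(out, src[:n])
	return out
}

func main() {
	big := make([]int, 1000000)
	view := firstN(big, 3)
	small := firstNCopy(big, 3)
	fmt.Println(len(view), cap(view))   // prints "3 1000000"
	fmt.Println(len(small), cap(small)) // prints "3 3"
}
```

The copy costs one small allocation, which is usually a good trade when the short slice outlives the large one.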
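
Because strings are immutable, every `+=` builds a new string. The sketch below contrasts that with `strings.Builder`; `joinPlus` and `joinBuilder` are hypothetical names, and whether the difference matters for your inputs is something to benchmark.

```go
package main

import (
	"fmt"
	"strings"
)

// joinPlus concatenates with +=. Each iteration allocates a new string and
// copies everything accumulated so far.
func joinPlus(parts []string) string {
	s := ""
	for _, p := range parts {
		s += p
	}
	return s
}

// joinBuilder appends into a single buffer that is sized once up front.
func joinBuilder(parts []string) string {
	total := 0
	for _, p := range parts {
		total += len(p)
	}
	var sb strings.Builder
	sb.Grow(total) // pre-allocate, since the final size is known
	for _, p := range parts {
		sb.WriteString(p)
	}
	return sb.String()
}

func main() {
	parts := []string{"go", "pher", "s"}
	fmt.Println(joinPlus(parts) == joinBuilder(parts)) // prints "true"
}
```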
- Know thy Hardware: CPU affinity, CPU cache, memory, latency numbers, and so on. For example, see cache-oblivious algorithms.
- Algorithms & Data Structures Rule: They will usually give you much better performance than any other trick.
- Include performance in your process: Design & code reviews, run & compare benchmarks on CI, and so on.