VinInn/MallocProfiler

MallocProfiler

A profiler of "malloc" activities.

It traces the location (stack trace) and size of memory allocations (malloc, etc.) for each thread and reports them at the end of the process. The dump tries to reproduce the flamegraph input format (https://github.com/jlfwong/speedscope/wiki/Importing-from-custom-sources#brendan-greggs-collapsed-stack-format), which is also accepted by speedscope. An API is provided to configure the profiler and to get reports on user request.
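To make the collapsed stack format concrete, here is a minimal Python sketch that parses such lines (frames separated by `;`, followed by a space and a numeric value) and computes an inclusive total per frame, similar in spirit to sorting by total in speedscope. The frame names and values in `sample` are hypothetical and are not real output of the profiler.

```python
from collections import Counter


def parse_collapsed(line):
    """Split one collapsed-stack line into (list of frames, integer value)."""
    # The value is the last space-separated token; frames are joined by ';'.
    stack, _, value = line.rstrip().rpartition(" ")
    return stack.split(";"), int(value)


def inclusive_totals(lines):
    """Sum, for every frame, the values of all stacks that contain it."""
    totals = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        frames, value = parse_collapsed(line)
        # Use a set so that a recursive frame is counted once per stack.
        for frame in set(frames):
            totals[frame] += value
    return totals


# Hypothetical input, for illustration only.
sample = [
    "main;load_config;parse 4096",
    "main;load_config;read_file 1024",
    "main;run;allocate_buffer 65536",
]

totals = inclusive_totals(sample)
for frame, value in totals.most_common(3):
    print(frame, value)
```

The same functions can be applied to a file produced as described in the sections below, for example with `inclusive_totals(open("pyDemo.md"))`.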

Besides providing such a detailed map, the tool also accumulates statistics for both total memory and live memory in the form of counters and histograms.
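The difference between total memory and live memory can be illustrated with a small Python sketch. It is only a conceptual model, not the profiler's implementation: the class `MemStats`, its method names, and the power-of-two histogram binning are all assumptions made for this example.

```python
from collections import Counter


class MemStats:
    """Toy model of allocation counters (hypothetical, for illustration)."""

    def __init__(self):
        self.total = 0          # bytes ever allocated
        self.live = 0           # bytes allocated and not yet freed
        self.max_live = 0       # peak of live bytes
        self.sizes = {}         # address -> size of the live block
        self.hist = Counter()   # histogram bin -> number of allocations

    def on_malloc(self, addr, size):
        self.total += size
        self.live += size
        self.max_live = max(self.max_live, self.live)
        self.sizes[addr] = size
        # Assumed binning: bin k holds sizes in [2**(k-1), 2**k).
        self.hist[size.bit_length()] += 1

    def on_free(self, addr):
        # Unknown addresses are ignored (size 0).
        self.live -= self.sizes.pop(addr, 0)


stats = MemStats()
stats.on_malloc(0x1, 100)
stats.on_malloc(0x2, 1000)
stats.on_free(0x1)
stats.on_malloc(0x3, 50)
print(stats.total, stats.live, stats.max_live)
```

Total memory only grows, while live memory goes down on every free, so a large total with a small live value points to many short-lived allocations.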

Prerequisite

GCC 12 or newer, in a version not older than Nov 15, 2023, configured with --enable-libstdcxx-backtrace=yes. This tool has been tested with GCC 14.

Quick Start

clone this repository and cd into it.

source compile

export LD_PRELOAD=./mallocProfiler.so

invoke the application to profile and filter its output using grep _mpTrace, selecting the field of your choice (see below)

export LD_PRELOAD=""

drop the resulting file in speedscope (or generate a flamegraph svg)

Example

go to the demos directory and run the trivial Python example (taken from a NumPy tutorial)

export LD_PRELOAD=../mallocProfiler.so
python3 demo.py | grep _mpTrace | cut -f1,3 -d'$' | tr '$' ' ' >& pyDemo.md

drop the resulting file (pyDemo.md) in https://www.speedscope.app

select the Sandwich view, sort by total, and click on file_rules: the output should look like the screenshot below, which shows the typical huge call stacks of Python

[image: speedscope Sandwich view of pyDemo.md]
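The shell filter used in this example (`grep _mpTrace | cut -f1,3 -d'$' | tr '$' ' '`) keeps the lines containing the marker, selects the first and third `$`-separated fields, and joins them with a space. A rough Python equivalent is sketched below; the function name `filter_trace` and the sample line are hypothetical, and the meaning of each field is not assumed here.

```python
def filter_trace(lines, marker="_mpTrace", fields=(1, 3), sep="$"):
    """Mimic: grep marker | cut -f<fields> -d<sep> | tr <sep> ' '."""
    selected = []
    for line in lines:
        if marker not in line:
            continue  # grep: drop lines without the marker
        parts = line.rstrip("\n").split(sep)
        # cut: keep the requested 1-based fields that exist on this line
        picked = [parts[i - 1] for i in fields if i <= len(parts)]
        # tr: the separator between the kept fields becomes a space
        selected.append(" ".join(picked))
    return selected


# Hypothetical dump line, for illustration only.
example = [
    "some unrelated program output",
    "_mpTrace;main;fill_table$10$2048$0",
]
for row in filter_trace(example):
    print(row)
```

Choosing a different field is a matter of changing `fields`, for example `fields=(1, 2)`, just as one would change the `-f` argument of `cut`.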

Instrumenting user code

demos/instrumentationDemo.cpp contains a simple example of how to instrument user code: it is meant to track and report all allocations performed while filling a hash map (std::unordered_map).

compile it with

c++ -g instrumentationDemo.cpp ../dummyMallocProfiler.so -o instrumentationDemo

preload the variant of the profiler that is disabled by default, then run the demo

export LD_PRELOAD=../mallocProfilerOFF.so
./instrumentationDemo

compile it again activating the reserve call

c++ -g instrumentationDemo.cpp ../dummyMallocProfiler.so -o instrumentationDemo -DRESERVE

and compare the two outputs

User API

The user API, to configure the profiler and to instrument the code, is all in the header file include/mallocProfiler.h and is documented inline.

A simple mechanism to configure the profiler without instrumenting the code is to introduce an intermediate library to be preloaded after the profiler itself. An example can be found in tests/testConfiguration.cc

Global Statistics

It is easy to switch off detailed tracing and just accumulate global statistics. The ready-to-use statOnlyThread.so library starts a thread that, every 10 seconds, dumps three lines into a file (named memstat_PID.mdr): the global statistics, the histogram of total memory, and the histogram of live memory. This file can then be split into three CSV files with some trivial grep and sed and read using a visualization tool. Examples of such files can be found in the demos directory, together with a Jupyter notebook that visualizes them as time-series plots and histogram animations.
