Perun Bachelor's and Master's Theses Topics
Here we provide a list of ideas for possible Bachelor's or Master's theses related to Perun.
Ktrace is our current prototype implementation of a Linux kernel tracer using eBPF (libbpf). However, Ktrace still lacks many core features that are necessary for its adoption by kernel teams (e.g., at Red Hat). The goal is to design, implement, and test some of the missing features, e.g.,
- #161 Automate extraction of probe locations using perf
- #162 Automate filtering of probe locations
- #163 Support more probe types and their combinations in a single BPF program
- #164 Enhance the BPF program to trace multiple processes
- #165 Enhance the BPF program to monitor scheduling during tracing
- #166 Spawn and synchronize the ktrace and benchmark processes
- #168 Implement event parsing directly in the userspace C program
- #233 Add support of inline functions
Perun currently uses a single JSON profile format for all types of performance data. However, this format is not particularly memory-efficient, and it is becoming increasingly difficult to support all types of performance data efficiently in a single format.
- Design and develop a new Profile architecture that either supports efficient storage of all types of performance data in a single format, or supports multiple Profile types that can store different kinds of performance data (e.g., samples, events, snapshots, etc.).
- Design and develop such Profile format(s) and extensively evaluate their memory and time requirements. Focus on making the format(s) as efficient as possible, e.g., by using compression techniques, binary formatting, or a compact representation of duplicate data entries.
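To illustrate what "compact representation of duplicate data entries" combined with binary formatting and compression could look like, here is a minimal sketch. It is not Perun's profile format: the record layout (a function name plus an integer amount), the function names `encode_compact` and `decode_compact`, and the byte layout are all assumptions made for the example.

```python
import json
import struct
import zlib


def encode_compact(records):
    """Encode (function_name, amount) records using a string table.

    Hypothetical layout: each distinct name is stored once in a table,
    and every record refers to its name by a 32-bit index.
    """
    names = []
    index_of = {}
    body = bytearray()
    for name, amount in records:
        if name not in index_of:
            index_of[name] = len(names)
            names.append(name)
        # Two unsigned 32-bit little-endian integers per record.
        body += struct.pack("<II", index_of[name], amount)
    table = json.dumps(names).encode("utf-8")
    header = struct.pack("<I", len(table))
    return zlib.compress(header + table + bytes(body))


def decode_compact(blob):
    """Inverse of encode_compact: rebuild the list of (name, amount) tuples."""
    raw = zlib.decompress(blob)
    (table_len,) = struct.unpack_from("<I", raw, 0)
    names = json.loads(raw[4:4 + table_len].decode("utf-8"))
    records = []
    for offset in range(4 + table_len, len(raw), 8):
        name_index, amount = struct.unpack_from("<II", raw, offset)
        records.append((names[name_index], amount))
    return records


# Synthetic data with many repeated names, as is typical for profiles.
records = [("main", 10), ("parse", 250), ("main", 12), ("parse", 240)] * 1000
blob = encode_compact(records)
json_size = len(json.dumps(records).encode("utf-8"))
print(f"compact: {len(blob)} bytes, plain JSON: {json_size} bytes")
```

A thesis would need to measure such a format on real profiles; highly repetitive synthetic data like the above flatters any compression scheme.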
Design and develop a query system for Perun profiles (e.g., inspired by LINQ) that simplifies retrieving and querying data from Perun profiles without the need to understand the exact Perun profile format. This topic is closely related to the previous one; the two should either be merged into a single thesis, or this one should be assigned as follow-up work.
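As a rough idea of what a LINQ-inspired interface might feel like, the following sketch chains `where`, `order_by`, and `select` over a list of records. The `Query` class and the record keys (`uid`, `type`, `amount`) are hypothetical and chosen only for illustration; they are not an existing Perun API.

```python
class Query:
    """A tiny chainable query over an iterable of dict records (illustrative only)."""

    def __init__(self, records):
        self._records = records

    def where(self, predicate):
        # Keep only the records for which the predicate is true.
        return Query(r for r in self._records if predicate(r))

    def select(self, *keys):
        # Project every record onto the given keys.
        return Query({k: r[k] for k in keys} for r in self._records)

    def order_by(self, key, descending=False):
        # Sorting has to materialize the records.
        return Query(sorted(self._records, key=lambda r: r[key], reverse=descending))

    def to_list(self):
        return list(self._records)


# Hypothetical records standing in for profile resources.
resources = [
    {"uid": "main", "type": "time", "amount": 120},
    {"uid": "parse", "type": "time", "amount": 480},
    {"uid": "parse", "type": "memory", "amount": 64},
]

slowest = (
    Query(resources)
    .where(lambda r: r["type"] == "time")
    .order_by("amount", descending=True)
    .select("uid", "amount")
    .to_list()
)
print(slowest)
```

The point of such an interface is that the caller names fields and conditions, while the mapping to the underlying storage stays hidden behind the query object.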
Perun currently relies mostly on custom-built profilers and tools for performance analysis; notable exceptions are our perf, Loopus, and Cost wrappers. However, to make Perun easier to adopt by development teams, we should integrate well-established tools used by the performance and QA community, e.g.,
- Callgrind and its front-end KCachegrind. KCachegrind could also be extended to support diff view of multiple profiles.
- Memcheck, Cachegrind, Massif, ...
- DiffKemp.
Design and implement a technique to combine multiple profiles with different performance metrics, e.g., memory consumption, function calls, trace stack samples, etc. into a single multi-faceted performance overview. A possible extension of this topic is to add support for safe collection of multiple performance metrics at the same time during a single profiling run. This topic may also be merged with the previous one.
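The simplest form of such a combination is a join of per-function values from several single-metric profiles. The sketch below assumes each profile has already been reduced to a mapping from function name to one value; the function `combine_profiles` and the metric names are made up for the example and say nothing about how Perun stores profiles.

```python
from collections import defaultdict


def combine_profiles(profiles):
    """Join single-metric profiles into one per-function overview.

    profiles: mapping of metric name -> {function name: value}.
    Functions missing from a profile simply lack that metric.
    """
    overview = defaultdict(dict)
    for metric, per_function in profiles.items():
        for function, value in per_function.items():
            overview[function][metric] = value
    return dict(overview)


# Hypothetical inputs: one time profile and one memory profile.
time_profile = {"main": 120.0, "parse": 480.0}
memory_profile = {"parse": 64, "alloc_buffer": 2048}

overview = combine_profiles({"time_ms": time_profile, "memory_kb": memory_profile})
for function, metrics in sorted(overview.items()):
    print(function, metrics)
```

Real profiles will not align this cleanly: different collectors may identify the same code location differently, so most of the design work lies in the matching step that this sketch takes for granted.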
Some Perun profilers preserve the full (stack) trace context of function calls, which may result in many tens of thousands of traces in the resulting profiles. As a consequence, diff analysis and the diff view of such profiles need to pair traces from both profiles to identify missing, new, and modified traces. This becomes extremely costly for long runs or complex programs. Hence, efficient techniques for matching large sets of traces are necessary. Also, storing such a large number of traces consumes a significant amount of space and should be optimized.
Note: This topic is rather experimental and will likely require studying state-of-the-art approaches and algorithms.
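As a baseline for comparison, exact matching can be done in expected linear time by hashing each trace. The sketch below assumes a trace is a tuple of function names with a single cost attached, and it treats "modified" as "same call chain, different cost"; both are simplifying assumptions, and `diff_traces` is a hypothetical name. Pairing traces that changed structurally (for example, after a function was renamed or inlined) is the hard part that this baseline does not address.

```python
def diff_traces(baseline, target):
    """Classify traces by exact match on the call chain.

    baseline, target: dict mapping trace (tuple of function names) -> cost.
    Dictionary lookups make the pairing linear in the number of traces,
    instead of comparing every baseline trace with every target trace.
    """
    missing = sorted(t for t in baseline if t not in target)
    new = sorted(t for t in target if t not in baseline)
    modified = sorted(
        t for t in baseline if t in target and baseline[t] != target[t]
    )
    return missing, new, modified


# Hypothetical traces from two versions of a program.
baseline = {
    ("main", "parse", "read"): 40,
    ("main", "parse", "tokenize"): 90,
    ("main", "report"): 5,
}
target = {
    ("main", "parse", "read"): 40,
    ("main", "parse", "tokenize"): 150,
    ("main", "parse", "validate"): 20,
}

missing, new, modified = diff_traces(baseline, target)
print("missing:", missing)
print("new:", new)
print("modified:", modified)
```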
A rather broad set of possible topics that would build on our new, and not yet fully integrated, LLVM compile-time instrumentation tool. Some possible directions are, e.g.,
- extending the current basic block instrumentation with new types of instrumentation primitives, e.g., function calls and returns, specific sets of instructions, shared memory manipulation, etc.;
- further optimization of the basic block instrumentation using CFG patterns or efficient path representation: Ball-Larus Efficient Path Profiling https://dl.acm.org/doi/10.5555/243846.243857.
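For readers unfamiliar with the Ball-Larus approach mentioned in the last item, its core is an edge numbering on an acyclic CFG such that the sum of edge values along every entry-to-exit path is a unique path ID. The sketch below implements only that numbering step in Python for illustration; handling of loops (back edges) and the optimized placement of instrumentation are omitted, the CFG is a made-up example, and none of this reflects the actual LLVM tool.

```python
def ball_larus_numbering(cfg, entry):
    """Compute path counts and edge values for an acyclic CFG.

    cfg: dict mapping node -> list of successors (must be a DAG).
    Returns (num_paths, edge_val). Recursive DFS; fine for small examples.
    """
    order = []
    visited = set()

    def visit(node):
        visited.add(node)
        for succ in cfg.get(node, []):
            if succ not in visited:
                visit(succ)
        order.append(node)  # post-order: all successors come first

    visit(entry)

    num_paths = {}
    edge_val = {}
    for node in order:
        succs = cfg.get(node, [])
        if not succs:
            num_paths[node] = 1  # an exit node ends exactly one path
            continue
        total = 0
        for succ in succs:
            # Paths through this edge get IDs starting at the running total.
            edge_val[(node, succ)] = total
            total += num_paths[succ]
        num_paths[node] = total
    return num_paths, edge_val


def path_ids(cfg, entry, edge_val):
    """Enumerate every entry-to-exit path and sum its edge values."""
    ids = {}
    stack = [(entry, (entry,), 0)]
    while stack:
        node, path, total = stack.pop()
        succs = cfg.get(node, [])
        if not succs:
            ids[path] = total
        for succ in succs:
            stack.append((succ, path + (succ,), total + edge_val[(node, succ)]))
    return ids


# A small acyclic CFG with six distinct paths from A to F.
cfg = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": ["E", "F"],
    "E": ["F"],
    "F": [],
}
num_paths, edge_val = ball_larus_numbering(cfg, "A")
ids = path_ids(cfg, "A", edge_val)
for path, path_id in sorted(ids.items(), key=lambda item: item[1]):
    print(path_id, "".join(path))
```

At run time, instrumented code would keep one counter per function invocation, add the edge value on each taken edge, and increment a histogram slot indexed by the final sum when the exit is reached.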
Investigate possible use-cases for AI, Machine Learning or Machine Reasoning within Perun. This may include designing new profiling optimizations (e.g., identifying functions that are not interesting performance-wise and as such could be omitted from profiling), predicting performance changes based on the source code changes, or aiding the user in finding the root cause of a performance issue.
Note: This topic is highly experimental and assumes the student is somewhat familiar with designing AI, ML or MR systems.
Research the state of the art in (performance) runtime verification and investigate the possibilities of implementing some of the algorithms or techniques within Perun. Integrating and enhancing existing runtime verification tools is also a possibility.
This topic may require implementation of LTL or CTL logic support for specification of verification properties in Perun.
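To give a flavour of what checking an LTL-style property over collected data could mean, the sketch below evaluates a small fragment of LTL over a finite, non-empty recorded trace. The finite-trace semantics, the tuple-based formula encoding, the `holds` function, and the event format are all assumptions made for this example; a real design would have to settle on the semantics and on whether properties are checked offline or monitored online.

```python
def holds(formula, trace, i=0):
    """Evaluate a formula at position i of a finite, non-empty trace.

    Formulas are nested tuples, e.g. ("globally", ("atom", predicate)).
    "eventually" and "globally" range over positions i .. end of trace.
    """
    op = formula[0]
    if op == "atom":
        return formula[1](trace[i])
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "implies":
        return (not holds(formula[1], trace, i)) or holds(formula[2], trace, i)
    if op == "eventually":
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "globally":
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    raise ValueError(f"unknown operator: {op}")


# Property: every request is eventually followed by a response,
# i.e. G(request -> F response).
is_request = ("atom", lambda event: event["name"] == "request")
is_response = ("atom", lambda event: event["name"] == "response")
always_answered = ("globally", ("implies", is_request, ("eventually", is_response)))

# Hypothetical event traces.
complete = [{"name": "request"}, {"name": "response"},
            {"name": "request"}, {"name": "response"}]
truncated = [{"name": "request"}, {"name": "response"}, {"name": "request"}]

print(holds(always_answered, complete))
print(holds(always_answered, truncated))
```

A performance-oriented variant would add timing constraints (for example, "a response follows within a given time budget"), which plain LTL cannot express without extending atoms or the logic itself.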