
Perun Bachelor's and Master's Theses Topics

JiriPavela edited this page Jul 17, 2024 · 1 revision

Here we provide a list of ideas for possible Bachelor's or Master's theses related to Perun.

Make Ktrace More Robust

Ktrace is our current prototype implementation of a Linux kernel tracer built on eBPF (libbpf). However, Ktrace lacks many core functions and features that are necessary for its adoption by kernel teams (e.g., at Red Hat). The goal is to design, implement, and test some of these missing features.

Redesigning Perun Profiles

Perun currently uses a single JSON profile format for all types of performance data. However, this format is not particularly memory-efficient, and supporting all types of performance data efficiently in a single format is becoming increasingly difficult.

  1. Design and develop a new Profile architecture that either supports efficient storage of all types of performance data in a single format, or supports multiple Profile types that store different kinds of performance data (e.g., samples, events, or snapshots).
  2. Design and develop the corresponding Profile format(s) and extensively evaluate their memory and time requirements. Focus on making the format(s) as efficient as possible, e.g., by using compression techniques, binary encoding, or a compact representation of duplicate data entries.
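As a rough illustration of what "compact representation of duplicate data entries" could mean, the following sketch replaces repeated string values with indices into a shared table and then compresses the result. The record layout (`uid`, `amount`) and the helper names are hypothetical and chosen only for this example; they are not the actual Perun profile format.

```python
import json
import zlib


def intern_records(records, key="uid"):
    """Replace repeated string values of `key` with indices into a shared table."""
    table = []   # unique strings in first-seen order
    index = {}   # string -> position in `table`
    compact = []
    for record in records:
        value = record[key]
        if value not in index:
            index[value] = len(table)
            table.append(value)
        compact.append({**record, key: index[value]})
    return {"table": table, "records": compact}


def restore_records(packed, key="uid"):
    """Inverse of intern_records: expand indices back to the original strings."""
    table = packed["table"]
    return [{**record, key: table[record[key]]} for record in packed["records"]]


# Hypothetical sample data: many records that share only a few function names.
records = [
    {"uid": f"namespace::function_{i % 3}", "amount": i} for i in range(1000)
]

plain = json.dumps(records).encode()
packed = json.dumps(intern_records(records)).encode()
compressed = zlib.compress(packed)

# Sizes in bytes: plain JSON, interned JSON, interned + zlib-compressed.
print(len(plain), len(packed), len(compressed))
```

A real evaluation would also have to measure encoding and decoding time, since a smaller format is of little use if loading it becomes the bottleneck.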

Perun Profile Queries

Design and develop a query system for Perun profiles (e.g., inspired by LINQ) that simplifies retrieving and querying data from Perun profiles without the need to understand the exact Perun profile format. This topic is closely related to the previous one and should either be merged with it into a single thesis, or assigned as follow-up work.
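To give an idea of what a LINQ-inspired interface might look like, here is a minimal sketch of a chainable query object over a list of records. The `Query` class, its method names, and the record fields are all assumptions made for illustration; none of them exist in Perun.

```python
class Query:
    """A small chainable query over dict records (eagerly evaluated)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def where(self, predicate):
        """Keep only the rows for which `predicate` returns True."""
        return Query(r for r in self._rows if predicate(r))

    def select(self, *fields):
        """Project each row onto the given fields."""
        return Query({f: r[f] for f in fields} for r in self._rows)

    def group_sum(self, by, value):
        """Group rows by `by` and sum `value` within each group."""
        totals = {}
        for r in self._rows:
            totals[r[by]] = totals.get(r[by], 0) + r[value]
        return Query({by: k, value: v} for k, v in totals.items())

    def order_by(self, key, descending=False):
        """Sort rows by the value stored under `key`."""
        return Query(sorted(self._rows, key=lambda r: r[key], reverse=descending))

    def to_list(self):
        return list(self._rows)


# Hypothetical flattened profile records.
rows = [
    {"uid": "main", "type": "time", "amount": 120},
    {"uid": "parse", "type": "time", "amount": 80},
    {"uid": "parse", "type": "time", "amount": 60},
    {"uid": "main", "type": "memory", "amount": 512},
]

result = (
    Query(rows)
    .where(lambda r: r["type"] == "time")
    .group_sum("uid", "amount")
    .order_by("amount", descending=True)
    .to_list()
)
print(result)
```

The point of such an interface is that the user names fields and operations only; how the records are laid out in the underlying profile stays hidden behind the query object.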

Integrating Mature Profiling Tools Into Perun

Perun currently relies mostly on custom-built profilers and tools for performance analysis, with notable exceptions being our perf, Loopus, and Cost wrappers. However, to make Perun easier for development teams to adopt, we should integrate well-established tools used by the performance and QA community, e.g.,

  • Callgrind and its front-end KCachegrind. KCachegrind could also be extended to support diff view of multiple profiles.
  • Memcheck, Cachegrind, Massif, ...
  • DiffKemp.

Combining Performance Profiles

Design and implement a technique for combining multiple profiles that capture different performance metrics (e.g., memory consumption, function calls, or trace stack samples) into a single multi-faceted performance overview. A possible extension of this topic is to add support for safely collecting multiple performance metrics at the same time during a single profiling run. This topic may also be merged with the previous one.

Efficient Processing of Large Number of Traces

Some Perun profilers preserve the full (stack) trace context of function calls, which may result in many tens of thousands of traces in the resulting profiles. As a consequence, diff analysis and diff view of such profiles need to pair traces from both profiles to identify missing, new, and modified traces. This becomes extremely costly for long runs or complex programs. Hence, efficient techniques for matching large sets of traces are necessary. In addition, storing such a large number of traces consumes a significant amount of space and should be optimized.

Note: This topic is rather experimental and will likely require studying state-of-the-art approaches and algorithms.
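As a baseline for comparison, the simplest form of trace pairing can be sketched with hash-based lookups: if each trace is represented as a tuple of function names, exact matches can be found in expected linear time. The data layout and the `diff_traces` helper below are assumptions made for this example only. The hard part of the topic, i.e., pairing traces that differ slightly (renamed or inlined functions, inserted frames), is not covered by this sketch.

```python
def diff_traces(baseline, target):
    """Classify traces as missing, new, or modified using exact matching.

    Both arguments map a trace (a tuple of function names, outermost first)
    to a measured cost. Dictionary key views support set operations, so each
    lookup is a hash lookup instead of a pairwise comparison of traces.
    """
    missing = sorted(baseline.keys() - target.keys())
    new = sorted(target.keys() - baseline.keys())
    modified = {
        trace: (baseline[trace], target[trace])
        for trace in baseline.keys() & target.keys()
        if baseline[trace] != target[trace]
    }
    return missing, new, modified


# Hypothetical profiles: trace -> cost.
baseline = {
    ("main", "parse", "read"): 10,
    ("main", "parse", "tokenize"): 25,
    ("main", "report"): 5,
}
target = {
    ("main", "parse", "read"): 10,
    ("main", "parse", "tokenize"): 40,
    ("main", "optimize"): 7,
}

missing, new, modified = diff_traces(baseline, target)
print(missing, new, modified)
```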

Enhancing LLVM Compile-time Profiler

This is a rather broad set of possible topics that build on our new, and not yet fully integrated, LLVM compile-time instrumentation tool. Possible directions include:

  • extending the current basic block instrumentation with new types of instrumentation primitives, e.g., function calls and returns, specific sets of instructions, shared memory manipulation, etc.;
  • further optimization of the basic block instrumentation using CFG patterns or an efficient path representation, e.g., Ball and Larus, Efficient Path Profiling (https://dl.acm.org/doi/10.5555/243846.243857).
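The core idea of the Ball-Larus approach mentioned above is to assign an integer increment to each CFG edge so that the sum of increments along any entry-to-exit path is a unique path ID. The sketch below shows only this edge-numbering step on an acyclic CFG; the handling of loops (back edges) and the placement of increments on a minimal set of edges, both described in the paper, are omitted. The function names and the example graph are illustrative assumptions and are not taken from Perun's instrumentation tool.

```python
def ball_larus_edge_values(cfg, entry):
    """Assign edge increments so each entry-to-exit path sums to a unique ID.

    `cfg` maps every node to a list of its successors and must be acyclic.
    Returns the edge increments and the total number of paths from `entry`.
    """
    # DFS post-order: every node appears after all of its successors.
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for succ in cfg[node]:
            visit(succ)
        order.append(node)

    visit(entry)

    num_paths, edge_val = {}, {}
    for node in order:
        if not cfg[node]:
            num_paths[node] = 1  # an exit node ends exactly one path
            continue
        num_paths[node] = 0
        for succ in cfg[node]:
            # Paths through this edge get IDs offset by the paths counted so far.
            edge_val[(node, succ)] = num_paths[node]
            num_paths[node] += num_paths[succ]
    return edge_val, num_paths[entry]


def path_ids(cfg, entry, edge_val):
    """Enumerate all paths from `entry` and return their sorted path IDs."""
    def walk(node, total):
        if not cfg[node]:
            yield total
        for succ in cfg[node]:
            yield from walk(succ, total + edge_val[(node, succ)])

    return sorted(walk(entry, 0))


# Small acyclic example CFG with six distinct paths from A to F.
cfg = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": ["E", "F"],
    "E": ["F"],
    "F": [],
}

edge_val, total_paths = ball_larus_edge_values(cfg, "A")
print(total_paths, path_ids(cfg, "A", edge_val))
```

With such a numbering, a profiler needs only one counter update per taken edge and one table increment per completed path, instead of recording every basic block that was visited.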

Leveraging AI in Perun

Investigate possible use-cases for AI, Machine Learning or Machine Reasoning within Perun. This may include designing new profiling optimizations (e.g., identifying functions that are not interesting performance-wise and as such could be omitted from profiling), predicting performance changes based on the source code changes, or aiding the user in finding the root cause of a performance issue.

Note: This topic is highly experimental and assumes the student is somewhat familiar with designing AI, ML or MR systems.

Performance Runtime Verification in Perun

Research the state of the art in (performance) runtime verification and investigate the possibility of implementing some of its algorithms or techniques within Perun. Integrating and enhancing existing runtime verification tools is also an option.

This topic may require implementing support for LTL or CTL so that verification properties can be specified in Perun.
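To make the idea of a performance property more concrete, the following sketch monitors a single bounded-response property over a finite, time-ordered event trace: every trigger event must be followed by a response event within a given time bound. Strictly speaking, the time bound makes this a metric extension of LTL rather than plain LTL. The event format, the function name, and the choice to treat triggers left unanswered at the end of the trace as violations are all assumptions of this example.

```python
from collections import deque


def check_bounded_response(events, trigger, response, bound):
    """Check that every `trigger` is followed by a `response` within `bound`.

    `events` is an iterable of (timestamp, name) pairs sorted by timestamp.
    Returns the timestamps of all triggers that violate the property.
    """
    pending = deque()  # timestamps of triggers still awaiting a response
    violations = []
    for timestamp, name in events:
        # Triggers older than `bound` can no longer be answered in time.
        while pending and timestamp - pending[0] > bound:
            violations.append(pending.popleft())
        if name == trigger:
            pending.append(timestamp)
        elif name == response:
            # One response satisfies every trigger that is still within bound.
            pending.clear()
    # Assumption: triggers unanswered at the end of the trace count as violations.
    violations.extend(pending)
    return violations


# Hypothetical trace of (timestamp, event name) pairs.
events = [
    (0, "request"),
    (3, "response"),
    (10, "request"),
    (18, "response"),
    (20, "request"),
]

violations = check_bounded_response(events, "request", "response", bound=5)
print(violations)
```

A general solution would instead compile arbitrary property formulas into monitors, which is where the temporal logic support mentioned above would come in.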