Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor py-spy & pyperf to separate ProfilerInterface. #805

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

marcin-ol
Copy link
Collaborator

Refactor py-spy & pyperf to separate ProfilerInterface classes.

Description

Group profilers by common runtime (e.g.: Python, Java, etc.).
Support automatic selection of best available profiler for each runtime.

Related Issue

#500

Motivation and Context

We have PythonProfiler which is a class that hides the fact that we have 2 profiler types behind - py-spy or pyperf, and decides itself which one is used.
There was no mechanism to support multiple profiler options and keep them separate. More profilers will arrive in future that will benefit from this.

Also: #448

How Has This Been Tested?

Additional tests developed for new mechanism, added tests to ensure correct behavior of Python profiling.

Checklist:

  • I have read the CONTRIBUTING document.
  • I have added tests for new logic.

@marcin-ol marcin-ol changed the title Support grouping profilers under common runtime Refactor py-spy & pyperf to separate ProfilerInterface. Jun 20, 2023
@marcin-ol
Copy link
Collaborator Author

marcin-ol commented Jun 20, 2023

Remaining actions for this PR:

  • Register PySpyProfiler and EbpfProfiler separately for python, using developed mechanism.
  • Organize profilers options: supported modes, runtime arguments should be either well separated or declared with runtime (Python) itself.

@marcin-ol marcin-ol force-pushed the refactor-decouple-pyperf-and-pyspy branch from 165292f to c11b6a1 Compare June 28, 2023 10:43
tests/conftest.py Outdated Show resolved Hide resolved
@marcin-ol marcin-ol requested a review from Jongy July 6, 2023 11:45
@Jongy Jongy added the enhancement New feature or request label Jul 8, 2023

# Reuse gProfiler flow to initialize specific profiler instance - one selected by runtime fixture.
@contextmanager
def _make_profiler_instance() -> Iterator[ProfilerBase]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason to create profiler instances in such implicit way and not directly as we did before?

Personally I prefer the explicit way. This new flow is harder to understand and is more error prone :/

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in hindsight that could be simpler.

)
wait_for_log(gprofiler, "gProfiler initialized and ready to start profiling", 0, timeout=7)
assert f"Initialized {profiler_class_name}".encode() in gprofiler.logs()
gprofiler.remove(force=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

force=True uses SIGKILL, preventing gprofiler from doing proper cleanup. Why?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No specific reason, I think I made a shortcut here.
Will replace it.

"Python",
profiler_name="PyPerf",
is_preferred=True,
# py-spy is like pyspy, it's confusing and I mix between them
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment not relevant here anymore

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right.

def get_runtime_possible_modes(runtime: str) -> List[str]:
possible_modes: List[str] = [ProfilerConfig.ENABLED_MODE] if len(profilers_config[runtime]) > 1 else []
for config in profilers_config[runtime]:
possible_modes += [m for m in config.get_active_modes() if m not in possible_modes]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of checking for existence you can build it as a set and then convert to a list.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

arch = get_arch()
profiler_configs = sorted(
profilers_config[runtime],
key=lambda c: (arch in c.get_supported_archs(), c.is_preferred, c.profiler_name),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsupported arches should be filtered out, not sorted away, no?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, they should be.
This was needed for some earlier revision of this PR and will be removed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a second thought, this has one implication - we won't be able to explain to the user why profiler wasn't selected:

  • factory.get_profilers() is walking through this list of sorted profilers and will warn if some profilers are unavailable (on given architecture),
  • if we get an already-filtered list, user will only learn that no profilers were selected.

To be on the safe side we should filter available profilers when building list of command-line arguments. Therefore another function needs changing - _add_profilers_arguments(), to do this filtering.

Comment on lines 35 to 36
runtime_args_prefix = runtime.lower().replace("-", "_")
runtime_mode = user_args.get(f"{runtime_args_prefix}_mode")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Why do you need replace(-, _) on the runtime name?
  2. This logic (these 2 lines) repeat more than once, extract to a function?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed now. Was used when exploring different approaches.

Comment on lines 32 to 33
return system_profiler, process_profilers_instances
arch = get_arch()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add a blank line between these 2 lines for clarity

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

logger.critical(
f"Couldn't create the {profiler_name} profiler for runtime {runtime}, not continuing."
f" Request different profiler for runtime with --{runtime_args_prefix}-mode, or disable"
f" {runtime} profiling with --no-{runtime_args_prefix} to disable this profiler",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
f" {runtime} profiling with --no-{runtime_args_prefix} to disable this profiler",
f" {runtime} profiling with --{runtime_args_prefix}-mode=none to disable this profiler",

I prefer this way of expressing it (--no-... is legacy to me)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

process_profilers_instances.append(profiler_instance)

if len(requested_configs) == 1:
logger.critical(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new fatal error. In which situations can it happen? When I force --python-mode=pyperf but PyPerf can't be used?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the case.
Previously it was c-tor that could raise an Exception (i.e.: EbpfProfiler.test() called from PythonProfiler.__init__()).

Now the test is wrapped in check_readiness().
If it was the only available profiler (by the size of requested_configs), we raise an Exception, as previously.
Backward compatibility is retained.

"Python",
profiler_name="PySpy",
# py-spy is like pyspy, it's confusing and I mix between them
possible_modes=["auto", "pyspy", "py-spy"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "auto" mean? This profiler no longer supports any "modes", it's just py-spy.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works the same as enabled and is listed here to support the previous behavior.

Copy link
Contributor

@Jongy Jongy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reviewed most of it.

Issues/problems I spot:

  1. Profilers still need to specify default_mode and possible_modes, which no longer makes sense as each profiler class now represents a single profiler and a single mode. IMO this is one of the problems we were trying to solve in this PR.
  2. As you mentioned, we lack the concept of "runtime arguments". This is visible for Python which has 2 profilers, and will be visible in others as well if we ever add more profielrs (e.g in Java, some of the arguments are truly async-profiler specific and some are not and will be relevant to any java profiler, e.g --java-collect-jvm-flags). "Runtime arguments" is where we could have placed the default_mode and possible_mode arguments.

My proposal:

  1. let's add the concept of a runtime object, and profilers will be attached to their runtime object. When generating commandline arguments, they will be taken into account separately (i.e there will be a section "Java arguments" and then "Java async-profiler arguments").
  2. possible_modes will be the default ones + the different registered runtime profilers for this runtime.
  3. At the same time we can rename JavaProfiler to JavaAsyncProfiler
  4. This simplifies get_profiler_arguments

@marcin-ol
Copy link
Collaborator Author

I reviewed most of it.

Issues/problems I spot:

  1. Profilers still need to specify default_mode and possible_modes, which no longer makes sense as each profiler class now represents a single profiler and a single mode. IMO this is one of the problems we were trying to solve in this PR.

That would be okay but I see a few opportunities:

  • SystemProfiler supports 3 possible on-modes: smart, dwarf, and fp. Either we refactor it with this PR (but that wouldn't align with the intention of this PR) or keep possible_modes for now, but add a new issue to solve it.
  • python profiling won't be possible with "auto" mode, but I suppose it wasn't used frequently?

And issue with default_mode:

  • DotnetProfiler is disabled by default, with use of default_mode parameter; I think we might still need it for newly introduced profilers. I propose to keep this argument but at the Runtime level, renamed to: enabled_by_default
  1. As you mentioned, we lack the concept of "runtime arguments". This is visible for Python which has 2 profilers, and will be visible in others as well if we ever add more profielrs (e.g in Java, some of the arguments are truly async-profiler specific and some are not and will be relevant to any java profiler, e.g --java-collect-jvm-flags). "Runtime arguments" is where we could have placed the default_mode and possible_mode arguments.

My proposal:

  1. let's add the concept of a runtime object, and profilers will be attached to their runtime object. When generating commandline arguments, they will be taken into account separately (i.e there will be a section "Java arguments" and then "Java async-profiler arguments").

I'm all for it. I wanted to make it clear (and convince myself too) that it's really needed, that's why I paused implementation at this stage.

  1. possible_modes will be the default ones + the different registered runtime profilers for this runtime.

I agree we still need default mode - for the likes of DotnetProfiler and experimental ones.

  1. At the same time we can rename JavaProfiler to JavaAsyncProfiler

And register with profiler_name=ap to support auto-generated possible_mode=ap?

  1. This simplifies get_profiler_arguments

Sure.

@marcin-ol
Copy link
Collaborator Author

As discussed earlier, I have introduced runtime object in form of ProfilingRuntime.
Runtime defines shared attributes of profilers: default_mode, mode_help, disablement_help and common_arguments.

  • Few things were simplified - get_profiler_arguments, get_preferred_or_first_profiler were discarded, as now the common and profiler-specific arguments are separate.
  • New concept was added - InternalArgument to handle arguments that are set in the code, not by arg parser, i.e.: perf_node_attach, perf_inject.

@marcin-ol marcin-ol requested a review from Jongy July 13, 2023 11:26
"PySpy",
runtime_class=PythonRuntime,
possible_modes=["auto", "pyspy", "py-spy"],
# we build pyspy for both,.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# we build pyspy for both,.
# we build pyspy for both

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -626,7 +626,6 @@ def python_version(in_container: bool, application_docker_container: Container)
else:
# If not running in a container the test application runs on host
output = subprocess.check_output("python3 --version", stderr=subprocess.STDOUT, shell=True)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Undo diff?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Jongy
Copy link
Contributor

Jongy commented Jul 27, 2023

I only briefly reviewed now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants