Support rr debugger (record-replay) by allowing the syscall perf_event_open in Gitpod workspaces #9687

Open
2 of 3 tasks
jankeromnes opened this issue May 2, 2022 · 22 comments
Labels
meta: never-stale This issue can never become stale team: workspace Issue belongs to the Workspace team type: feature request New feature or request

Comments

@jankeromnes
Contributor

Is your feature request related to a problem? Please describe

Debugging software with rr in Gitpod currently doesn't work:

# Install rr
$ cd /tmp && wget https://github.com/rr-debugger/rr/releases/download/5.5.0/rr-5.5.0-Linux-$(uname -m).deb && sudo dpkg -i rr-5.5.0-Linux-$(uname -m).deb

# Try rr with any binary
$ cd - && rr record ./binary
rr needs /proc/sys/kernel/perf_event_paranoid <= 1, but it is 2.
Change it to 1, or use 'rr record -n' (slow).
Consider putting 'kernel.perf_event_paranoid = 1' in /etc/sysctl.d/10-rr.conf.
See 'man 8 sysctl', 'man 5 sysctl.d' (systemd systems)
and 'man 5 sysctl.conf' (non-systemd systems) for more details.

Initially reported by William Durand from Mozilla: https://twitter.com/couac/status/1521092130890031105

Describe the behaviour you'd like

I suspect this fails because Gitpod's seccomp profile disables the syscall perf_event_open by default.

I also believe that we could allow perf_event_open in Gitpod, provided there aren't any major security issues.

This would allow Gitpod users to benefit from the powerful and popular record-replay debugger rr.

Describe alternatives you've considered

Additional context

To work properly, rr needs:

... as well as a seccomp profile that allows:

  • the ptrace syscall (I believe this is allowed by default in Linux kernels >= 4.8)
  • the perf_event_open syscall (I believe this is disabled by default)
  • and maybe the process_vm_writev syscall too (but let's focus on perf_event_open first)

Sources:

@jankeromnes jankeromnes added type: feature request New feature or request team: workspace Issue belongs to the Workspace team labels May 2, 2022
@jankeromnes
Contributor Author

William also correctly pointed out that we might want to make sure rr actually supports AMD CPUs first:

I am thinking that we should probably make sure that rr actually supports the AMD CPU first. Ideally we would verify that we can record a trace on a host machine and then within Docker (with seccomp=unconfined).

@willdurand

rr needs /proc/sys/kernel/perf_event_paranoid <= 1, but it is 2.
Change it to 1, or use 'rr record -n' (slow).
Consider putting 'kernel.perf_event_paranoid = 1' in /etc/sysctl.d/10-rr.conf.

FWIW, this first warning cannot be solved currently. Creating the /etc/sysctl.d/10-rr.conf file and reloading sysctl will skip the config file:

gitpod /tmp $ echo 'kernel.perf_event_paranoid = 1' | sudo tee /etc/sysctl.d/10-rr.conf
kernel.perf_event_paranoid = 1

gitpod /tmp $ cat /etc/sysctl.d/10-rr.conf
kernel.perf_event_paranoid = 1

gitpod /tmp $ sudo sysctl --system
[...]

* Applying /etc/sysctl.d/10-rr.conf ...
sysctl: setting key "kernel.perf_event_paranoid", ignoring: Read-only file system

[...]

We can use record -n apparently, though. That being said, with 5.5.0 (installed as described in the issue above), there is another error:

gitpod /tmp $ rr --version
rr version 5.5.0

gitpod /tmp $ rr record -n /usr/bin/ls
[FATAL /home/roc/rr/rr/src/PerfCounters_x86.h:104:compute_cpu_microarch()] AMD CPU type 0xf10 unknown

What do we do now? We build rr ourselves and we try again:

gitpod /tmp/obj $ /usr/local/bin/rr record -n /usr/bin/ls
[FATAL /tmp/rr/src/PerfCounters.cc:224:start_counter() errno: EPERM] Failed to initialize counter
=== Start rr backtrace:
/usr/local/bin/rr(_ZN2rr13dump_rr_stackEv+0x5d)[0x55bf3dfa28b6]
/usr/local/bin/rr(_ZN2rr15notifying_abortEv+0x16)[0x55bf3dfa2815]
/usr/local/bin/rr(_ZN2rr12FatalOstreamD1Ev+0x34)[0x55bf3ddda3d6]
/usr/local/bin/rr(+0x408234)[0x55bf3de0b234]
/usr/local/bin/rr(+0x4083f8)[0x55bf3de0b3f8]
/usr/local/bin/rr(+0x40a01c)[0x55bf3de0d01c]
/usr/local/bin/rr(+0x40a58f)[0x55bf3de0d58f]
/usr/local/bin/rr(_ZN2rr12PerfCounters23default_ticks_semanticsEv+0x21)[0x55bf3de0d74f]
/usr/local/bin/rr(_ZN2rr7SessionC2Ev+0x107)[0x55bf3df2bbff]
/usr/local/bin/rr(_ZN2rr13RecordSessionC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt6vectorIS6_SaIS6_EESD_RKNS_20DisableCPUIDFeaturesENS0_16SyscallBufferingEiNS_7BindCPUES8_PKNS_9TraceUuidEbb+0x65)[0x55bf3de27211]
/usr/local/bin/rr(_ZN2rr13RecordSession6createERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EESB_RKNS_20DisableCPUIDFeaturesENS0_16SyscallBufferingEhNS_7BindCPUERKS7_PKNS_9TraceUuidEbbb+0xc3d)[0x55bf3de26cdf]
/usr/local/bin/rr(+0x416872)[0x55bf3de19872]
/usr/local/bin/rr(_ZN2rr13RecordCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x40f)[0x55bf3de1a7d9]
/usr/local/bin/rr(main+0x27d)[0x55bf3dfbeb2f]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f1a066ee0b3]
/usr/local/bin/rr(_start+0x2e)[0x55bf3dcec6de]
=== End rr backtrace
Aborted (core dumped)

@khuey

khuey commented May 3, 2022

rr needs /proc/sys/kernel/perf_event_paranoid <= 1, but it is 2.
Change it to 1, or use 'rr record -n' (slow).
Consider putting 'kernel.perf_event_paranoid = 1' in /etc/sysctl.d/10-rr.conf.

FWIW, this first warning cannot be solved currently. Creating the /etc/sysctl.d/10-rr.conf file and reloading sysctl will skip the config file:

kernel.perf_event_paranoid is a single global config setting for the whole kernel. It can't be set inside a container.
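Since the value can only be read, not changed, from inside the workspace, a read-only check is about all that can be done there. Here is a minimal sketch, assuming the usual `/proc/sys/kernel/perf_event_paranoid` path and the `<= 1` threshold from rr's own error message; the function names are made up for illustration:

```python
from pathlib import Path

# Threshold taken from rr's error message quoted in this thread.
RR_MAX_PARANOID = 1
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def parse_paranoid(text: str) -> int:
    """Parse the file contents: a single integer, possibly negative."""
    return int(text.strip())


def rr_can_use_counters(level: int) -> bool:
    """True if the level satisfies the requirement rr reports (<= 1)."""
    return level <= RR_MAX_PARANOID


def check_host_setting(path: Path = PARANOID_PATH) -> str:
    """Return a one-line report. This only reads the value, it never writes it."""
    try:
        level = parse_paranoid(path.read_text())
    except (OSError, ValueError) as exc:
        return f"could not read {path}: {exc}"
    verdict = "ok for rr" if rr_can_use_counters(level) else "too restrictive for rr"
    return f"perf_event_paranoid = {level} ({verdict})"


if __name__ == "__main__":
    print(check_host_setting())
```

If this reports a value above 1, the fix has to happen on the host (or `rr record -n` has to be used), not in the container.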

What do we do now? We build rr ourselves and we try again:

gitpod /tmp/obj $ /usr/local/bin/rr record -n /usr/bin/ls
[FATAL /tmp/rr/src/PerfCounters.cc:224:start_counter() errno: EPERM] Failed to initialize counter
=== Start rr backtrace:
/usr/local/bin/rr(_ZN2rr13dump_rr_stackEv+0x5d)[0x55bf3dfa28b6]
/usr/local/bin/rr(_ZN2rr15notifying_abortEv+0x16)[0x55bf3dfa2815]
/usr/local/bin/rr(_ZN2rr12FatalOstreamD1Ev+0x34)[0x55bf3ddda3d6]
/usr/local/bin/rr(+0x408234)[0x55bf3de0b234]
/usr/local/bin/rr(+0x4083f8)[0x55bf3de0b3f8]
/usr/local/bin/rr(+0x40a01c)[0x55bf3de0d01c]
/usr/local/bin/rr(+0x40a58f)[0x55bf3de0d58f]
/usr/local/bin/rr(_ZN2rr12PerfCounters23default_ticks_semanticsEv+0x21)[0x55bf3de0d74f]
/usr/local/bin/rr(_ZN2rr7SessionC2Ev+0x107)[0x55bf3df2bbff]
/usr/local/bin/rr(_ZN2rr13RecordSessionC1ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERKSt6vectorIS6_SaIS6_EESD_RKNS_20DisableCPUIDFeaturesENS0_16SyscallBufferingEiNS_7BindCPUES8_PKNS_9TraceUuidEbb+0x65)[0x55bf3de27211]
/usr/local/bin/rr(_ZN2rr13RecordSession6createERKSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EESB_RKNS_20DisableCPUIDFeaturesENS0_16SyscallBufferingEhNS_7BindCPUERKS7_PKNS_9TraceUuidEbbb+0xc3d)[0x55bf3de26cdf]
/usr/local/bin/rr(+0x416872)[0x55bf3de19872]
/usr/local/bin/rr(_ZN2rr13RecordCommand3runERSt6vectorINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x40f)[0x55bf3de1a7d9]
/usr/local/bin/rr(main+0x27d)[0x55bf3dfbeb2f]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7f1a066ee0b3]
/usr/local/bin/rr(_start+0x2e)[0x55bf3dcec6de]
=== End rr backtrace
Aborted (core dumped)

This is perf_event_open(2) being disallowed by the seccomp policy.
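A quick first check is whether the process is running under a seccomp filter at all. The sketch below reads the `Seccomp:` field from `/proc/self/status` (0 = disabled, 1 = strict, 2 = filter). It is only a rough indicator: an active filter does not prove that perf_event_open specifically is the blocked call, and the helper names are illustrative.

```python
from pathlib import Path
from typing import Optional

# Values of the "Seccomp:" field in /proc/<pid>/status.
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text: str) -> Optional[int]:
    """Extract the Seccomp mode from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        # Exact match, so the separate "Seccomp_filters" field is skipped.
        if key == "Seccomp":
            return int(value.strip())
    return None


def describe(mode: Optional[int]) -> str:
    """Human-readable name for a mode, or a note that the field is missing."""
    if mode is None:
        return "no Seccomp field found"
    return SECCOMP_MODES.get(mode, f"unknown mode {mode}")


if __name__ == "__main__":
    try:
        text = Path("/proc/self/status").read_text()
    except OSError as exc:
        print(f"could not read /proc/self/status: {exc}")
    else:
        print("seccomp:", describe(seccomp_mode(text)))
```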

@stale

stale bot commented Aug 11, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label Aug 11, 2022
@willdurand

William also correctly pointed out that we might want to make sure rr actually supports AMD CPUs first:

I am thinking that we should probably make sure that rr actually supports the AMD CPU first. Ideally we would verify that we can record a trace on a host machine and then within Docker (with seccomp=unconfined).

This issue is still valid but without access to the Gitpod "hardware" (see quote above), there isn't a lot external contributors can do at the moment.

@stale stale bot removed the meta: stale This issue/PR is stale and will be closed soon label Aug 12, 2022
@stale

stale bot commented Nov 26, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label Nov 26, 2022
@GitMensch

Support for this CPU type was added to rr in rr-debugger/rr#2872, so the biggest remaining part is the Docker configuration (so far I've only seen an option to adjust the dockerd arguments, but not the arguments for docker run).

Please add --cap-add=SYS_PTRACE --security-opt seccomp=unconfined as documented in https://github.com/rr-debugger/rr/wiki/Docker.

@stale stale bot removed the meta: stale This issue/PR is stale and will be closed soon label Dec 5, 2022
@khuey

khuey commented Dec 5, 2022

CAP_SYS_PTRACE probably isn't necessary these days.

@GitMensch

CAP_SYS_PTRACE probably isn't necessary these days.

Can you please retest and adjust the rr wiki?
Any insight if it is possible to adjust the security policy with an option to dockerd?

@khuey

khuey commented Dec 6, 2022

CAP_SYS_PTRACE probably isn't necessary these days.

Can you please retest and adjust the rr wiki?

That's not really a priority for me.

Any insight if it is possible to adjust the security policy with an option to dockerd?

You can create your own seccomp profile e.g.

{
  "defaultAction": "SCMP_ACT_ALLOW",
  "architectures": [
    "SCMP_ARCH_X86_64",
    "SCMP_ARCH_X86"
  ]
}

and then run dockerd --seccomp-profile=/path/to/that/file.json

Somebody could actually spend the time to come up with a minimal seccomp profile for rr itself but that's a non-trivial amount of work.

@stale

stale bot commented Mar 12, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label Mar 12, 2023
@GitMensch

The question is (@jankeromnes ?): is there any reason to not allow perf_event_open by default, for example by adding --security-opt seccomp=unconfined to docker?

@stale stale bot removed the meta: stale This issue/PR is stale and will be closed soon label Mar 12, 2023
@jankeromnes
Contributor Author

jankeromnes commented Mar 13, 2023

@GitMensch I believe it should be safe to always allow perf_event_open by default, for example by slightly adjusting Gitpod's seccomp profile to allow this syscall in workspaces.

However, we wouldn't want to use --security-opt seccomp=unconfined for Gitpod workspaces, because this would enable all possible syscalls, some of which might harm the isolation between Gitpod workspaces.

So, instead of entirely disabling seccomp in Gitpod, we should consider each syscall separately (for example, when it can unlock super cool use cases like rr debugging in Gitpod -- just like when we enabled gdb debugging in Gitpod) and assess the trade-off between its added value and its potential risk.
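To make that per-syscall review concrete, one could diff a candidate profile against the current one and list exactly which syscalls become newly allowed. This is a rough sketch, assuming both profiles use the Docker-style JSON layout (`syscalls` entries with `names`, `action` and optional `includes`/`excludes`/`args`). Whether Gitpod's profile matches that layout is an assumption, and the function names are hypothetical.

```python
def unconditionally_allowed(profile: dict) -> set:
    """Syscall names allowed by rules that carry no extra conditions.

    Note: this ignores "defaultAction", so an allow-all profile is not
    expanded into an explicit list of names.
    """
    allowed = set()
    for rule in profile.get("syscalls", []):
        conditional = rule.get("includes") or rule.get("excludes") or rule.get("args")
        if rule.get("action") == "SCMP_ACT_ALLOW" and not conditional:
            allowed.update(rule.get("names", []))
    return allowed


def newly_allowed(baseline: dict, candidate: dict) -> list:
    """Syscalls the candidate allows unconditionally but the baseline does not."""
    return sorted(unconditionally_allowed(candidate) - unconditionally_allowed(baseline))


if __name__ == "__main__":
    # Tiny made-up profiles, only to show the shape of the result.
    baseline = {"syscalls": [{"names": ["read", "write"], "action": "SCMP_ACT_ALLOW"}]}
    candidate = {"syscalls": [
        {"names": ["read", "write"], "action": "SCMP_ACT_ALLOW"},
        {"names": ["perf_event_open"], "action": "SCMP_ACT_ALLOW"},
    ]}
    print(newly_allowed(baseline, candidate))
```

A short list like this is easier to review for isolation risks than a switch to seccomp=unconfined.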

@GitMensch

I'm totally fine with that.

I believe it should be safe to always allow perf_event_open by default, for example by slightly adjusting Gitpod's seccomp profile to allow this syscall in workspaces.

So... I guess this is on the schedule now?

@jankeromnes
Contributor Author

So... I guess this is on the schedule now?

It is not yet on the schedule. For it to be, we need to lobby Gitpod's workspace team into picking up this issue (hi @kylos101! 👋 😇)

@sg-

sg- commented May 20, 2023

Any progress on getting perf_event_open added by default? I'd love to be able to run perf on my programs inside a gitpod development container.

@stale

stale bot commented Sep 16, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label Sep 16, 2023
@GitMensch

Sadly there is still no option to use rr or at least perf stat in gitpod containers, is there?

@jankeromnes
Contributor Author

@GitMensch Sadly there isn't yet. However, let's keep this issue open until there is. 👍

@jankeromnes jankeromnes added meta: never-stale This issue can never become stale and removed meta: stale This issue/PR is stale and will be closed soon labels Sep 18, 2023
@GitMensch

@jankeromnes You've previously said

I believe it should be safe to always allow perf_event_open by default, for example by slightly adjusting Gitpod's seccomp profile to allow this syscall in workspaces.

and I agree, so... who does this issue depend on now? Is @kylos101 "from Gitpod's workspace team" the right (and possibly only) person?

If I understood this correctly this would add perf stat and friends and would at least be a start for testing rr.

@GitMensch

From SO:

The problem is that Docker by default blocks a list of system calls, including perf_event_open, which perf relies heavily on.
Official docker reference: https://docs.docker.com/engine/security/seccomp/

Solution:

  • Download the standard seccomp (secure compute) file for docker. It's a json file.
  • Find "perf_event_open", it only appears once, and delete it.
  • Add a new entry in syscalls section:
    { "names": [ "perf_event_open" ], "action": "SCMP_ACT_ALLOW" },
  • Add the following to your command to run the container: --security-opt seccomp=path/to/default.json

This possibly is not enough for rr, but should be the necessary start to at least run perf.
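The manual edit from the steps above could also be scripted. This is a minimal sketch, assuming the profile follows the Docker default layout (a top-level `syscalls` list whose entries have `names` and `action`). The helper name and the file paths passed on the command line are placeholders, not part of any tool.

```python
import copy
import json
import sys


def allow_syscall(profile: dict, name: str) -> dict:
    """Return a copy of the profile in which `name` is allowed unconditionally.

    The name is removed from every existing rule (it may sit in a rule gated
    on a capability), then a plain allow rule is appended for it.
    """
    patched = copy.deepcopy(profile)
    kept = []
    for rule in patched.get("syscalls", []):
        if "names" in rule:
            rule["names"] = [n for n in rule["names"] if n != name]
            if not rule["names"]:
                continue  # drop rules left without any syscall names
        kept.append(rule)
    kept.append({"names": [name], "action": "SCMP_ACT_ALLOW"})
    patched["syscalls"] = kept
    return patched


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (paths are placeholders): python patch_profile.py default.json patched.json
    with open(sys.argv[1]) as src:
        original = json.load(src)
    with open(sys.argv[2], "w") as dst:
        json.dump(allow_syscall(original, "perf_event_open"), dst, indent=2)
```

The resulting file would then be passed with --security-opt seccomp=path/to/patched.json, as in the last step of the list.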

@GitMensch

@jankeromnes Can you please try the steps outlined above for adding minimal perf counter support to Gitpod?
This missing feature is the main reason I don't develop on Gitpod.
