Few fixes that allow auraed to run as true PID 1 #447

Merged 4 commits on Mar 19, 2023
Changes from 3 commits
57 changes: 34 additions & 23 deletions auraed/src/init/mod.rs
@@ -161,28 +161,39 @@ impl Context {
     //
     // Source: https://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html
     fn in_new_cgroup_namespace() -> bool {
-        let file =
-            File::open("/proc/self/cgroup").expect("opening /proc/self/cgroup");
-        let mut reader = BufReader::new(file);
-        let mut contents = String::new();
-        let _ = reader
-            .read_to_string(&mut contents)
-            .expect("reading /proc/self/cgroup");
+        let file = File::open("/proc/self/cgroup");

-        // Here we examine the last few bytes of /proc/self/cgroup
-        // We know if the cgroup string ends with a \n newline
-        // as well as a / as in "0::/" we are in a new (and nested)
-        // cgroup namespace.
-        //
-        // For all intents and purposes this is the closest way we
-        // can guarantee that we are in "a container".
-        //
-        // It is important to note that Aurae cells (by default)
-        // will also schedule themselves in a new cgroup namespace.
-        // Therefore we would expect Aurae cells to also match this
-        // pattern.
-        //
-        contents.to_string().ends_with("_aurae\n")
-        // TODO Use the AURAE_SELF_IDENTIFIER const as currently defined in runtime_service.rs
-        // TODO Consider moving the const to a better home :)
+        // Note: The following is a workaround for a chicken egg problem in the init
+        // logic. We need to read from /proc to determine whether we're in a
+        // container or whether we're running as true PID 1. But if we're
+        // running as true PID 1, /proc wouldn't be mounted at this point as
+        // we only mount proc when we have determined that we _are_ running as
+        // true PID 1.
+        match file {
+            Ok(file) => {
+                let mut reader = BufReader::new(file);
+                let mut contents = String::new();
+                let _ = reader
+                    .read_to_string(&mut contents)
+                    .expect("reading /proc/self/cgroup");
+
+                // Here we examine the last few bytes of /proc/self/cgroup
+                // We know if the cgroup string ends with a \n newline
+                // as well as a / as in "0::/" we are in a new (and nested)
+                // cgroup namespace.
+                //
+                // For all intents and purposes this is the closest way we
+                // can guarantee that we are in "a container".
+                //
+                // It is important to note that Aurae cells (by default)
+                // will also schedule themselves in a new cgroup namespace.
+                // Therefore we would expect Aurae cells to also match this
+                // pattern.
+                //
+                contents.to_string().ends_with("_aurae\n")
+                // TODO Use the AURAE_SELF_IDENTIFIER const as currently defined in runtime_service.rs
+                // TODO Consider moving the const to a better home :)
+            }
+            Err(_) => false,
+        }
     }
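
The fallback in this hunk can be exercised outside auraed with a small standalone sketch. It is not the code from the PR: `in_new_cgroup_namespace_at` is a hypothetical helper that takes the path as a parameter so it can be pointed at a test file, and unlike the diff it treats a read error the same way as an open error instead of panicking.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical helper (not part of auraed): same decision as the diff,
/// but with the path injectable so it can be tested without /proc.
fn in_new_cgroup_namespace_at(path: &Path) -> bool {
    match fs::read_to_string(path) {
        // The "_aurae\n" suffix is the one checked in the diff.
        Ok(contents) => contents.ends_with("_aurae\n"),
        // The file cannot be read, e.g. because /proc is not mounted yet.
        // Assumption of this sketch: treat that as "not in a container".
        Err(_) => false,
    }
}

fn main() {
    let in_container = in_new_cgroup_namespace_at(Path::new("/proc/self/cgroup"));
    println!("in new cgroup namespace: {in_container}");
}
```

Returning `false` on error is what lets init continue down the true PID 1 path, where proc is mounted later.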
19 changes: 13 additions & 6 deletions auraed/src/init/network/mod.rs
@@ -144,23 +144,30 @@ async fn configure_loopback(handle: &Handle) -> Result<(), NetworkError> {
 async fn configure_nic(handle: &Handle) -> Result<(), NetworkError> {
     const DEFAULT_NET_DEV: &str = "eth0";
     const DEFAULT_NET_DEV_IPV6: &str = "fe80::2";
+    const DEFAULT_NET_DEV_IPV6_GATEWAY: &str = "fe80::1";
     const DEFAULT_NET_DEV_IPV6_SUBNET: &str = "/64";

     trace!("configure {DEFAULT_NET_DEV}");

-    let ipv6 = format!("{DEFAULT_NET_DEV_IPV6}{DEFAULT_NET_DEV_IPV6_SUBNET}")
+    let ipv6_addr =
+        format!("{DEFAULT_NET_DEV_IPV6}{DEFAULT_NET_DEV_IPV6_SUBNET}")
+            .parse::<Ipv6Network>()
+            .expect("valid ipv6 address");
+
+    let gateway = DEFAULT_NET_DEV_IPV6_GATEWAY
+        .to_string()
         .parse::<Ipv6Network>()
-        .expect("valid ipv6 address");
+        .expect("gateway");

-    add_address(handle, DEFAULT_NET_DEV.to_owned(), ipv6).await?;
+    add_address(handle, DEFAULT_NET_DEV.to_owned(), ipv6_addr).await?;

     set_link_up(handle, DEFAULT_NET_DEV.to_owned()).await?;

     add_route_v6(
         handle,
         DEFAULT_NET_DEV.to_owned(),
-        ipv6,
         "::/0".parse::<Ipv6Network>().expect("valid ipv6 address"),
+        gateway,
     )
     .await?;

@@ -293,9 +300,9 @@ async fn add_route_v6(
         .route()
         .add()
         .v6()
-        .destination_prefix(dest.ip(), dest.prefix())
+        .source_prefix(source.ip(), source.prefix())
+        .gateway(dest.ip())
         .output_interface(link_index)
-        .pref_source(source.ip())
         .execute()
         .await
         .map_err(|e| NetworkError::ErrorAddingRoute {
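
The three strings parsed in `configure_nic` (`fe80::2/64`, `fe80::1`, and `::/0`) can be examined with the standard library alone. The sketch below is a hypothetical stand-in for `Ipv6Network`, not the crate's implementation: `Ipv6Cidr` and `parse_cidr` are names invented here, and treating a bare address as a /128 is an assumption of the sketch.

```rust
use std::net::Ipv6Addr;

/// Hypothetical stand-in for `Ipv6Network`: an address plus a prefix length.
#[derive(Debug, PartialEq)]
struct Ipv6Cidr {
    ip: Ipv6Addr,
    prefix: u8,
}

/// Parses "addr/prefix" or a bare "addr" (assumed here to mean /128).
fn parse_cidr(s: &str) -> Result<Ipv6Cidr, String> {
    let (addr, prefix) = match s.split_once('/') {
        Some((addr, prefix)) => (
            addr,
            prefix
                .parse::<u8>()
                .map_err(|e| format!("bad prefix in {s}: {e}"))?,
        ),
        None => (s, 128),
    };
    if prefix > 128 {
        return Err(format!("prefix out of range in {s}"));
    }
    let ip = addr
        .parse::<Ipv6Addr>()
        .map_err(|e| format!("bad address in {s}: {e}"))?;
    Ok(Ipv6Cidr { ip, prefix })
}

fn main() {
    // The same three values the diff parses.
    let addr = parse_cidr("fe80::2/64").expect("valid ipv6 address");
    let gateway = parse_cidr("fe80::1").expect("gateway");
    let default_route = parse_cidr("::/0").expect("valid ipv6 address");
    println!("{addr:?} routes {default_route:?} via {gateway:?}");
}
```

Read this way, the new call passes the default route `::/0` and the gateway `fe80::1` to `add_route_v6`, where the old call passed the interface address and `::/0`.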
16 changes: 10 additions & 6 deletions auraed/src/init/system_runtimes/pid1_system_runtime.rs
@@ -38,7 +38,7 @@ use tonic::async_trait;
 use tracing::{error, info, trace};

 const POWER_BUTTON_DEVICE: &str = "/dev/input/event0";
-const DEFAULT_NETWORK_SOCKET_ADDR: &str = "[::1]:8080";
+const DEFAULT_NETWORK_SOCKET_ADDR: &str = "[::]:8080";

 pub(crate) struct Pid1SystemRuntime;
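
The one-character change to `DEFAULT_NETWORK_SOCKET_ADDR` moves the listener from the loopback address to the unspecified address. A minimal sketch of the difference, using only `std::net`; `listen_scope` is a hypothetical helper introduced for illustration:

```rust
use std::net::{AddrParseError, SocketAddr};

/// Hypothetical helper: classifies a listen address by how widely a socket
/// bound to it would be reachable.
fn listen_scope(addr: &str) -> Result<&'static str, AddrParseError> {
    let socket: SocketAddr = addr.parse()?;
    Ok(if socket.ip().is_loopback() {
        // [::1]: reachable only from the same host.
        "loopback only"
    } else if socket.ip().is_unspecified() {
        // [::]: binding here listens on all interfaces.
        "all interfaces"
    } else {
        "single address"
    })
}

fn main() {
    for addr in ["[::1]:8080", "[::]:8080"] {
        println!("{addr}: {}", listen_scope(addr).expect("valid socket address"));
    }
}
```

A daemon running as PID 1 inside a VM that should be reachable from the host has to listen on more than loopback, which is presumably why the constant was changed.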

@@ -87,9 +87,6 @@ impl SystemRuntime for Pid1SystemRuntime {
         // TODO We likely to do not need to mount these filesystems.
         // TODO Do we want to have a way to "try" these mounts and continue without erroring?

-        MountSpec { source: None, target: "/dev", fstype: Some("devtmpfs") }
-            .mount()?;
Contributor Author

This was throwing a "Resource busy" so I presumed this wasn't actually needed?

Contributor

Is it just a feature of cloud-hypervisor that /dev is already mounted, but we would need to mount it in a non-vm context (i.e., true-true PID 1 for lack of a better identifier)?

Maybe the try and continue approach is right?

Contributor

Looking at cloud-hypervisor's config that you linked to in #439 I see the following in the file:

CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y

Would this config be causing the mount to already be mounted?

Contributor Author

Probably so. "Resource busy" usually means that the device is already mounted afaik
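
One way to confirm the "already mounted" theory is to look for the mount point in the kernel's mount table. The sketch below is illustrative only: `is_mounted` is a hypothetical helper, and it assumes the whitespace-separated layout of `/proc/self/mounts` (source, mount point, filesystem type, options, dump, pass). It only works once proc itself is mounted.

```rust
use std::fs;

/// Hypothetical helper: returns true if `mounts` (text in /proc/self/mounts
/// format) lists `target` as a mount point with filesystem type `fstype`.
fn is_mounted(mounts: &str, target: &str, fstype: &str) -> bool {
    mounts.lines().any(|line| {
        // Fields: source, mount point, filesystem type, options, dump, pass.
        let mut fields = line.split_whitespace();
        let _source = fields.next();
        fields.next() == Some(target) && fields.next() == Some(fstype)
    })
}

fn main() {
    // An unreadable mount table is treated as empty in this sketch.
    let mounts = fs::read_to_string("/proc/self/mounts").unwrap_or_default();
    println!("/dev already mounted: {}", is_mounted(&mounts, "/dev", "devtmpfs"));
}
```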

Contributor

Sounds like we should try to mount it anyway, in case the kernel config doesn't mount it automatically, but tolerate the error if the mount fails (maybe log a warning).

Contributor Author

@JeroenSoeters JeroenSoeters Mar 18, 2023

We should probably do that for all mounts we deem essential. I'm wondering, though, whether we should do this now or postpone it until we allow kernels other than the ones we control, if we ever plan to let users provide their own kernel at all, as this introduces a whole set of other challenges. Until that point we know with 100% certainty what is mounted and what is not.

Contributor

Let's postpone until later, once we have better control of the kernels we are running.
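
For reference, the "try and continue" approach discussed in this thread could look roughly like the sketch below. It was not implemented in this PR. `mount_or_warn` is a hypothetical wrapper, the mount operation is passed in as a closure so the sketch does not depend on auraed's `MountSpec`, and `EBUSY = 16` is the Linux errno value for "Device or resource busy".

```rust
use std::io;

/// Linux errno for "Device or resource busy" (assumed platform: Linux).
const EBUSY: i32 = 16;

/// Hypothetical wrapper: runs `mount` and tolerates EBUSY, which usually
/// means the target is already mounted. Returns Ok(true) if this call
/// performed the mount and Ok(false) if it was skipped.
fn mount_or_warn<F>(target: &str, mount: F) -> io::Result<bool>
where
    F: FnOnce() -> io::Result<()>,
{
    match mount() {
        Ok(()) => Ok(true),
        Err(e) if e.raw_os_error() == Some(EBUSY) => {
            eprintln!("warning: {target} appears to be mounted already: {e}");
            Ok(false)
        }
        // Any other failure is still fatal.
        Err(e) => Err(e),
    }
}

fn main() -> io::Result<()> {
    // Simulate a kernel built with CONFIG_DEVTMPFS_MOUNT=y, where /dev is
    // mounted before init runs and a second mount reports EBUSY.
    let mounted = mount_or_warn("/dev", || Err(io::Error::from_raw_os_error(EBUSY)))?;
    println!("mounted by us: {mounted}");
    Ok(())
}
```

Only EBUSY is downgraded to a warning here; whether other errors should also be tolerated for non-essential mounts is left open, as it is in the thread.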


         MountSpec { source: None, target: "/sys", fstype: Some("sysfs") }
             .mount()?;

@@ -100,8 +97,15 @@ impl SystemRuntime for Pid1SystemRuntime {
         }
         .mount()?;

-        trace!("configure network");
-        //show_dir("/sys/class/net/", false); // Show available network interfaces
+        MountSpec {
+            source: Some("debugfs"),
+            target: "/sys/kernel/debug",
+            fstype: Some("debugfs"),
+        }
+        .mount()?;
+
+        trace!("Configure network");
+        // show_dir("/sys/class/net/", false); // Show available network interfaces
         let network = network::Network::connect()?;
         network.init().await?;
         network.show_network_info().await;