WIP: early hostkey generation #5728

Draft: wants to merge 1 commit into base: main
12 changes: 12 additions & 0 deletions cloudinit/cmd/main.py
@@ -39,6 +39,8 @@
from cloudinit.config.schema import validate_cloudconfig_schema
from cloudinit import log
from cloudinit.reporting import events
from cloudinit.settings import PER_INSTANCE, PER_ALWAYS, PER_ONCE, CLOUD_CONFIG
from cloudinit.ssh_util import start_early_generate_host_keys
from cloudinit.settings import (
PER_INSTANCE,
PER_ALWAYS,
@@ -351,6 +353,7 @@ def main_init(name, args):
# 2. Setup logging/output redirections with resultant config (if any)
# 3. Initialize the cloud-init filesystem
# 4. Check if we can stop early by looking for various files
# 4.1 Early SSH host key generation
# 5. Fetch the datasource
# 6. Connect to the current instance location + update the cache
# 7. Consume the userdata (handlers get activated here)
@@ -408,6 +411,15 @@ def main_init(name, args):
purge_cache_on_python_version_change(init)
mode = sources.DSMODE_LOCAL if args.local else sources.DSMODE_NETWORK

# Stage 4.1
if mode == sources.DSMODE_LOCAL:
try:
# Default should be patched to False on backport
if init.cfg.get("early_generate_host_keys", True):
start_early_generate_host_keys(init.paths.run_dir)
except Exception as e:
LOG.warning("Failed to generate host keys: %s", e)

if mode == sources.DSMODE_NETWORK:
existing = "trust"
sys.stderr.write("%s\n" % (netinfo.debug_info()))
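
The Stage 4.1 gate reads a boolean from the merged config and falls back to True when the key is absent. A minimal sketch of that lookup, assuming a plain dict as a stand-in for `init.cfg`; the helper name `early_keys_enabled` is hypothetical and not part of this PR:

```python
from typing import Any, Mapping


def early_keys_enabled(cfg: Mapping[str, Any], default: bool = True) -> bool:
    """Return whether early host key generation should be started.

    `cfg` stands in for the merged cloud-init config. `default` is the
    value a backport would flip to False, per the comment in the diff.
    """
    return bool(cfg.get("early_generate_host_keys", default))


# Key absent: the default decides.
print(early_keys_enabled({}))
# An explicit opt-out wins over the default.
print(early_keys_enabled({"early_generate_host_keys": False}))
# Backport behaviour: absent key with the default patched to False.
print(early_keys_enabled({}, default=False))
```

Keeping the default in one place would make the backport change a one-line patch.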
82 changes: 54 additions & 28 deletions cloudinit/config/cc_ssh.py
@@ -11,8 +11,9 @@
import logging
import os
import re
import shutil
import sys
from typing import List, Optional, Sequence
from typing import Iterable, List, Optional, Sequence

from cloudinit import lifecycle, ssh_util, subp, util
from cloudinit.cloud import Cloud
@@ -35,25 +36,28 @@

LOG = logging.getLogger(__name__)

GENERATE_KEY_NAMES = ["rsa", "ecdsa", "ed25519"]

FIPS_UNSUPPORTED_KEY_NAMES = ["ed25519"]

pattern_unsupported_config_keys = re.compile(
"^(ecdsa-sk|ed25519-sk)_(private|public|certificate)$"
)
KEY_FILE_TPL = "/etc/ssh/ssh_host_%s_key"

PUBLISH_HOST_KEYS = True
# By default publish all supported hostkey types.
HOST_KEY_PUBLISH_BLACKLIST: List[str] = []

CONFIG_KEY_TO_FILE = {}
PRIV_TO_PUB = {}
for k in GENERATE_KEY_NAMES:
for k in ssh_util.GENERATE_KEY_NAMES:
CONFIG_KEY_TO_FILE.update(
{
f"{k}_private": (KEY_FILE_TPL % k, 0o600),
f"{k}_public": (f"{KEY_FILE_TPL % k}.pub", 0o644),
f"{k}_certificate": (f"{KEY_FILE_TPL % k}-cert.pub", 0o644),
f"{k}_private": (ssh_util.KEY_FILE_TPL % k, 0o600),
f"{k}_public": (f"{ssh_util.KEY_FILE_TPL % k}.pub", 0o644),
f"{k}_certificate": (
f"{ssh_util.KEY_FILE_TPL % k}-cert.pub",
0o644,
),
}
)
PRIV_TO_PUB[f"{k}_private"] = f"{k}_public"
@@ -97,6 +101,31 @@ def set_redhat_keyfile_perms(keyfile: str) -> None:
os.chmod(f"{keyfile}.pub", permissions_public)


def _fetch_early_keys(
key_names: Iterable[str], rundir: str, cfg: Config
) -> List[str]:
early_keys: List[ssh_util.KeyPair] = (
ssh_util.wait_for_early_generated_keys(rundir)
)
if not early_keys or cfg.get("seed_random"):
Review comment (Member):

There is a problem with this that needs to be addressed.

Cloud-init's Azure and OpenStack code performs automatic entropy seeding that bypasses cloud-config. This change highlights the fact that cloud-init implements datasource-specific behavior both in configuration modules and in datasource modules. Checking only the merged configuration is insufficient: when seeding was implemented, it was decided (for some reason) not to transform the datasource-provided seed into vendor-data that would be merged with, and overridden by, user-data.

I strongly suspect that automatic entropy seeding exists on these platforms for legacy reasons only[1][2], and is no longer needed. However, until this tech debt has been resolved, cloud-init should still respect the entropy provided by these platforms.

[1] torvalds/linux@f2580a9
[2] https://bugs.launchpad.net/nova/+bug/1789868

Reply (Member Author):

Aside: I realized this should be random_seed rather than seed_random. Having the key in the opposite word order from the module name is confusing.

To your point, I was aware of datasources doing their own thing here. I saw random_seed in the datasources and assumed they were using the same key; I didn't realize they store the seed under a separate metadata key. This if statement can just be updated to also check if random_seed is in cloud.datasource.metadata, correct?

Reply (Member):

> This if statement can just be updated to also check if random_seed is in cloud.datasource.metadata, correct?

Yes, I believe so

return []
for keypair in early_keys:
if keypair.key_type in key_names:
priv_file = str(keypair.private_path)
pub_file = str(keypair.public_path)
LOG.debug(
"Using early generated key for %s from %s",
keypair.key_type,
priv_file,
)
shutil.move(priv_file, ssh_util.KEY_FILE_TPL % (keypair.key_type))
shutil.move(
pub_file,
f"{ssh_util.KEY_FILE_TPL % (keypair.key_type)}.pub",
)
return [key.key_type for key in early_keys]
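
The review thread on this function agrees that the guard should look at both the merged config and the datasource metadata. A minimal sketch of such a check, assuming plain dicts as stand-ins for `cfg` and `cloud.datasource.metadata`; the helper name `entropy_seed_configured` is hypothetical and the exact shape of the metadata value is not specified in the thread:

```python
from typing import Any, Mapping, Optional


def entropy_seed_configured(
    cfg: Mapping[str, Any], metadata: Optional[Mapping[str, Any]]
) -> bool:
    """Return True if a seed is supplied by cloud-config or the datasource.

    `cfg` stands in for the merged config and `metadata` for
    cloud.datasource.metadata, which may be None or empty.
    """
    if cfg.get("random_seed"):
        return True
    return bool(metadata) and "random_seed" in metadata


# Neither source provides a seed: early keys could be used.
print(entropy_seed_configured({}, None))
# Seed only in datasource metadata: early keys should be skipped.
print(entropy_seed_configured({}, {"random_seed": b"\x00\x01"}))
```

With a helper like this, the early-return condition would read `if not early_keys or entropy_seed_configured(cfg, metadata)`.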


def handle(name: str, cfg: Config, cloud: Cloud, args: list) -> None:

# remove the static keys from the pristine image
@@ -155,31 +184,28 @@ def handle(name: str, cfg: Config, cloud: Cloud, args: list) -> None:
)
else:
# if not, generate them
genkeys = util.get_cfg_option_list(
cfg, "ssh_genkeytypes", GENERATE_KEY_NAMES
genkeys: List[str] = util.get_cfg_option_list(
cfg, "ssh_genkeytypes", ssh_util.GENERATE_KEY_NAMES
)
key_names = set(genkeys)

# remove keys that are not supported in fips mode if its enabled
key_names = (
genkeys
if not util.fips_enabled()
else [
names
for names in genkeys
if names not in FIPS_UNSUPPORTED_KEY_NAMES
]
)
skipped_keys = set(genkeys).difference(key_names)
if skipped_keys:
LOG.debug(
"skipping keys that are not supported in fips mode: %s",
",".join(skipped_keys),
)
if util.fips_enabled():
key_names = key_names.difference(FIPS_UNSUPPORTED_KEY_NAMES)
skipped_keys = set(genkeys).difference(key_names)
if skipped_keys:
LOG.debug(
"skipping keys that are not supported in fips mode: %s",
",".join(skipped_keys),
)

for keytype in key_names:
keyfile = KEY_FILE_TPL % (keytype)
util.ensure_dir("/etc/ssh")
early_keys = _fetch_early_keys(key_names, cloud.paths.run_dir, cfg)
remaining_keys = key_names.difference(early_keys)
for keytype in remaining_keys:
keyfile = ssh_util.KEY_FILE_TPL % (keytype)
if os.path.exists(keyfile):
continue
util.ensure_dir(os.path.dirname(keyfile))
cmd = ["ssh-keygen", "-t", keytype, "-N", "", "-f", keyfile]

# TODO(harlowja): Is this guard needed?
@@ -279,7 +305,7 @@ def get_public_host_keys(blacklist: Optional[Sequence[str]] = None):
@returns: List of keys, each formatted as a two-element tuple.
e.g. [('ssh-rsa', 'AAAAB3Nz...'), ('ssh-ed25519', 'AAAAC3Nx...')]
"""
public_key_file_tmpl = "%s.pub" % (KEY_FILE_TPL,)
public_key_file_tmpl = "%s.pub" % (ssh_util.KEY_FILE_TPL,)
key_list = []
blacklist_files = []
if blacklist:
133 changes: 132 additions & 1 deletion cloudinit/ssh_util.py
@@ -8,9 +8,12 @@

import logging
import os
import pathlib
import pwd
import subprocess
from contextlib import suppress
from typing import List, Sequence, Tuple
from multiprocessing import Process
from typing import List, NamedTuple, Sequence, Tuple

from cloudinit import lifecycle, subp, util

@@ -64,6 +67,21 @@
"exit " + str(_DISABLE_USER_SSH_EXIT) + '"'
)

GENERATE_KEY_NAMES = ["rsa", "ecdsa", "ed25519"]

KEY_NAME_TPL = "ssh_host_%s_key"
KEY_FILE_TPL = f"/etc/ssh/{KEY_NAME_TPL}"


def get_early_host_key_dir(rundir: str):
return pathlib.Path(rundir, "tmp_host_keys")


class KeyPair(NamedTuple):
key_type: str
private_path: pathlib.Path
public_path: pathlib.Path


class AuthKeyLine:
def __init__(
@@ -683,3 +701,116 @@ def get_opensshd_upstream_version():
return upstream_version
except (ValueError, TypeError):
LOG.warning("Could not parse sshd version: %s", upstream_version)


def _get_early_key_fifo_path(rundir: str) -> pathlib.Path:
return pathlib.Path(rundir, "ssh-keygen-finished")


def _write_and_close(path: pathlib.Path, data: bytes) -> None:
path.write_bytes(data)
path.unlink()


def _early_generate_host_keys_body(
rundir: str, early_key_fifo_path: pathlib.Path
) -> None:
key_dir = get_early_host_key_dir(rundir)
# Directories need the execute bit to be traversable, hence 0o700.
key_dir.mkdir(mode=0o700, exist_ok=False)

for key_type in GENERATE_KEY_NAMES:
path = key_dir / (KEY_NAME_TPL % key_type)
stdout_path = path.with_suffix(".stdout")
stderr_path = path.with_suffix(".stderr")
processes = []
with open(stdout_path, "w") as stdout, open(
stderr_path, "w"
) as stderr:
try:
# Using subprocess.Popen instead of subp.subp to run
# multiple ssh-keygen commands in parallel.
p = subprocess.Popen(
Review comment (Member):

Why generate these in parallel? RSA keygen is time-consuming, but generating the other keys is not, so I'm not convinced that the benefit outweighs the extra complexity.

Reply (Member Author):

This doesn't really seem any more complex to me, but serial is fine too.

[
"ssh-keygen",
"-t",
key_type,
"-N",
"",
"-f",
path,
],
stdout=stdout,
stderr=stderr,
)
processes.append(p)
except Exception as e:
LOG.warning("Failed to generate %s host key: %s", key_type, e)
for process in processes:
if process.wait() != 0:
_write_and_close(early_key_fifo_path, b"failed")
return
_write_and_close(early_key_fifo_path, b"done")
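
For comparison with the parallel version, here is a sketch of the serial alternative raised in the review thread. It is an illustration under assumptions, not the PR's implementation: the function name `generate_keys_serially` and the injectable `run` parameter are hypothetical, the latter added so the sketch can be exercised without `ssh-keygen` installed.

```python
import subprocess
from pathlib import Path
from typing import Callable, Dict, Sequence

KEY_NAME_TPL = "ssh_host_%s_key"


def generate_keys_serially(
    key_dir: Path,
    key_types: Sequence[str],
    run: Callable[..., "subprocess.CompletedProcess"] = subprocess.run,
) -> Dict[str, int]:
    """Run ssh-keygen once per key type, one after another.

    Returns a mapping of key type to the exit code of its ssh-keygen run.
    """
    results: Dict[str, int] = {}
    for key_type in key_types:
        path = key_dir / (KEY_NAME_TPL % key_type)
        cmd = ["ssh-keygen", "-t", key_type, "-N", "", "-f", str(path)]
        # capture_output keeps stdout/stderr on the result object, so no
        # per-key .stdout/.stderr files are needed in this variant.
        proc = run(cmd, capture_output=True, check=False)
        results[key_type] = proc.returncode
    return results
```

The serial form has no process list to track, and each failure maps directly to one key type. Whether it is fast enough depends on how long RSA generation takes on the target platform.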


def _early_generate_host_keys(
rundir: str, early_key_fifo_path: pathlib.Path
) -> None:
try:
_early_generate_host_keys_body(rundir, early_key_fifo_path)
except Exception as e:
LOG.warning("Failed to generate host keys: %s", e)
_write_and_close(early_key_fifo_path, b"failed")


def start_early_generate_host_keys(rundir: str):
if all(
pathlib.Path(KEY_FILE_TPL % key).exists() for key in GENERATE_KEY_NAMES
):
LOG.debug(
"Existing host keys present; skipping early host key generation"
)
return
early_key_fifo_path = _get_early_key_fifo_path(rundir)
early_key_fifo_path.parent.mkdir(mode=0o700, exist_ok=True)
os.mkfifo(early_key_fifo_path)
try:
Process(
target=_early_generate_host_keys,
args=(rundir, early_key_fifo_path),
daemon=True,
).start()
except Exception as e:
LOG.warning("Failed to start early host key generation: %s", e)
early_key_fifo_path.unlink()
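
The FIFO acts as a one-shot completion signal: opening it for writing blocks until a reader opens it, and the reader's read returns once the writer closes. A minimal sketch of that handshake, assuming a POSIX system and using a thread where the PR uses a separate process; the names `signal_done`, `wait_for_signal`, and `demo` are hypothetical. The reader also measures how long it was blocked.

```python
import os
import tempfile
import threading
import time
from pathlib import Path
from typing import Tuple


def signal_done(fifo: Path, payload: bytes) -> None:
    """Writer side: blocks in open() until a reader opens the FIFO."""
    fifo.write_bytes(payload)


def wait_for_signal(fifo: Path) -> Tuple[bytes, float]:
    """Reader side: block until the writer closes.

    Returns the payload and the number of seconds spent waiting.
    """
    start = time.monotonic()
    payload = fifo.read_bytes()
    return payload, time.monotonic() - start


def demo() -> Tuple[bytes, float]:
    with tempfile.TemporaryDirectory() as tmp:
        fifo = Path(tmp, "keygen-finished")
        os.mkfifo(fifo)
        writer = threading.Thread(target=signal_done, args=(fifo, b"done"))
        writer.start()
        payload, elapsed = wait_for_signal(fifo)
        writer.join()
        return payload, elapsed


if __name__ == "__main__":
    payload, elapsed = demo()
    print(payload, f"waited {elapsed:.3f}s")
```

One consequence of this design is that a writer with no reader stays blocked indefinitely, so the reading side needs to run whenever the FIFO was created.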


def wait_for_early_generated_keys(rundir: str) -> List[KeyPair]:
early_key_fifo_path = _get_early_key_fifo_path(rundir)
if not early_key_fifo_path.exists():
return []
if early_key_fifo_path.read_bytes() != b"done":
Review comment (Member):

If key generation has not completed, this read will block. It would be good to know when that happens, for example by using performance.Timed or performance.timed().

LOG.warning("Failed to retrieve early generated host keys")
return []

key_dir = get_early_host_key_dir(rundir)
Review comment (Member):

nit: why not just use early_key_fifo_path here?

Reply (Member Author):

Whoops, I forgot to update the name (shouldn't end with _dir), but the reason for the function is needing to pass the rundir and to make sure the same path is used in multiple places.

keys = []
for key_type in GENERATE_KEY_NAMES:
private_path = key_dir / (KEY_NAME_TPL % key_type)
public_path = private_path.with_suffix(".pub")
if private_path.exists() and public_path.exists():
keys.append(KeyPair(key_type, private_path, public_path))
else:
stdout = ""
stderr = ""
with suppress(FileNotFoundError):
stdout = util.load_text_file(private_path.with_suffix(".stdout"))
with suppress(FileNotFoundError):
stderr = util.load_text_file(private_path.with_suffix(".stderr"))
LOG.warning(
"Failed to find generated host key pair for %s. "
"Stdout: %s. Stderr: %s",
key_type,
stdout,
stderr,
)
return keys
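
The key, `.pub`, `.stdout`, and `.stderr` files are siblings in the same directory, so how each path is derived matters. A short sketch of the difference between `with_suffix` and the `/` operator in pathlib, assuming a hypothetical run directory of `/run/cloud-init`:

```python
from pathlib import PurePosixPath

# Hypothetical run directory; the real value comes from cloud.paths.run_dir.
key_dir = PurePosixPath("/run/cloud-init", "tmp_host_keys")
private_path = key_dir / "ssh_host_rsa_key"

# with_suffix() appends when the name has no extension, producing siblings.
public_path = private_path.with_suffix(".pub")
stdout_path = private_path.with_suffix(".stdout")

# The / operator appends a path component, producing a child instead.
child_path = public_path / ".stdout"

print(public_path)  # sibling: .../ssh_host_rsa_key.pub
print(stdout_path)  # sibling: .../ssh_host_rsa_key.stdout
print(child_path)   # child:   .../ssh_host_rsa_key.pub/.stdout
```

Only the `with_suffix` form yields the path that `_early_generate_host_keys_body` writes its captured output to.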